Information processing apparatus and method

CN116670716B · Active · Publication Date: 2026-06-30 · SONY GROUP CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SONY GROUP CORP
Filing Date
2021-12-24
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

When attribute data of a LiDAR-scanned point cloud is encoded using reference relationships formed in polar coordinates, coding efficiency can drop, because the LiDAR scan order tends to link points that are farther apart than the points that would be linked in Cartesian coordinates.

Method used

The geometric data of the point cloud is transformed from the polar coordinate system to the Cartesian coordinate system, and predicted values are set through the reference relationship established in the Cartesian coordinate system. The prediction residuals of the attribute data are calculated and encoded.

Benefits of technology

It improves the encoding efficiency of attribute data, reduces the amount of encoding, reduces latency and circuit size, and reduces encoding costs.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN116670716B_ABST

Abstract

This disclosure relates to information processing apparatuses and methods that can suppress reductions in coding efficiency. For a point cloud representing a three-dimensional object as a set of points, the coordinate system of the geometric data is transformed from a polar coordinate system to a Cartesian coordinate system; a reference relationship is established using the geometric data in the generated Cartesian coordinate system, the reference relationship indicating a reference target for calculating predicted values of attribute data for a target point; a prediction residual is calculated as the difference between the attribute data of the target point and the predicted values derived based on the established reference relationship; and the calculated prediction residual is encoded. This disclosure can be applied to, for example, information processing apparatuses, encoding apparatuses, decoding apparatuses, electronic devices, information processing methods, or programs.

Description

Technical Field

[0001] This disclosure relates to an information processing apparatus and method, and more particularly to an information processing apparatus and method capable of suppressing a reduction in coding efficiency.

Background Technology

[0002] To date, methods for encoding 3D data representing three-dimensional structures, such as point clouds, have been considered (see, e.g., Non-Patent Literature 1). Furthermore, a method known as predictive geometry has been considered, in which, when the geometric data of the point cloud is encoded, the difference between the geometric data and its predicted value (the prediction residual) is calculated, and the prediction residual is encoded (see, e.g., Non-Patent Literature 2). Additionally, for predictive geometry, a mode in which geometric data is represented in polar coordinates has been considered.

[0003] Citation List

[0004] Non-patent literature

[0005] [Non-patent literature 1]

[0006] R. Mekuria, IEEE Student Fellow; K. Blom, P. Cesar, IEEE Fellow; “Design, Implementation and Evaluation of a Point Cloud Codec for Tele-Immersive Video”, tcsvt_paper_submitted_february.pdf

[0007] [Non-patent literature 2]

[0008] Zhenzhen Gao, David Flynn, Alexis Tourapis, and Khaled Mammou, “[G-PCC][New proposal] Predictive Geometry Coding”, ISO/IEC JTC1/SC29/WG11 MPEG2019/m51012, October 2019, Geneva, Switzerland

Summary of the Invention

[0009] Technical issues

[0010] However, because of the order of LiDAR scans, points linked by a reference relationship formed in polar coordinates tend to be farther apart than points linked by one formed in Cartesian coordinates. Attribute data, in contrast, tends to be more highly correlated the shorter the distance between points. Therefore, when predicted values of attribute data are calculated from surrounding points and the prediction residuals are encoded, coding efficiency in polar coordinates may be lower than in Cartesian coordinates.

[0011] This invention is designed in view of such circumstances in order to suppress the reduction in coding efficiency.

[0012] Solution to the problem

[0013] An information processing apparatus according to one aspect of the present technology includes: a coordinate transformation unit that, for a point cloud representing a three-dimensional object as a set of points, transforms the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; a prediction residual calculation unit that calculates a prediction residual, the prediction residual being the difference between the attribute data of the processing target point and the predicted value calculated based on the reference relationship set by the reference relationship setting unit; and a prediction residual encoding unit that encodes the prediction residual calculated by the prediction residual calculation unit.

[0014] An information processing method according to one aspect of the present technology includes: for a point cloud representing a three-dimensional object as a set of points, transforming the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; setting a reference relationship using the geometric data in the generated Cartesian coordinate system, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; calculating a prediction residual, the prediction residual being the difference between the attribute data of the processing target point and the predicted value calculated based on the set reference relationship; and encoding the calculated prediction residual.

[0015] An information processing apparatus according to another aspect of the present technology includes: a coordinate transformation unit that, for a point cloud representing a three-dimensional object as a set of points, transforms the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; a prediction residual decoding unit that decodes encoded data to obtain a prediction residual, the prediction residual being the difference between the attribute data and the predicted value; and an attribute data generation unit that generates the attribute data by adding the prediction residual obtained by the prediction residual decoding unit to the predicted value calculated based on the reference relationship set by the reference relationship setting unit.

[0016] An information processing method according to another aspect of the present technology includes: for a point cloud representing a three-dimensional object as a set of points, transforming the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; setting a reference relationship using the geometric data in the generated Cartesian coordinate system, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; decoding encoded data to obtain a prediction residual, the prediction residual being the difference between the attribute data and the predicted value; and generating the attribute data by adding the obtained prediction residual to the predicted value calculated based on the set reference relationship.

[0017] In the information processing apparatus and method according to one aspect of the present technology, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of the geometric data is transformed from a polar coordinate system to a Cartesian coordinate system; a reference relationship is set using the geometric data in the generated Cartesian coordinate system, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; a prediction residual is calculated, which is the difference between the attribute data of the processing target point and the predicted value calculated based on the set reference relationship; and the calculated prediction residual is encoded.

[0018] In the information processing apparatus and method according to another aspect of the present technology, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of the geometric data is transformed from a polar coordinate system to a Cartesian coordinate system; a reference relationship is set using the geometric data in the generated Cartesian coordinate system, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; encoded data is decoded to obtain a prediction residual, which is the difference between the attribute data and the predicted value; and the attribute data is generated by adding the obtained prediction residual to the predicted value calculated based on the set reference relationship.

Attached Figure Description

[0019] Figure 1 is a diagram illustrating predictive geometry coding.

[0020] Figure 2 is a diagram illustrating predictive geometry coding.

[0021] Figure 3 is a diagram illustrating LiDAR data.

[0022] Figure 4 is a diagram showing an example of a method for encoding attribute data.

[0023] Figure 5 is a diagram showing an example of coordinate transformation.

[0024] Figure 6 is a diagram showing an example of a prediction mode.

[0025] Figure 7 is a diagram showing an example of handling duplicate points.

[0026] Figure 8 is a diagram showing an example without a surrounding loop.

[0027] Figure 9 is a diagram showing an example of a surround scenario.

[0028] Figure 10 is a diagram showing an example of a surround scenario.

[0029] Figure 11 is a block diagram illustrating a main configuration example of the encoding device.

[0030] Figure 12 is a block diagram showing a main configuration example of the geometric data encoding unit.

[0031] Figure 13 is a block diagram showing a main configuration example of the attribute data encoding unit.

[0032] Figure 14 is a flowchart illustrating an example of the encoding process.

[0033] Figure 15 is a flowchart illustrating an example of the geometric data encoding process.

[0034] Figure 16 is a flowchart illustrating an example of the attribute data encoding process.

[0035] Figure 17 is a block diagram illustrating a main configuration example of the decoding device.

[0036] Figure 18 is a block diagram showing a main configuration example of the geometric data decoding unit.

[0037] Figure 19 is a block diagram showing a main configuration example of the attribute data decoding unit.

[0038] Figure 20 is a flowchart illustrating an example of the decoding process.

[0039] Figure 21 is a flowchart illustrating an example of the geometric data decoding process.

[0040] Figure 22 is a flowchart illustrating an example of the attribute data decoding process.

[0041] Figure 23 is a block diagram showing a main configuration example of a computer.

Detailed Implementation

[0042] The following describes a mode (hereinafter referred to as an implementation) for carrying out the contents of this disclosure. The description will be given in the following order:

[0043] 1. Coordinate system transformation

[0044] 2. First Embodiment (Encoding Device)

[0045] 3. Second Embodiment (Decoding Device)

[0046] 4. Supplementary Explanation

[0047] <1. Coordinate System Transformation>

[0048] <Supporting documents, terminology, and others>

[0049] The scope of disclosure in this technology includes not only the details described in the embodiments, but also the details described in the following non-patent documents known at the time of filing of this application.

[0050] [Non-patent literature 1]

[0051] (As mentioned above)

[0052] [Non-patent literature 2]

[0053] (As mentioned above)

[0054] In other words, the details described in the above non-patent documents, the details of other documents referenced in those non-patent documents, and the like also serve as a basis for determining whether the claims are supported.

[0055] <Point Cloud>

[0056] To date, point clouds, which represent three-dimensional structures (objects with three-dimensional shapes) as sets of points, have existed as 3D data. The data of a point cloud (also called point cloud data) includes position information (also called geometry) and attribute information (also called attributes) for each point. Attributes can include any type of information, for example the color information, reflectance information, and normal information of each point. Point clouds thus have a relatively simple data structure and can represent any three-dimensional structure with sufficient accuracy by using a sufficiently large number of points.

[0057] <Predictive Geometry Coding>

[0058] Because the amount of such point cloud data is relatively large, the data volume is usually compressed by encoding or a similar method when the data is recorded or transmitted. Various encoding methods have been proposed. For example, Non-Patent Literature 2 describes predictive geometry coding as a method for encoding geometric data.

[0059] In predictive geometry coding, the difference between the geometric data of each point and the predicted value of the geometric data (also known as the prediction residual) is calculated, and the prediction residual is encoded. The geometric data of other points are referenced when calculating the predicted value.

[0060] For example, as shown in Figure 1, a reference structure (also called a prediction tree) is formed that indicates which point's geometric data is referenced when the predicted value of each point's geometric data is calculated. In Figure 1, circles indicate points and arrows indicate reference relationships. Any method can be used to form this reference structure. For example, the reference structure is formed such that the geometric data of nearby points is referenced.

[0061] The prediction tree in the example of Figure 1 is formed of a point 11 (root vertex) that does not reference the geometric data of any other point, a point 12 (branch vertex with one child) that is referenced by one other point, a point 13 (branch vertex with three children) that is referenced by three other points, a point 14 (branch vertex with two children) that is referenced by two other points, and a point 15 (leaf vertex) that is not referenced by any other point.

[0062] Note that although only one point in Figure 1 is labeled 12, all points indicated by white circles are points 12. Similarly, although only one point is labeled 14, all points indicated by shaded circles in Figure 1 are points 14, and although only one point is labeled 15, all points indicated by gray circles in Figure 1 are points 15. This prediction tree structure is only an example, and prediction trees are not limited to the example of Figure 1. Any number of each of points 11 to 15 can therefore be used. The pattern of the number of referencing points is likewise not limited to the example of Figure 1; for instance, a point referenced by four or more points may be included.

[0063] Based on this reference structure (prediction tree), a predicted value for the geometric data of each point is calculated. For example, the predicted value is calculated using four methods (four modes), and the best predicted value is selected from the predicted values.

[0064] For example, assume that in the reference structure of points 21 to 24 shown in Figure 2, point 24 is the processing target point (target point pi) and the predicted value of the geometric data of point 24 is calculated. In the first mode, point 23 (P_parent), which is the reference target (also called the parent node) of point 24 in this reference structure, is taken as the predicted point 31 of point 24, and the geometric data of the predicted point 31 is used as the predicted value of the geometric data of point 24. The geometric data of the predicted point 31 (that is, the predicted value of the geometric data of point 24 in the first mode) is called q^(Delta).

[0065] In the second mode, consider the reference vector (the arrow between point 23 and point 22) whose start is point 23 and whose end is point 22 (P_grandparent), the parent node of point 23. The inverse of this reference vector is placed so that it starts at point 23, and its end is taken as the predicted point 32. The geometric data of the predicted point 32 is used as the predicted value of the geometric data of point 24. The geometric data of the predicted point 32 (that is, the predicted value of the geometric data of point 24 in the second mode) is called q^(Linear).

[0066] In the third mode, consider the reference vector (the arrow between point 22 and point 21) whose start is point 22 and whose end is point 21 (P_great-grandparent), the parent node of point 22. The inverse of this reference vector is placed so that it starts at point 23, and its end is taken as the predicted point 33. The geometric data of the predicted point 33 is used as the predicted value of the geometric data of point 24. The geometric data of the predicted point 33 (that is, the predicted value of the geometric data of point 24 in the third mode) is called q^(Parallelogram).

[0067] In the fourth mode, point 24 is treated as a root point (root vertex), and the geometric data of other points is not referenced. In other words, for point 24, the geometric data of point 24 itself is encoded instead of a prediction residual. In the reference structure of the example in Figure 2, point 24 references point 23, so this mode is excluded.

[0068] For each of these modes (the three modes in the example of Figure 2), the prediction residual (the difference between the predicted value and the geometric data of point 24) is calculated, and the smallest prediction residual is selected. In other words, the predicted point closest to point 24 is selected, and the prediction residual corresponding to that predicted point is selected.

[0069] By performing such processing for each point, the prediction residual of the point is calculated. Then the prediction residual is encoded. Thus, an increase in the amount of encoding can be suppressed.
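The three referencing modes and the selection of the smallest residual can be sketched as follows. This is a minimal illustration, not the codec's actual implementation: the function names `predict_geometry` and `select_mode` are hypothetical, the "smallest" residual is assumed to mean the smallest Euclidean norm, and the sample coordinates are made up.

```python
import math

def predict_geometry(parent, grandparent, great_grandparent):
    """Candidate predictions for the target point, keyed by mode name."""
    # First mode: the parent's position is used as-is.
    delta = tuple(parent)
    # Second mode: inverse of the vector parent -> grandparent, started at the parent.
    linear = tuple(2 * p - g for p, g in zip(parent, grandparent))
    # Third mode: inverse of the vector grandparent -> great-grandparent, started at the parent.
    parallelogram = tuple(
        p + g - gg for p, g, gg in zip(parent, grandparent, great_grandparent)
    )
    return {"delta": delta, "linear": linear, "parallelogram": parallelogram}

def select_mode(target, candidates):
    """Pick the mode whose predicted point is closest to the target (assumed criterion)."""
    best = min(candidates, key=lambda m: math.dist(target, candidates[m]))
    residual = tuple(t - q for t, q in zip(target, candidates[best]))
    return best, residual

# Made-up coordinates standing in for points 21 to 24 of Figure 2.
p21, p22, p23, p24 = (0, 0, 0), (1, 0, 0), (2, 1, 0), (3, 2, 0)
mode, residual = select_mode(p24, predict_geometry(p23, p22, p21))
print(mode, residual)
```

With these sample points, the step from point 23 to point 24 repeats the step from point 22 to point 23, so the second mode predicts point 24 exactly and only a zero residual would need to be encoded.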

[0070] <Polar coordinate system>

[0071] In the geometric data of a point cloud, the three-dimensional position of each point is generally represented in the Cartesian coordinate system (x, y, z), but it can also be represented in a coordinate system that uses angular components, such as the polar coordinate system. As shown in A of Figure 3, in the polar coordinate system the three-dimensional position of a point is represented by the distance r from a reference point (origin), the angle φ in the horizontal direction (on the XY plane), and the angle θ with respect to the Z axis (the direction perpendicular to the XY plane).
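The relationship between the two representations can be sketched as below. This assumes the common convention in which the angle with respect to the Z axis (theta) is measured down from the positive Z axis and the horizontal angle (called phi here) is measured from the positive X axis; the source does not spell out the exact convention, and the function names are hypothetical.

```python
import math

def polar_to_cartesian(r, phi, theta):
    """Convert (r, phi, theta) to (x, y, z).

    Assumption: theta is measured from the +Z axis, phi from the +X axis
    within the XY plane, both in radians.
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def cartesian_to_polar(x, y, z):
    """Inverse conversion under the same assumed convention."""
    r = math.sqrt(x * x + y * y + z * z)
    phi = math.atan2(y, x)
    theta = math.acos(z / r) if r > 0.0 else 0.0
    return r, phi, theta

# A point 2 units away, lying on the +X axis of the XY plane.
x, y, z = polar_to_cartesian(2.0, 0.0, math.pi / 2)
```

If a sensor instead reports elevation above the XY plane, the sine and cosine of theta swap roles; the rest of the conversion is unchanged.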

[0072] <LiDAR data>

[0073] Incidentally, LiDAR (Light Detection and Ranging, or Laser Imaging Detection and Ranging) data is known, which is obtained by emitting light and measuring the scattered light in order to analyze the distance to a distant object and the properties of that object.

[0074] To generate LiDAR data, for example, a linear scan is performed while changing the angle θ in the polar coordinate system. Such scans are then repeated while changing the angle φ in the polar coordinate system, so that the entire surroundings are scanned. As shown in B of Figure 3, scanning in this way generates LiDAR data 41 indicating the result of detecting objects around the observation point 41A. In other words, the LiDAR data 41 consists of a set of linear scan data. Specifically, as in the example in B of Figure 3, multiple pieces of linear scan data are distributed radially around the observation point 41A.

[0075] For geometric data having such a distribution, compared with the Cartesian coordinate system, using the polar coordinate system improves the correlation between points to a greater extent, thereby improving the encoding efficiency.

[0076] <Encoding of attribute data>

[0077] Incidentally, as with the geometric data described above, a method can be considered in which attribute data is encoded by referencing the data of other points and calculating prediction residuals. For attribute data, the correlation between points tends to be higher as the distance between them decreases. Furthermore, since attribute data does not contain position information, it does not change with the coordinate system. In other words, the correlation of attribute data between points depends not on the coordinate system but on the distance between the points (the shorter the distance, the higher the correlation).

[0078] In contrast, because of the order of LiDAR scans, points linked by a reference relationship formed in polar coordinates tend to be farther apart than points linked by one formed in Cartesian coordinates. Therefore, when predicted values are calculated from the attribute data of surrounding points and the prediction residuals are encoded, the coding efficiency in polar coordinates may be lower than that in Cartesian coordinates.
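A small numerical illustration of this effect is given below, in two dimensions for brevity. It is only a sketch under assumptions: the candidate points are made up, the helper names are hypothetical, and "closeness in polar coordinates" is modeled by treating (r, phi) directly as coordinates, which is one simple way a polar-domain reference relationship could end up ranking neighbors.

```python
import math

def polar2d_to_xy(p):
    """Convert a 2D polar point (r, phi) to Cartesian (x, y)."""
    r, phi = p
    return (r * math.cos(phi), r * math.sin(phi))

def nearest(target, candidates, convert=None):
    """Index of the candidate closest to target, optionally after converting coordinates."""
    conv = convert if convert is not None else (lambda p: p)
    t = conv(target)
    dists = [math.dist(t, conv(c)) for c in candidates]
    return dists.index(min(dists))

target = (10.0, 0.0)                      # (r, phi)
candidates = [(10.5, 0.5), (11.0, 0.0)]   # made-up neighbors

idx_polar = nearest(target, candidates)                 # compared as raw (r, phi) values
idx_cart = nearest(target, candidates, polar2d_to_xy)   # compared as actual positions
print(idx_polar, idx_cart)
```

Here the first candidate looks closer when the raw polar values are compared, yet it lies roughly five units away in space, while the second candidate is only one unit away. A reference relationship built on the polar values would therefore predict from the more distant, and likely less correlated, point.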

[0079] <Transformation to Cartesian coordinates>

[0080] Therefore, as shown in the top row of the table in Figure 4, when attribute data is encoded using references between points, the coordinate system of the geometric data used to form the reference relationships is set to a Cartesian coordinate system (Method 1). In other words, when the coordinate system of the geometric data is a polar coordinate system, it is transformed into a Cartesian coordinate system.

[0081] For example, an information processing method includes: for a point cloud representing a three-dimensional object as a set of points, transforming the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; setting a reference relationship using the geometric data in the generated Cartesian coordinate system, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; calculating a prediction residual, which is the difference between the attribute data of the processing target point and the predicted value calculated based on the set reference relationship; and encoding the calculated prediction residual.

[0082] For example, an information processing apparatus includes: a coordinate transformation unit that, for a point cloud representing a three-dimensional object as a set of points, transforms the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; a prediction residual calculation unit that calculates a prediction residual, the prediction residual being the difference between the attribute data of the processing target point and the predicted value calculated based on the reference relationship set by the reference relationship setting unit; and a prediction residual encoding unit that encodes the prediction residual calculated by the prediction residual calculation unit.

[0083] For example, an information processing method includes: for a point cloud representing a three-dimensional object as a set of points, transforming the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; setting a reference relationship using the geometric data in the generated Cartesian coordinate system, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; decoding encoded data to obtain a prediction residual, which is the difference between the attribute data and the predicted value; and generating the attribute data by adding the obtained prediction residual to the predicted value calculated based on the set reference relationship.

[0084] For example, an information processing apparatus includes: a coordinate transformation unit that, for a point cloud representing a three-dimensional object as a set of points, transforms the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target used to calculate a predicted value of attribute data of a processing target point; a prediction residual decoding unit that decodes encoded data to obtain a prediction residual, the prediction residual being the difference between the attribute data and the predicted value; and an attribute data generation unit that generates the attribute data by adding the prediction residual obtained by the prediction residual decoding unit to the predicted value calculated based on the reference relationship set by the reference relationship setting unit.

[0085] In this way, predicted values can be calculated by referencing closer points than in the case of polar coordinates, which suppresses the reduction in the coding efficiency of attribute data.
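The residual calculation on the encoding side and the reconstruction on the decoding side can be sketched as a round trip. This is a simplified illustration under assumptions: the reference relationship is given as a hypothetical list `parent_of` (the index of the referenced point, or `None` for a point with no reference), each point only references a point processed earlier, the parent's attribute is used directly as the predicted value, and entropy coding of the residuals is omitted.

```python
def encode_attributes(attrs, parent_of):
    """Residual = attribute of the target point minus its predicted value."""
    residuals = []
    for i, value in enumerate(attrs):
        p = parent_of[i]
        predicted = attrs[p] if p is not None else 0  # no reference: predict 0
        residuals.append(value - predicted)
    return residuals

def decode_attributes(residuals, parent_of):
    """Attribute = predicted value plus decoded residual."""
    attrs = []
    for i, res in enumerate(residuals):
        p = parent_of[i]
        predicted = attrs[p] if p is not None else 0  # parent is already reconstructed
        attrs.append(predicted + res)
    return attrs

# Made-up reflectance values and a made-up reference relationship.
attrs = [100, 102, 101, 140]
parent_of = [None, 0, 1, 1]

residuals = encode_attributes(attrs, parent_of)
restored = decode_attributes(residuals, parent_of)
print(residuals, restored)
```

Because both sides derive the same reference relationship from the same Cartesian geometry, only the residuals need to be transmitted. The closer the referenced point, the smaller the residuals tend to be, which is the effect the transformation to Cartesian coordinates aims for.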

[0086] Note that any encoding / decoding method can be used for geometric data. Furthermore, any encoding / decoding method can be used for attribute data, as long as the encoding / decoding method references the attribute data of another point, calculates the prediction residual, and encodes / decodes the prediction residual.

[0087] <Form of Reference Relationships>

[0088] As shown in the second row from the top of the table in Figure 4, the reference relationships for attribute data can be formed based on distances in the Cartesian coordinate system (Method 1-1). For example, a reference relationship can be set based on the distance from the processing target point in the Cartesian coordinate system. In this way, for example, a reference relationship can be set so that a point at a shorter distance is referenced, which further suppresses the reduction in coding efficiency.

[0089] As shown in the third row from the top of the table in Figure 4, the parent node and grandparent node for the attribute data can be set from among the points whose geometry has already been decoded (Method 1-2). For example, the parent node and grandparent node can be selected from the points that are decoded before the processing target point. Note that, in this specification, a decoded point refers to a point that is decoded before the processing target point during decoding. In this way, decoding of the attribute data can begin before all of the geometric data has been decoded. This allows the attribute data (in other words, the point cloud data) to be decoded with lower latency.

[0090] <Prediction Mode>

[0091] As shown in the fourth row from the top of the table in Figure 4, geometric data and attribute data can be encoded / decoded in the prediction mode, in which a prediction tree is formed (Method 1-3). In other words, the geometric data can be encoded / decoded in the prediction mode (predictive geometry coding). Likewise, the attribute data can be encoded / decoded in the prediction mode (also called predictive attribute coding).

[0092] For example, the coordinate system of the geometric data to be encoded in prediction mode can be transformed from polar coordinates to Cartesian coordinates. Alternatively, the coordinate system of the geometric data generated by decoding the encoded geometric data in prediction mode can be transformed from polar coordinates to Cartesian coordinates.

[0093] By associating the encoding / decoding methods for geometric data with those for attribute data, the attribute data corresponding to the geometric data to be decoded can be decoded before all geometric data is decoded. Therefore, point cloud data can be decoded with lower latency.

[0094] <Parallelization>

[0095] In this case, as shown in the fifth row from the top of the table in Figure 4, geometric data and attribute data can be encoded or decoded for each node in the reference structure (Method 1-3-1). For example, a reference relationship can be set for the attribute data corresponding to the node that is the processing target of geometry encoding in the prediction tree of the prediction mode. Alternatively, a reference relationship can be set for the attribute data corresponding to the node that is the processing target of geometry decoding in the prediction tree of the prediction mode.

[0096] For example, as shown in Figure 5, assuming that the geometry tree information has already been encoded in a predetermined order, the prediction mode, coefficients, and position of the target node are first decoded for the geometric data. Then, if the geometry is in polar coordinates, it is transformed into Cartesian coordinates. After the transformation, the points to be referenced are identified from among the decoded points. For example, the closest point in Euclidean distance may be set as the parent node, and the second closest point in Euclidean distance may be set as the grandparent node. The parent node is the node to which the target node belongs in the reference structure (tree structure), and the grandparent node is the parent node of that parent node. Then, for the attribute data, the prediction mode, coefficients, and attribute of the target node are decoded.
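The selection of the parent and grandparent from already decoded points can be sketched as follows. This is a minimal, brute-force illustration: the function name `select_references` is hypothetical, the positions are assumed to be in Cartesian coordinates already, and a real implementation would likely limit the search range or use a spatial index rather than sorting every decoded point.

```python
import math

def select_references(target_xyz, decoded_xyz):
    """Return (parent_index, grandparent_index) among the decoded points.

    The closest decoded point in Euclidean distance becomes the parent and
    the second closest becomes the grandparent; None is returned where not
    enough decoded points exist.
    """
    order = sorted(
        range(len(decoded_xyz)),
        key=lambda i: math.dist(target_xyz, decoded_xyz[i]),
    )
    parent = order[0] if len(order) > 0 else None
    grandparent = order[1] if len(order) > 1 else None
    return parent, grandparent

# Made-up Cartesian positions of points decoded before the target node.
decoded = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
parent, grandparent = select_references((1.2, 0.1, 0.0), decoded)
print(parent, grandparent)
```

Since only points decoded before the target node are searched, the decoder can repeat exactly the same selection as soon as the geometry of the target node is available, without waiting for the rest of the geometry.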

[0097] In this way, corresponding geometric data and attribute data can be decoded with low latency, and therefore point cloud data can be decoded with low latency. Furthermore, the processing of geometric data and attribute data can be partially shared. This reduces the amount of encoding / decoding processing, allows encoding / decoding to be performed at higher speed, and suppresses increases in encoding / decoding cost. In the case of hardware implementation, the increase in circuit size can also be suppressed.

[0098] <Prediction mode for attribute data>

[0099] like Figure 4 As shown in the sixth row from the top of the table, a prediction mode for attribute data can be set, and predicted values ​​for the attribute data can be calculated within the set prediction mode (Method 1-3-2). For example, a prediction mode can be set, predicted values ​​can be calculated based on the prediction mode, and prediction residuals can be calculated using the predicted values ​​and attribute data. For example, a prediction mode to be applied can be selected from several pre-prepared candidates. Alternatively, encoded data can be decoded to generate a prediction mode, predicted values ​​can be calculated using the prediction mode, and attribute data can be calculated using the predicted values ​​and prediction residuals.

[0100] Figure 6 It is a bar chart showing the attribute data of the target node (target), the attribute data of the parent node (parent), and the attribute data of the grandparent node (parent's parent).

[0101] For example, the attribute data of the parent node can be used as the predicted value of the attribute data of the target node. Specifically, the difference between the attribute data of the parent node and the attribute data of the target node (double-headed arrow 61 in Figure 6) can be used as the prediction residual (hereinafter also referred to as the first prediction mode). The attribute data of the grandparent node can be used as the predicted value of the attribute data of the target node. Specifically, the difference between the attribute data of the grandparent node and the attribute data of the target node (double-headed arrow 62 in Figure 6) can be used as the prediction residual (hereinafter also referred to as the second prediction mode). The average of the attribute data of the parent node and the attribute data of the grandparent node can be used as the predicted value of the attribute data of the target node. Specifically, the difference between this average and the attribute data of the target node (double-headed arrow 63 in Figure 6) can be used as the prediction residual (hereinafter also referred to as the third prediction mode). Linear prediction of the attribute data of the target node can be performed based on the attribute data of the parent node and the attribute data of the grandparent node. Specifically, the difference between the predicted value calculated through linear prediction and the attribute data of the target node (double-headed arrow 64 in Figure 6) can be used as the prediction residual (hereinafter also referred to as the fourth prediction mode).
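The four prediction modes can be illustrated with a short sketch. The function names `predict` and `residual` are hypothetical. The text does not give a formula for the linear prediction of the fourth mode, so the extrapolation `2 * parent - grandparent` used here is an assumption chosen for illustration only.

```python
def predict(mode, parent, grandparent):
    """Predicted attribute value for prediction modes 1 to 4."""
    if mode == 1:    # first mode: attribute of the parent node
        return parent
    if mode == 2:    # second mode: attribute of the grandparent node
        return grandparent
    if mode == 3:    # third mode: average of parent and grandparent
        return (parent + grandparent) / 2
    if mode == 4:    # fourth mode: linear prediction (assumed form)
        return 2 * parent - grandparent
    raise ValueError(f"unknown prediction mode: {mode}")

def residual(mode, target, parent, grandparent):
    """Prediction residual: target attribute minus predicted value."""
    return target - predict(mode, parent, grandparent)

# Hypothetical attribute values for the target, parent, and grandparent.
for mode in (1, 2, 3, 4):
    print(mode, residual(mode, target=50, parent=48, grandparent=44))
```

With these hypothetical values the fourth mode gives a small residual because the attribute keeps growing along the chain, whereas on flat data the first mode would tend to do as well.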

[0102] Of course, these are just examples, and any other prediction mode can be used. For instance, a prediction mode similar to the one applied to geometric data could be used.

[0103] The prediction mode to be applied can be freely selected from the examples above. For example, a predetermined prediction mode can be applied. The prediction mode applied to the prediction of geometric data can also be applied to the prediction of attribute data corresponding to that geometric data. In other words, the same prediction mode can be applied to both the geometric data and attribute data of a point. In these cases, the prediction of attribute data does not require signaling of the applied prediction mode. In other words, it is not necessary to encode the information indicating the applied prediction mode, and to provide this information to the decoding side, for example, by adding the information to the encoded data of the attribute data. Therefore, the reduction in coding efficiency can be suppressed.

[0104] For example, multiple candidate prediction modes can be prepared, and the prediction mode to be applied can be selected from these candidates. For example, the first to fourth prediction modes described above can be prepared as candidates, and one of them can be selected as the prediction mode for attribute data. For example, cost calculations can be used to select the best prediction mode.
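A cost-based selection among the candidates might look like the following sketch. The cost measure (absolute value of the residual), the tie-breaking rule (lowest mode number), the linear-prediction formula for the fourth candidate, and the names `candidate_predictions` and `select_mode` are all assumptions made for illustration; an actual encoder could use any cost function, such as an estimate of the coded bit length.

```python
def candidate_predictions(parent, grandparent):
    """Predicted values for the four candidate modes, keyed by mode number."""
    return {
        1: parent,
        2: grandparent,
        3: (parent + grandparent) / 2,
        4: 2 * parent - grandparent,  # assumed linear extrapolation
    }

def select_mode(target, parent, grandparent):
    """Return (mode, residual) for the candidate with the lowest cost."""
    preds = candidate_predictions(parent, grandparent)
    # Cost = |residual|; ties go to the lower mode number.
    mode = min(preds, key=lambda m: (abs(target - preds[m]), m))
    return mode, target - preds[mode]

# Hypothetical attribute values.
print(select_mode(target=40, parent=48, grandparent=44))
```

The selected mode number is the piece of information that would have to be signaled to the decoding side when candidates are used.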

[0105] Attributes can include any type of information. For example, they can include color, reflectivity, normal vectors, and timestamps. The prediction accuracy achieved for an attribute therefore depends on the information included in the attribute. In other words, which prediction mode is best suited to improving coding efficiency depends on the nature of the information included in the attribute.

[0106] Therefore, preparing multiple candidates, as described above, allows for the application of a more appropriate prediction mode, regardless of the information included in the attributes. Note that in this example, signaling for the prediction mode to be applied is required. In other words, in this case, for example, information indicating the applied prediction mode is encoded and provided to the decoding side in such a way that this information is added to the encoded data of the attribute data.

[0107] Note that in this example, any number of candidates can be used, as long as there are two or more candidates. The number of candidates can vary depending on the circumstances.

[0108] Note that in any example, for attribute data that includes multiple types of information (attribute data consisting of multiple elements), a prediction mode can be set for each type of information (each element). In other words, the prediction mode can differ from one type of information (element) to another.

[0109] Thus, predicted values can be calculated using a wider variety of computational methods. For example, predicted values can be calculated based on the properties of the attribute data. Therefore, the reduction in coding efficiency can be suppressed.

[0110] <Repetition Points>

[0111] As shown in the seventh row from the top of the table in Figure 4, the attribute data of duplicate points whose geometry matches each other can be sorted and then processed (Method 1-3-3).

[0112] For example, for multiple attribute data with the same corresponding geometric information, the multiple attribute data can be sorted according to the magnitude of their values, and the difference between consecutive attribute data in the sorting order can be calculated.

[0113] In a point cloud, multiple points can have the same geometric values but different attribute values. In other words, multiple points with different attribute values can exist at the same location. Such points are also called duplicate points.

[0114] For geometric data of repeated points, only the geometric data of one point is encoded. Then, the number of repetitions is encoded. Therefore, on the decoding side, the geometric data is repeated by the number of repetitions. As described above, attributes can vary from point to point. Therefore, residuals in the attribute data between repeated points can be calculated, and these residuals can be encoded.

[0115] For example, as in the table shown in A of Figure 7, assume there are five duplicate points Idx0 to Idx4. In this case, the difference between the attributes of Idx0 and Idx1, the difference between the attributes of Idx1 and Idx2, the difference between the attributes of Idx2 and Idx3, and the difference between the attributes of Idx3 and Idx4 can be calculated and encoded.

[0116] As in the table shown in B of Figure 7, the duplicate points can be sorted in descending order of attribute value before their differences are calculated. Each difference then becomes "1", and the sum of the absolute values of the differences, which is "15" in the case of A of Figure 7, is reduced to "5" in the case of B of Figure 7. Therefore, for the attribute data of duplicate points, the increase in the amount of code can be suppressed, and the decrease in coding efficiency can be suppressed.
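The effect of sorting can be checked with a small sketch. The attribute values below are hypothetical and are not the values of Figure 7; the helper names `chained_residuals` and `cost` are also hypothetical, and predicting the first value from a fixed value of 0 is an assumption of this sketch.

```python
def chained_residuals(values, first_prediction=0):
    """Residual of each value against the value processed just before it."""
    residuals = []
    prediction = first_prediction
    for v in values:
        residuals.append(v - prediction)
        prediction = v  # the current value predicts the next one
    return residuals

def cost(residuals):
    """Sum of absolute residuals, used here as a rough proxy for code amount."""
    return sum(abs(r) for r in residuals)

attrs = [3, 1, 5, 2, 4]  # hypothetical attributes of five duplicate points
unsorted_cost = cost(chained_residuals(attrs))
sorted_cost = cost(chained_residuals(sorted(attrs, reverse=True)))
print(unsorted_cost, sorted_cost)
```

Sorting is harmless here because duplicate points share the same geometry, so the order in which their attributes are reconstructed does not change the decoded point cloud.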

[0117] <Wraparound>

[0118] As shown in the bottom row of the table in Figure 4, wraparound can be applied (Method 1-3-4). For example, wraparound can be applied to calculate prediction residuals. Alternatively, wraparound can be applied to generate attribute data.

[0119] For example, for unsigned N-bit attribute data, the residual of the attribute data is signed and can extend up to (N+1) bits. The bit length can therefore increase. Instead of encoding the residual directly, a wrapped value can be encoded so that the residual stays within N bits.

[0120] For example, as shown in Figure 8, for unsigned 8-bit attribute data, when the residual is encoded without applying wraparound, the residual will be in the range [-255, 255]. In other words, 9 bits are needed to represent the residual (the bit length of the residual is 9 bits).

[0121] In contrast, as shown in Figure 9, when wraparound is applied and the residual is then encoded, if the residual between the attribute to be processed and the predicted value falls outside the gray range in the figure, 256 is added to or subtracted from the residual. The residual therefore falls within the range [-128, 127]. In other words, the residual can be represented with 8 bits.

[0122] More specifically, when wraparound is applied, the operation shown in the square on the left of Figure 10 (encoding process) is performed in encoding the residual. That is, when the residual is outside the range of -128 to 127, 256 is added to or subtracted from the residual to bring it back into that range (wraparound).

[0123] In contrast, during decoding, the operation shown in the square on the right of Figure 10 (decoding process) is performed. That is, when the reconstructed value is outside the range of 0 to 255, 256 is added to or subtracted from the reconstructed value to bring it back into that range (unwrapping).

[0124] For example, for a predicted value predicted = 200 and an attribute value of the processing target point target = 30, the residual is -170. Applying wraparound to this residual gives a residual of 86.

[0125] For the residual residual = 86 and the predicted value predicted = 200, the reconstructed value is reconstructed = 286. Unwrapping this reconstructed value gives reconstructed = 30. The attribute value of the processing target point, target = 30, is thus reconstructed.
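The wraparound and unwrapping operations, together with the worked example above, can be written as a short sketch. The function names `wrap_residual` and `unwrap_reconstructed` and the `bits` parameter are hypothetical; the arithmetic follows the ranges given in the text for unsigned 8-bit attribute data.

```python
def wrap_residual(residual, bits=8):
    """Fold a residual into [-2**(bits-1), 2**(bits-1) - 1] (encoder side)."""
    span = 1 << bits       # 256 for 8-bit data
    half = span >> 1       # 128 for 8-bit data
    if residual < -half:
        residual += span
    elif residual > half - 1:
        residual -= span
    return residual

def unwrap_reconstructed(value, bits=8):
    """Fold a reconstructed value back into [0, 2**bits - 1] (decoder side)."""
    span = 1 << bits
    if value < 0:
        value += span
    elif value > span - 1:
        value -= span
    return value

predicted, target = 200, 30
res = wrap_residual(target - predicted)       # -170 wraps to 86
rec = unwrap_reconstructed(predicted + res)   # 286 unwraps to 30
print(res, rec)
```

The round trip is lossless for every pair of 8-bit predicted and target values, because adding or subtracting 256 on the encoder side is exactly undone by the range check on the decoder side.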

[0126] In this way, the increase in the amount of coding for the prediction residual can be suppressed. Therefore, the decrease in coding efficiency can be suppressed.

[0127] <2. First Implementation Method>

[0128] <Encoding device>

[0129] Figure 11 is a block diagram illustrating an example configuration of an encoding device as one aspect of an information processing apparatus to which the present technology is applied. The encoding device 100 shown in Figure 11 is an apparatus that encodes point clouds (3D data). The present technology (for example, the various methods described with reference to Figures 1 to 10) can be applied to the encoding device 100.

[0130] Note that Figure 11 shows the main components, such as processing units and data flows, and these are not necessarily all of them. In other words, the encoding device 100 may include processing units not shown as blocks in Figure 11, and processing or data flows not shown as arrows or the like in Figure 11.

[0131] As shown in Figure 11, the encoding device 100 includes a reference structure forming unit 101, a stack 102, a geometric data encoding unit 103, a coordinate transformation unit 104, an attribute data encoding unit 105, and a child node processing unit 106.

[0132] The reference structure forming unit 101 generates a reference structure (prediction tree) for encoding the point cloud of the supplied geometric data. The reference structure forming unit 101 supplies the geometric data and attribute data of the processing target point (the processing target node in the prediction tree) to the stack 102 according to the formed reference structure.

[0133] Stack 102 stores information in a last-in, first-out (LIFO) manner. For example, stack 102 stores geometric data, attribute data, and other information for each point supplied from reference structure forming unit 101. Stack 102 also supplies the most recently saved information to subsequent processing units. For example, stack 102 supplies the geometric data of the most recently saved point to geometric data encoding unit 103 as the geometric data for the target point. Stack 102 also supplies the attribute data of the most recently saved point to attribute data encoding unit 105 as the attribute data for the target point. Furthermore, stack 102 supplies the child node information of the most recently saved point (the target node) to child node processing unit 106. Child node information is information about other nodes (also called child nodes) in the tree structure that belong to the target node.

[0134] The geometry data encoding unit 103 acquires the geometry data of the processing target point supplied from the stack 102. The geometry data encoding unit 103 encodes the geometry data to generate encoded data. For example, the geometry data encoding unit 103 can encode the geometry data in prediction mode.

[0135] The geometric data encoding unit 103 outputs the generated encoded data to the outside of the encoding device 100 as encoded geometric data. For example, the encoded geometric data is transmitted to any other device, such as a decoding device, via a transmission line. The encoded geometric data is also written to and stored, for example, in any storage medium. The geometric data encoding unit 103 also supplies the processed geometric data of the target point to the coordinate transformation unit 104.

[0136] The coordinate transformation unit 104 acquires the geometric data of the target point being processed, supplied by the geometric data encoding unit 103. As described above in <Transformation to Cartesian Angular Coordinates>, when the coordinate system of the geometric data is a polar coordinate system, the coordinate transformation unit 104 transforms the coordinate system of the geometric data from a polar coordinate system to a Cartesian coordinate system. In other words, the coordinate transformation unit 104 uses the geometric data in the polar coordinate system to generate the geometric data in the Cartesian coordinate system. The coordinate transformation unit 104 supplies the generated geometric data in the Cartesian coordinate system to the attribute data encoding unit 105.
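The role of the coordinate transformation unit 104 can be sketched as follows. The excerpt does not define the polar representation, so this sketch assumes a point given as (radius, azimuth, elevation) with angles in radians; the function names `polar_to_cartesian` and `to_cartesian` are hypothetical.

```python
import math

def polar_to_cartesian(r, azimuth, elevation):
    """Convert an assumed (radius, azimuth, elevation) point to (x, y, z)."""
    x = r * math.cos(elevation) * math.cos(azimuth)
    y = r * math.cos(elevation) * math.sin(azimuth)
    z = r * math.sin(elevation)
    return x, y, z

def to_cartesian(point, is_polar):
    """Transform only when the geometry is polar; otherwise pass it through."""
    return polar_to_cartesian(*point) if is_polar else point

# Hypothetical point: radius 2, azimuth 90 degrees, elevation 0.
print(to_cartesian((2.0, math.pi / 2, 0.0), is_polar=True))
```

The pass-through branch mirrors the case in which the geometric data is already in the Cartesian coordinate system and no transformation is needed.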

[0137] The attribute data encoding unit 105 acquires attribute data of the processing target point supplied from the stack 102. The attribute data encoding unit 105 also acquires geometric data in a Cartesian coordinate system supplied from the coordinate transformation unit 104. The attribute data encoding unit 105 encodes the acquired attribute data to generate encoded data. Furthermore, the attribute data encoding unit 105 constructs reference relationships using the geometric data in the Cartesian coordinate system acquired from the coordinate transformation unit 104 and calculates predicted values for the attribute data. Then, the attribute data encoding unit 105 calculates prediction residuals using the predicted values and encodes the prediction residuals. In other words, the attribute data encoding unit 105 encodes the attribute data using geometric data to generate encoded data.

[0138] The attribute data encoding unit 105 outputs the generated encoded data to the outside of the encoding device 100 as encoded data of the attribute data of the processing target point. For example, the encoded data of the attribute data can be sent to any other device, such as a decoding device, via a transmission line. The encoded data of the attribute data can also be written to and stored in, for example, any storage medium.

[0139] The child node processing unit 106 acquires child node information of the processing target point supplied from the stack 102. The child node processing unit 106 encodes the acquired child node information to generate encoded data. The child node processing unit 106 outputs the generated encoded data to the outside of the encoding device 100 as encoded data of the child node information of the processing target point. For example, the encoded data of the child node information can be sent to any other device, such as a decoding device, via a transmission line. The encoded data of the child node information can also be written to and stored, for example, in any storage medium.

[0140] Furthermore, when encoding the child nodes of the target node, the child node processing unit 106 controls the reference structure forming unit 101 to supply the geometric data, attribute data, child node information, etc. of the child node to the stack 102, and the stack 102 then stores the geometric data, attribute data, child node information, etc. of the child node.

[0141] In the encoding device 100 described above, the geometric data encoding unit 103, the coordinate transformation unit 104, and the attribute data encoding unit 105 can perform their processing by applying the present technology described above in <1. Coordinate System Transformation>.

[0142] For example, when the coordinate transformation unit 104 transforms geometric data in the polar coordinate system to geometric data in the Cartesian coordinate system, the attribute data encoding unit 105 can encode the attribute data using a reference relationship set based on the geometric data in the Cartesian coordinate system. Therefore, the attribute data encoding unit 105 can reference a closer point compared to the case where the reference relationship is set based on the geometric data in the polar coordinate system. Therefore, compared to the case where the reference relationship is set based on the geometric data in the polar coordinate system, the attribute data encoding unit 105 can improve prediction accuracy. Thus, the reduction in the encoding efficiency of attribute data can be suppressed.

[0143] These processing units (the reference structure forming unit 101 to the child node processing unit 106) can have any configuration. For example, each processing unit can be configured with logic circuitry that implements the above-described processes. Each processing unit can also include, for example, a central processing unit (CPU), read-only memory (ROM), and random access memory (RAM), and use them to execute a program that implements the above-described processes. Of course, each processing unit can have both configurations, with some of the above-described processes implemented by logic circuitry and the others implemented by executing a program. The processing units can be configured independently of one another; for example, some processing units can implement part of the above-described processes with logic circuitry, others can implement them by executing a program, and still others can implement them with both logic circuitry and program execution.

[0144] <Geometric Data Encoding Unit>

[0145] Figure 12 is a block diagram illustrating a main configuration example of the geometric data encoding unit 103. Note that Figure 12 shows the main components, such as processing units and data flows, and these are not necessarily all of them. In other words, the geometric data encoding unit 103 may include processing units not shown as blocks in Figure 12, and processing or data flows not shown as arrows or the like in Figure 12.

[0146] As shown in Figure 12, the geometric data encoding unit 103 includes a prediction mode setting unit 141, a prediction residual calculation unit 142, an encoding unit 143, and a prediction point generation unit 144.

[0147] The prediction mode setting unit 141 acquires the geometric data of the processing target point supplied from the stack 102. The prediction mode setting unit 141 sets the prediction mode using this geometric data. The prediction mode setting unit 141 supplies the geometric data and information indicating the set prediction mode to the prediction residual calculation unit 142. The prediction mode setting unit 141 also acquires the prediction points generated by the prediction point generation unit 144.

[0148] The prediction residual calculation unit 142 acquires geometric data and information indicating the prediction mode supplied by the prediction mode setting unit 141. The prediction residual calculation unit 142 also acquires prediction points generated by the prediction point generation unit 144. Under the prediction mode set by the prediction mode setting unit 141, the prediction residual calculation unit 142 calculates the predicted value of the geometric data of the target point and calculates the prediction residual using the geometric data of the target point and the predicted value. The prediction residual calculation unit 142 supplies the calculated prediction residual to the encoding unit 143.

[0149] Encoding unit 143 acquires the prediction residual of the processing target point supplied by prediction residual calculation unit 142. Encoding unit 143 encodes the prediction residual to generate coded data. Encoding unit 143 outputs the generated coded data as coded data of geometric data (coded data of geometric prediction residual).

[0150] The prediction point generation unit 144 acquires information indicating the prediction mode set by the prediction mode setting unit 141, and generates prediction points based on this information (that is, under the set prediction mode). The prediction point generation unit 144 supplies the generated prediction points to the prediction mode setting unit 141 and the prediction residual calculation unit 142.

[0151] The geometric data encoding unit 103 has the above configuration, encodes each node according to the prediction tree, and performs geometric data encoding in prediction mode.

[0152] <Attribute Data Encoding Unit>

[0153] Figure 13 is a block diagram illustrating a main configuration example of the attribute data encoding unit 105. Note that Figure 13 shows the main components, such as processing units and data flows, and these are not necessarily all of them. In other words, the attribute data encoding unit 105 may include processing units not shown as blocks in Figure 13, and processing or data flows not shown as arrows or the like in Figure 13.

[0154] As shown in Figure 13, the attribute data encoding unit 105 includes a reference relationship setting unit 161, a prediction mode setting unit 162, a prediction residual calculation unit 163, and an encoding unit 164.

[0155] The reference relationship setting unit 161 acquires geometric data in the Cartesian coordinate system supplied by the coordinate transformation unit 104. Based on the geometric data in the Cartesian coordinate system, the reference relationship setting unit 161 sets a reference relationship that indicates a reference target for calculating the predicted values of the attribute data of the target point.

[0156] For example, the reference relationship setting unit 161 can set reference relationships based on the distance from the processing target point in the Cartesian coordinate system. The reference relationship setting unit 161 can also set the parent node or grandparent node of the processing target point (processing target node). Furthermore, the reference relationship setting unit 161 can select parent and grandparent nodes from points to be decoded before the processing target point during decoding. Additionally, the reference relationship setting unit 161 can set reference relationships for attribute data corresponding to the processing target node of the geometric data encoding unit 103 in the prediction tree during prediction mode. The reference relationship setting unit 161 supplies information indicating the set reference relationships to the prediction mode setting unit 162.

[0157] The prediction mode setting unit 162 acquires attribute data of the processing target point supplied from the stack 102. The prediction mode setting unit 162 also acquires information indicating reference relationships from the reference relationship setting unit 161. The prediction mode setting unit 162 sets a prediction mode for calculating predicted values based on this information. At this time, the prediction mode setting unit 162 can select the prediction mode to apply from a plurality of candidates prepared in advance. The prediction mode setting unit 162 is not limited to this and can set the prediction mode in any of the ways described in the <Prediction Mode for Attribute Data> section above. The prediction mode setting unit 162 supplies the information indicating the set prediction mode and the attribute data to the prediction residual calculation unit 163.

[0158] The prediction residual calculation unit 163 acquires information and attribute data indicating the prediction mode supplied from the prediction mode setting unit 162. The prediction residual calculation unit 163 uses the information and attribute data indicating the prediction mode supplied from the prediction mode setting unit 162 to calculate the predicted value of the processed target point. The prediction residual calculation unit 163 calculates the prediction residual using the calculated predicted value and attribute data of the processed target point. Specifically, the prediction residual calculation unit 163 calculates the prediction residual, which is the difference between the attribute data of the processed target point and the predicted value calculated based on the reference relationship set by the reference relationship setting unit 161.

[0159] In calculating the prediction residual, for multiple attribute data whose corresponding geometry is the same, the prediction residual calculation unit 163 can sort the attribute data by magnitude and calculate the difference between consecutive attribute data in the sorted order. For example, as described with reference to Figure 7, the prediction residual calculation unit 163 sorts multiple attribute data with the same corresponding geometry in descending order of value. The prediction residual calculation unit 163 then calculates a first difference by subtracting a predicted value (e.g., "0") from the top attribute data in the sorted order. Next, it calculates a second difference by subtracting the top attribute data from the second attribute data in the sorted order. In other words, the top attribute data (the attribute data immediately preceding the one being processed) is used as the predicted value. Subsequently, in the same manner, the prediction residual calculation unit 163 calculates each difference by subtracting, for each pair of consecutive attribute data in the sorted order, the preceding attribute data from the following one. As described later, the differences calculated in this way are supplied to the encoding unit 164 as prediction residuals and encoded. In this way, as described above in <Repetition Points>, the reduction in encoding efficiency can be suppressed.

[0160] When calculating the prediction residual, the prediction residual calculation unit 163 can apply a wraparound to calculate the prediction residual. In this way, as described in <Wraparound> above, the reduction in coding efficiency can be suppressed.

[0161] The prediction residual calculation unit 163 supplies the calculated prediction residual to the encoding unit 164.

[0162] Encoding unit 164 acquires the prediction residual supplied by prediction residual calculation unit 163. Encoding unit 164 encodes the prediction residual to generate coded data. Encoding unit 164 outputs the generated coded data as coded data for processing the attribute data of the target point.

[0163] Thus, the encoding unit 164 encodes prediction residuals calculated using predicted values based on geometric data in a Cartesian coordinate system. Therefore, compared to encoding prediction residuals calculated using predicted values based on geometric data in a polar coordinate system, the encoding unit 164 can better suppress the reduction in coding efficiency.

[0164] <Encoding Process>

[0165] Next, the processing performed by the encoding device 100 will be described. The encoding device 100 encodes the point cloud data by performing encoding processing. An example of the flow of the encoding process will be described with reference to the flowchart in Figure 14.

[0166] When the encoding process begins, the reference structure forming unit 101 of the encoding device 100 performs a reference structure forming process to form a reference structure (prediction tree) of geometric data in step S101.

[0167] In step S102, the reference structure forming unit 101 stores the geometric data of the top node of the reference structure formed in step S101 in the stack 102.

[0168] In step S103, the geometric data encoding unit 103 retrieves the latest stored geometric data of a point (node) from the stack 102. The attribute data encoding unit 105 retrieves the attribute data of the point. The child node processing unit 106 retrieves the child node information of the point.

[0169] In step S104, the geometric data encoding unit 103 performs geometric data encoding processing to encode the geometric data.

[0170] In step S105, the coordinate transformation unit 104 determines whether the geometric data is in the polar coordinate system. If it is determined that the geometric data is in the polar coordinate system, the process proceeds to step S106.

[0171] In step S106, coordinate transformation unit 104 performs a coordinate transformation to change the coordinate system of the geometric data from polar coordinates to Cartesian coordinates. When the processing in step S106 ends, the process proceeds to step S107. If it is determined in step S105 that the coordinate system of the geometric data is not polar coordinates (but Cartesian coordinates), the processing in step S106 is skipped (bypassed), and the process then proceeds to step S107.

[0172] In step S107, the attribute data encoding unit 105 performs attribute data encoding processing to encode the attribute data.

[0173] In step S108, the child node processing unit 106 encodes the child node information.

[0174] In step S109, the child node processing unit 106 determines whether to encode the child nodes of the processing target point. If the processing target point is not a leaf node of a tree structure but has child nodes, and it is determined that the child nodes also need to be encoded, the processing proceeds to step S110.

[0175] In step S110, the child node processing unit 106 controls the reference structure forming unit 101 to supply child node information (geometric data, attribute data, child node information, etc.) to the stack 102, which then stores the child node information. When the processing in step S110 ends, the processing proceeds to step S111. If it is determined in step S109 that no child nodes are to be encoded, for example, if there are no child nodes of the processing target point (e.g., the processing target point is a leaf node of a tree structure), the processing in S110 is skipped (bypassed), and the processing proceeds to step S111.

[0176] In step S111, the geometric data encoding unit 103 determines whether the stack 102 is empty. If it is determined that the stack 102 is not empty (that is, it stores information of at least one point), the process returns to step S103. Therefore, the processing from step S103 to step S111 is performed with the most recently stored point in the stack 102 as the processing target point.

[0177] While repeating this process, the encoding process ends when it is determined in step S111 that the stack is empty.
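The stack-driven loop of steps S102 to S111 can be sketched as follows. This is a simplified illustration, not the device itself: the function name `encode_tree`, the dictionary-based tree, and the `encode_node` callback (standing in for steps S104 to S108) are hypothetical, and every child node is assumed to be encoded.

```python
def encode_tree(root, children, encode_node):
    """Visit prediction-tree nodes in the order the stack-based loop implies."""
    stack = [root]                            # S102: store the top node
    while stack:                              # S111: repeat until the stack is empty
        node = stack.pop()                    # S103: take the most recently stored node
        encode_node(node)                     # S104 to S108: geometry, attributes, child info
        stack.extend(children.get(node, []))  # S109, S110: store the child nodes

# Hypothetical prediction tree: root has children a and b, a has child c.
children = {"root": ["a", "b"], "a": ["c"]}
order = []
encode_tree("root", children, order.append)
print(order)
```

Because the stack is last-in, first-out, the traversal is depth-first: a node's most recently stored child is processed before its earlier-stored siblings.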

[0178] <Geometric Encoding Process>

[0179] Next, an example of the flow of the geometric data encoding process performed in step S104 of Figure 14 will be described with reference to the flowchart in Figure 15.

[0180] When the geometric data encoding process begins, the prediction mode setting unit 141 sets the prediction mode of the geometric data in step S141.

[0181] In step S142, the prediction residual calculation unit 142 calculates the prediction residual of the geometric data.

[0182] In step S143, the encoding unit 143 encodes the prediction mode information that indicates the prediction mode of the geometric data set in step S141.

[0183] In step S144, the encoding unit 143 encodes the prediction residual of the geometric data calculated in step S142.

[0184] In step S145, the prediction point generation unit 144 calculates and adds prediction points.

[0185] When the processing in step S145 is completed, the geometric data encoding process ends, and the processing returns to Figure 14.

[0186] <Attribute Data Encoding Process>

[0187] Next, an example of the flow of the attribute data encoding process performed in step S107 of Figure 14 will be described with reference to the flowchart of Figure 16.

[0188] When the attribute data encoding process begins, in step S161, the reference relationship setting unit 161 sets the parent node and grandparent node from the decoded points based on the geometric data in the Cartesian coordinate system.

[0189] In step S162, the prediction mode setting unit 162 sets the prediction mode for the attribute data.

[0190] In step S163, the prediction residual calculation unit 163 calculates the predicted values of the attribute data of the target point in the prediction mode set in step S162. Then, the prediction residual calculation unit 163 calculates the prediction residuals of the attribute data of the target point as the difference between the attribute data of the target point and the predicted values.

[0191] In step S164, the encoding unit 164 encodes the prediction mode information that indicates the prediction mode of the attribute data set in step S162.

[0192] In step S165, the encoding unit 164 encodes the prediction residuals of the attribute data calculated in step S163.

[0193] When the processing in step S165 is completed, the attribute data encoding process ends, and the processing returns to Figure 14.
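Steps S162 to S165 can be illustrated with a minimal sketch. The three prediction modes (0: no prediction, 1: parent, 2: average of parent and grandparent), the rule of choosing the mode with the smallest absolute residual, and all function names are assumptions made for illustration; they are not taken from this specification. Entropy coding of the mode and residual (steps S164 and S165) is omitted.

```python
def predict_attribute(mode, parent_attr, grandparent_attr):
    """Predicted value for a hypothetical set of prediction modes."""
    if mode == 0:                      # no prediction
        return 0
    if mode == 1:                      # use the parent node's attribute
        return parent_attr
    return (parent_attr + grandparent_attr) // 2   # average of both


def encode_attribute(attr, parent_attr, grandparent_attr):
    """Return (mode, residual), the symbols that would be entropy coded."""
    # S162: assumed rule, pick the mode giving the smallest |residual|.
    mode = min(
        (0, 1, 2),
        key=lambda m: abs(attr - predict_attribute(m, parent_attr, grandparent_attr)),
    )
    # S163: residual = attribute data - predicted value.
    residual = attr - predict_attribute(mode, parent_attr, grandparent_attr)
    return mode, residual
```

For an attribute value of 98 with a parent value of 100 and a grandparent value of 50, this sketch selects mode 1 and yields a residual of -2, which is cheaper to code than the raw value.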

[0194] Each of the processes performed as described above enables the setting of reference relationships for attribute data based on geometric data in a Cartesian coordinate system, thereby suppressing the reduction in coding efficiency as described in <1. Coordinate System Transformation> above.

[0195] <3. Second Implementation Method>

[0196] <Decoding device>

[0197] Figure 17 is a block diagram illustrating an example configuration of a decoding device as one aspect of an information processing apparatus to which this technology is applied. The decoding device 200 shown in Figure 17 is an apparatus that decodes encoded data of a point cloud (3D data). For example, the decoding device 200 decodes encoded data of a point cloud generated by the encoding device 100.

[0198] Figure 17 shows the main components, such as processing units and data flows, and does not necessarily show everything. That is, the decoding device 200 may include processing units not shown as blocks in Figure 17, and processing and data flows not shown as arrows or the like in Figure 17.

[0199] As shown in Figure 17, the decoding device 200 includes a storage unit 201, a stack 202, a geometric data decoding unit 203, a coordinate transformation unit 204, an attribute data decoding unit 205, and a child node processing unit 206.

[0200] Storage unit 201 stores the encoded data to be supplied to decoding device 200. Storage unit 201 also supplies encoded data to stack 202 for each node (each point) of the reference structure, and stack 202 then stores the encoded data.

[0201] Stack 202 stores information in a last-in, first-out (LIFO) manner. For example, stack 202 stores the encoded data of each point (each node) supplied from storage unit 201. Stack 202 also supplies the geometric data of the most recently stored point (node) to geometric data decoding unit 203. Furthermore, stack 202 supplies the attribute data of the most recently stored point (node) to attribute data decoding unit 205. In addition, stack 202 supplies the child node information of the most recently stored point (node) to child node processing unit 206.

[0202] The geometric data decoding unit 203 acquires the encoded data of the geometric data of the last saved point in the stack 202. The geometric data decoding unit 203 also decodes the acquired encoded data to generate geometric data. For example, the geometric data decoding unit 203 can decode the encoded data of geometric data encoded in a prediction mode. The geometric data decoding unit 203 outputs the generated geometric data to the outside of the decoding device 200. The geometric data decoding unit 203 also supplies the geometric data to the coordinate transformation unit 204.

[0203] The coordinate transformation unit 204 acquires the geometric data of the target point supplied by the geometric data decoding unit 203. As described above in <Transformation to Cartesian Coordinates>, when the geometric data is in a polar coordinate system, the coordinate transformation unit 204 transforms the coordinate system of the geometric data from the polar coordinate system to the Cartesian coordinate system. In other words, the coordinate transformation unit 204 uses the geometric data in the polar coordinate system to generate the geometric data in the Cartesian coordinate system. The coordinate transformation unit 204 supplies the generated geometric data in the Cartesian coordinate system to the attribute data decoding unit 205.

[0204] The attribute data decoding unit 205 acquires the encoded data of the attribute data of the processing target point supplied from the stack 202. The attribute data decoding unit 205 also acquires the geometric data in the Cartesian coordinate system supplied from the coordinate transformation unit 204. The attribute data decoding unit 205 decodes the acquired encoded data to generate attribute data. In doing so, the attribute data decoding unit 205 decodes the encoded data to generate prediction residuals of the attribute data. The attribute data decoding unit 205 also calculates predicted values of the attribute data using the geometric data in the Cartesian coordinate system. Then, the attribute data decoding unit 205 adds the predicted values to the prediction residuals to generate the attribute data of the processing target point. In other words, the attribute data decoding unit 205 decodes the encoded data using the geometric data in the Cartesian coordinate system to generate attribute data. The attribute data decoding unit 205 outputs the generated attribute data of the processing target point to the outside of the decoding device 200.

[0205] The child node processing unit 206 acquires the encoded data of the child node information of the processing target point supplied from the stack 202. The child node processing unit 206 decodes the acquired encoded data to generate the child node information of the processing target point. The child node processing unit 206 outputs the generated child node information to the outside of the decoding device 200.

[0206] Furthermore, when decoding the child nodes of the target node, the child node processing unit 206 controls the storage unit 201 to supply the geometric data, attribute data, and child node information of the child node to the stack 202, and the stack 202 then saves the geometric data, attribute data, and child node information of the child node.

[0207] In the decoding device 200 described above, the geometric data decoding unit 203, the coordinate transformation unit 204, and the attribute data decoding unit 205 can perform their processing by applying the present technology described in <1. Coordinate System Transformation> above.

[0208] For example, when coordinate transformation unit 204 transforms geometric data in polar coordinates to geometric data in Cartesian coordinates, attribute data decoding unit 205 can reconstruct (generate) attribute data using reference relationships set based on the geometric data in Cartesian coordinates. Therefore, attribute data decoding unit 205 can reference closer points compared to the case where reference relationships are set based on geometric data in polar coordinates. Thus, attribute data decoding unit 205 can improve prediction accuracy compared to the case where reference relationships are set based on geometric data in polar coordinates. Therefore, the reduction in encoding efficiency of attribute data can be suppressed.

[0209] These processing units (storage unit 201 to child node processing unit 206) can have any configuration. For example, each processing unit can be configured with logic circuitry that implements the above-described processes. Alternatively, each processing unit can include, for example, a CPU, ROM, and RAM, and implement the above-described processes by executing a program using them. Of course, each processing unit can have both configurations, implementing some of the above-described processes via logic circuitry and the others by executing a program. The configurations of the processing units can be independent of one another; for example, some processing units can implement some of the above-described processes with logic circuitry, other processing units can implement the above-described processes by executing a program, and still other processing units can implement the above-described processes using both logic circuitry and program execution.

[0210] <Geometric Data Decoding Unit>

[0211] Figure 18 is a block diagram illustrating a main configuration example of the geometric data decoding unit 203. Note that Figure 18 shows the main components, such as processing units and data flows, and does not necessarily show everything. In other words, the geometric data decoding unit 203 may include processing units not shown as blocks in Figure 18, and processing and data flows not shown as arrows or the like in Figure 18.

[0212] As shown in Figure 18, the geometric data decoding unit 203 includes a decoding unit 241, a geometric data generation unit 242, and a prediction point generation unit 243.

[0213] Decoding unit 241 acquires encoded geometric data supplied from stack 202. Decoding unit 241 decodes the encoded data to generate prediction residuals, prediction mode information, etc. of the geometric data. Decoding unit 241 supplies the prediction residuals, prediction mode information, etc. to geometric data generation unit 242.

[0214] The geometric data generation unit 242 acquires the prediction residuals, prediction mode information, etc., supplied by the decoding unit 241. Based on the prediction mode information, the geometric data generation unit 242 performs prediction using the prediction mode applied during encoding and calculates the predicted value of the geometric data of the target point. The geometric data generation unit 242 adds the predicted value and the prediction residual to generate the geometric data of the target point. The geometric data generation unit 242 outputs the generated geometric data to the outside of the decoding device 200. The geometric data generation unit 242 also acquires the prediction points generated by the prediction point generation unit 243.

[0215] The prediction point generation unit 243 generates prediction points based on the prediction mode information (that is, the prediction mode applied during encoding). The prediction point generation unit 243 supplies the generated prediction points to the geometric data generation unit 242.

[0216] The geometric data decoding unit 203 has the above configuration, decodes the encoded data of each node according to the prediction tree, and performs geometric data decoding in prediction mode.

[0217] <Attribute Data Decoding Unit>

[0218] Figure 19 is a block diagram illustrating a main configuration example of the attribute data decoding unit 205. Note that Figure 19 shows the main components, such as processing units and data flows, and does not necessarily show everything. In other words, the attribute data decoding unit 205 may include processing units not shown as blocks in Figure 19, and processing and data flows not shown as arrows or the like in Figure 19.

[0219] As shown in Figure 19, the attribute data decoding unit 205 includes a reference relationship setting unit 261, a decoding unit 262, and an attribute data generation unit 263.

[0220] The reference relationship setting unit 261 acquires the geometric data in the Cartesian coordinate system supplied by the coordinate transformation unit 204. Based on the geometric data in the Cartesian coordinate system, the reference relationship setting unit 261 sets a reference relationship that indicates a reference target for calculating the predicted values of the attribute data of the target point.

[0221] For example, the reference relationship setting unit 261 can set reference relationships based on the distance from the processing target point in the Cartesian coordinate system. The reference relationship setting unit 261 can also set the parent node or grandparent node of the processing target point (processing target node). Furthermore, the reference relationship setting unit 261 can select parent and grandparent nodes from points to be decoded before the processing target point. Additionally, the reference relationship setting unit 261 can set reference relationships for attribute data corresponding to the processing target node of the geometric data decoding unit 203 in the prediction tree in prediction mode. The reference relationship setting unit 261 supplies information indicating the set reference relationships to the attribute data generation unit 263.
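The distance-based setting of a parent node and a grandparent node can be sketched as follows. This is a minimal illustration that assumes the nearest already-decoded point becomes the parent and the second nearest becomes the grandparent; that assignment rule and the function name `set_reference` are hypothetical and are not stated in this specification.

```python
import math


def set_reference(target_xyz, decoded_xyz):
    """Return (parent_index, grandparent_index) into decoded_xyz.

    decoded_xyz holds the Cartesian coordinates of points decoded before
    the processing target point. An index is None when too few points exist.
    """
    # Rank the already-decoded points by Euclidean distance to the target.
    order = sorted(
        range(len(decoded_xyz)),
        key=lambda i: math.dist(decoded_xyz[i], target_xyz),
    )
    parent = order[0] if len(order) > 0 else None
    grandparent = order[1] if len(order) > 1 else None
    return parent, grandparent
```

A linear scan such as this costs time proportional to the number of decoded points per target point; a practical implementation would likely limit the search range or use a spatial index.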

[0222] Decoding unit 262 acquires encoded attribute data of the target point supplied from stack 202. Decoding unit 262 also decodes the encoded data to generate prediction residuals of the attribute data. Furthermore, decoding unit 262 decodes the encoded data to generate prediction mode information. Decoding unit 262 supplies the acquired prediction residuals and prediction mode information to attribute data generation unit 263.

[0223] The attribute data generation unit 263 acquires the information indicating the reference relationship (indicating the parent and grandparent nodes) supplied by the reference relationship setting unit 261. The attribute data generation unit 263 also acquires the prediction residual and prediction mode information supplied by the decoding unit 262. The attribute data generation unit 263 generates the attribute data of the target point using this information. For example, the attribute data generation unit 263 references, as appropriate, the parent and grandparent nodes set by the reference relationship setting unit 261 and calculates the predicted values of the attribute data of the target point in the prediction mode indicated by the prediction mode information. The attribute data generation unit 263 adds the predicted values to the prediction residual to generate the attribute data of the target point.

[0224] In generating the attribute data, for multiple attribute data that have the same corresponding geometric information and are sorted in a predetermined order, the attribute data generation unit 263 can generate each attribute data by cumulatively adding the differences between attribute data that are consecutive in the sorting order. For example, suppose that, in encoding, multiple attribute data with the same corresponding geometric information are sorted in descending order of value, the differences between consecutive attribute data in the sorting order are calculated, and the differences are encoded as prediction residuals. In this case, the attribute data generation unit 263 calculates the first attribute data in the sorting order by adding the first difference (prediction residual) obtained by decoding the encoded data to the predicted value (e.g., "0"). The attribute data generation unit 263 then calculates the second attribute data in the sorting order by adding the next difference (prediction residual) to the attribute data just calculated. In other words, the immediately preceding calculated attribute data is used as the predicted value. In the same manner, the attribute data generation unit 263 calculates each attribute data by repeatedly adding the difference (prediction residual) to the previously calculated attribute data (predicted value). Thus, as described in <Repetition Points> above, the reduction in coding efficiency can be suppressed.
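The handling of multiple attribute data sharing the same geometric information can be sketched as a pair of functions. This is a minimal sketch under the descending-order example given above, with an initial predicted value of 0; the function names are hypothetical, and entropy coding of the residuals is omitted.

```python
def encode_duplicate_attributes(attrs, initial_prediction=0):
    """Sort in descending order and code each value as a difference
    from the previous value in the sorting order."""
    residuals = []
    prediction = initial_prediction
    for value in sorted(attrs, reverse=True):
        residuals.append(value - prediction)
        prediction = value          # the previous value predicts the next
    return residuals


def decode_duplicate_attributes(residuals, initial_prediction=0):
    """Rebuild the sorted attribute values by cumulative addition."""
    attrs = []
    prediction = initial_prediction
    for residual in residuals:
        value = prediction + residual
        attrs.append(value)
        prediction = value          # just-decoded value becomes the prediction
    return attrs
```

With attribute values 10, 30, and 20, the encoder side of this sketch produces the residuals 30, -10, -10, and the decoder side restores 30, 20, 10 in the sorting order.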

[0225] In generating attribute data, the attribute data generation unit 263 can apply wrapping. Thus, as described in the <wrapping> section above, it is also possible to suppress the reduction in coding efficiency.

[0226] The attribute data generation unit 263 outputs the generated attribute data to the outside of the decoding device 200.

[0227] Thus, decoding unit 262 decodes encoded data of prediction residuals that were calculated using predicted values derived from geometric data in the Cartesian coordinate system. Therefore, compared with decoding encoded data of prediction residuals calculated using predicted values derived from geometric data in the polar coordinate system, the reduction in coding efficiency can be suppressed.

[0228] <Decoding Process>

[0229] Next, the processing performed by the decoding device 200 will be described. The decoding device 200 decodes the encoded data of the point cloud by performing the decoding process. An example of the flow of the decoding process will be described with reference to the flowchart of Figure 20.

[0230] When the decoding process begins, the storage unit 201 of the decoding device 200 stores the encoded data of the supplied point cloud data. Then, in step S201, the storage unit 201 supplies the encoded data of the top node of the reference structure (prediction tree) of the geometric data to the stack 202, which in turn saves (stores) the encoded data of the top node of the reference structure (prediction tree) of the geometric data.

[0231] In step S202, the geometric data decoding unit 203 retrieves the encoded data of the geometric data of the most recently stored point (node) from the stack 202. The attribute data decoding unit 205 retrieves the encoded data of the attribute data of the most recently stored point (node) from the stack 202. The child node processing unit 206 retrieves the encoded data of the child node information of the most recently stored point (node) from the stack 202.

[0232] In step S203, the geometric data decoding unit 203 performs geometric data decoding processing to decode the encoded data of the geometric data obtained in step S202.

[0233] In step S204, the coordinate transformation unit 204 determines whether the geometric data is in the polar coordinate system. If it is determined that the geometric data is in the polar coordinate system, the process proceeds to step S205.

[0234] In step S205, coordinate transformation unit 204 performs a coordinate transformation to change the coordinate system of the geometric data from polar coordinates to Cartesian coordinates. When the processing in step S205 ends, the process proceeds to step S206. If it is determined in step S204 that the coordinate system of the geometric data is not polar coordinates (but Cartesian coordinates), the processing in step S205 is skipped (bypassed), and the process then proceeds to step S206.
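The branch in steps S204 and S205 can be sketched as below. The polar convention used here (radius r, azimuth phi measured in the xy-plane, elevation theta measured from the xy-plane, angles in radians) is an assumption for illustration; the convention actually intended is defined by the description in <Transformation to Cartesian Coordinates>. The function names are hypothetical.

```python
import math


def polar_to_cartesian(r, phi, theta):
    """Convert (radius, azimuth, elevation) to (x, y, z).

    Assumed convention: phi is the angle in the xy-plane from the x-axis,
    theta is the elevation above the xy-plane, both in radians.
    """
    x = r * math.cos(theta) * math.cos(phi)
    y = r * math.cos(theta) * math.sin(phi)
    z = r * math.sin(theta)
    return (x, y, z)


def to_cartesian_if_needed(geometry, is_polar):
    """S204: check the coordinate system. S205: transform only if polar."""
    if is_polar:
        return polar_to_cartesian(*geometry)
    return geometry                 # already Cartesian, S205 is skipped
```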

[0235] In step S206, the attribute data decoding unit 205 performs attribute data decoding processing to decode the encoded data of the attribute data, etc.

[0236] In step S207, the child node processing unit 206 decodes the encoded data of the child node information.

[0237] In step S208, the child node processing unit 206 determines whether to decode the child nodes of the processing target point. If the processing target point is not a leaf node of a tree structure but has child nodes, and it is determined that the child nodes also need to be decoded, the processing proceeds to step S209.

[0238] In step S209, the child node processing unit 206 controls the storage unit 201 to supply information about the child nodes (geometric data, attribute data, child node information, etc.) to the stack 202, which then stores it. When the processing in step S209 ends, the processing proceeds to step S210. If it is determined in step S208 that no child nodes are to be decoded, for example because the processing target point has no child nodes (that is, it is a leaf node of the tree structure), the processing in step S209 is skipped, and the processing proceeds to step S210.

[0239] In step S210, the geometric data decoding unit 203 determines whether the stack 202 is empty. If it is determined that the stack 202 is not empty (that is, it stores information of at least one point), the process returns to step S202. Therefore, the processing from step S202 to step S210 is executed with the most recently stored point in the stack 202 as the processing target point.

[0240] This processing is repeated, and the decoding process ends when it is determined in step S210 that the stack 202 is empty.
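The control flow of steps S201 to S210 amounts to a depth-first traversal driven by a last-in, first-out stack. The sketch below shows only that control flow; the node representation (a dict with a "children" list) and the `decode_node` callback, which stands in for steps S203 to S207, are hypothetical.

```python
def decode_tree(root, decode_node):
    """Traverse the reference structure with a LIFO stack.

    root: hypothetical node dict with a "children" list.
    decode_node: callback standing in for geometry, coordinate
    transformation, attribute, and child node decoding of one node.
    """
    stack = [root]                        # S201: store the top node
    results = []
    while stack:                          # S210: stop when the stack is empty
        node = stack.pop()                # S202: most recently stored node
        results.append(decode_node(node))     # S203 to S207
        if node["children"]:              # S208: are there children to decode?
            stack.extend(node["children"])    # S209: store the child nodes
    return results
```

For a root 0 with children 1 and 2, where node 2 has child 3, the nodes are processed in the order 0, 2, 3, 1, because the most recently stored node is always taken first.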

[0241] <Geometric Decoding Process>

[0242] Next, an example of the flow of the geometric data decoding process performed in step S203 of Figure 20 will be described with reference to the flowchart of Figure 21.

[0243] When the geometric data decoding process begins, in step S241, the decoding unit 241 decodes the encoded data of the geometric data to generate the prediction residual of the geometric data.

[0244] In step S242, the decoding unit 241 decodes the encoded data of the prediction mode information to generate the prediction mode information of the geometric data.

[0245] In step S243, the geometric data generation unit 242 calculates the predicted value of the geometric data of the target point in the prediction mode indicated by the prediction mode information obtained through the processing in step S242. Then, the geometric data generation unit 242 adds the predicted value to the prediction residual obtained through the processing in step S241 to generate the geometric data of the target point.

[0246] In step S244, the prediction point generation unit 243 calculates and adds prediction points.

[0247] When the processing in step S244 ends, the geometric data decoding process ends, and the processing returns to Figure 20.
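Step S243 can be illustrated with a small sketch. The four prediction modes below (no prediction, delta, linear, parallelogram) are borrowed from predictive geometry coding as commonly described for G-PCC and are an assumption here; this section does not enumerate the modes. The function names are hypothetical, and entropy decoding of the residual and mode (steps S241 and S242) is omitted.

```python
def predict_geometry(mode, ancestors):
    """Predicted position from already-decoded ancestors.

    ancestors: [parent, grandparent, great_grandparent], each an (x, y, z) tuple.
    """
    p0, p1, p2 = ancestors
    if mode == 0:                                   # no prediction
        return (0, 0, 0)
    if mode == 1:                                   # delta: parent position
        return p0
    if mode == 2:                                   # linear: 2*p0 - p1
        return tuple(2 * a - b for a, b in zip(p0, p1))
    return tuple(a + b - c for a, b, c in zip(p0, p1, p2))  # parallelogram


def decode_geometry(mode, residual, ancestors):
    """S243: geometric data = predicted value + prediction residual."""
    predicted = predict_geometry(mode, ancestors)
    return tuple(p + r for p, r in zip(predicted, residual))
```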

[0248] <Attribute Data Decoding Process>

[0249] Next, an example of the flow of the attribute data decoding process performed in step S206 of Figure 20 will be described with reference to the flowchart of Figure 22.

[0250] When the attribute data decoding process begins, in step S261, the reference relationship setting unit 261 sets the parent node and grandparent node from the decoded points based on the geometric data in the Cartesian coordinate system.

[0251] In step S262, the decoding unit 262 decodes the encoded data of the prediction residual of the attribute data to generate the prediction residual of the target point.

[0252] In step S263, the decoding unit 262 decodes the encoded data of the prediction mode information of the attribute data to generate the prediction mode information of the target point.

[0253] In step S264, the attribute data generation unit 263 generates the attribute data of the target point using the reference relationship, the prediction residual, and the prediction mode information. For example, the attribute data generation unit 263 calculates the predicted value of the attribute data of the target point in the prediction mode indicated by the prediction mode information obtained through the processing in step S263. Then, the attribute data generation unit 263 adds the predicted value to the prediction residual obtained through the processing in step S262 to generate the attribute data of the target point.

[0254] When the processing in step S264 is completed, the attribute data decoding process ends, and the processing returns to Figure 20.
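The reconstruction in step S264 can be sketched as follows. The three prediction modes (0: no prediction, 1: parent, 2: average of parent and grandparent) and the function names are assumptions made for illustration and are not defined by this specification; entropy decoding of the residual and the mode (steps S262 and S263) is omitted.

```python
def predict_attribute(mode, parent_attr, grandparent_attr):
    """Predicted value for a hypothetical set of prediction modes."""
    if mode == 0:                      # no prediction
        return 0
    if mode == 1:                      # use the parent node's attribute
        return parent_attr
    return (parent_attr + grandparent_attr) // 2   # average of both


def decode_attribute(mode, residual, parent_attr, grandparent_attr):
    """S264: attribute data = predicted value + prediction residual."""
    return predict_attribute(mode, parent_attr, grandparent_attr) + residual
```

Because the parent and grandparent are chosen from points that are already decoded, the decoder can form the same predicted value as the encoder without any additional signaling beyond the prediction mode.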

[0255] Each of the processes performed as described above enables the setting of reference relationships for attribute data based on geometric data in the Cartesian coordinate system, thereby suppressing the reduction in coding efficiency as described in <1. Coordinate System Transformation> above.

[0256] <4. Supplementary Explanation>

[0257] <Computer>

[0258] The aforementioned series of processes can be performed by hardware or software. In the case where the processes are performed by software, a program configuring the software is installed on the computer. Here, "computer" includes, for example, computers built into dedicated hardware and general-purpose personal computers with various programs installed thereon to perform various functions.

[0259] Figure 23 is a block diagram illustrating an example of the hardware configuration of a computer that performs the above series of processes according to a program.

[0260] In the computer 900 shown in Figure 23, the central processing unit (CPU) 901, read-only memory (ROM) 902, and random access memory (RAM) 903 are connected to each other via bus 904.

[0261] The input and output interface 910 is also connected to the bus 904. The input unit 911, output unit 912, storage unit 913, communication unit 914, and drive 915 are connected to the input and output interface 910.

[0262] Input unit 911 is, for example, a keyboard, mouse, microphone, touch panel, or input terminal. Output unit 912 is, for example, a display, speaker, or output terminal. Storage unit 913 includes, for example, a hard disk, RAM disk, and non-volatile memory. Communication unit 914 includes, for example, a network interface. Drive 915 drives removable medium 921 such as a hard disk, optical disk, magneto-optical disk, or semiconductor memory.

[0263] In the computer configured as described above, the CPU 901 loads the program stored in the storage unit 913 into the RAM 903 via the input and output interface 910 and the bus 904 and executes the program, thereby performing the series of processes described above. The RAM 903 also appropriately stores data required by the CPU 901 to perform various types of processing.

[0264] The program executed by the computer can be recorded in a removable medium 921, such as a packaging medium, and provided in this form. In this case, by installing the removable medium 921 in the drive 915, the program can be installed in the storage unit 913 via the input and output interface 910.

[0265] The program can also be provided via wired or wireless transmission media such as a local area network, the Internet, and digital satellite broadcasting. In this case, the program can be received by the communication unit 914 and installed in the storage unit 913.

[0266] In addition, the program can be pre-installed in ROM 902, storage unit 913, etc.

[0267] <Application Objectives of This Technology>

[0268] While the application of this technology to the encoding and decoding of point cloud data has been described above, this technology is not limited to these examples and can be applied to the encoding and decoding of 3D data of any standard. For instance, in the encoding and decoding of mesh data, the mesh data can be transformed into point cloud data, and encoding and decoding can be performed by applying this technology. In other words, the specifications of the various types of processing, such as encoding and decoding schemes, and of the various data, such as 3D data and metadata, are arbitrary, as long as they do not contradict the technology described above. Some of the aforementioned processing or specifications can also be omitted, provided that doing so does not contradict this technology.

[0269] This technology can be applied to any configuration. For example, it can be applied to various electronic devices, such as transmitters and receivers in satellite broadcasting, cable television broadcasting, transmission over the Internet, and delivery to terminals via cellular communication (e.g., television receivers or mobile phones), or devices that record images on media such as optical discs, magnetic disks, and flash memory and reproduce images from these storage media (e.g., hard disk recorders or cameras).

[0270] Furthermore, for example, this technology can be implemented as part of the configuration of a device, such as a processor (e.g., a video processor), a module that uses multiple processors (e.g., a video module), a unit that uses multiple modules (e.g., a video unit), or a set in which other functions are added to a unit (e.g., a video set).

[0271] Furthermore, this technology can also be applied to network systems configured with multiple devices. For example, this technology can be implemented as cloud computing, where multiple devices share processing and perform processing jointly via a network. For example, this technology can be implemented in cloud services that provide services about images (moving images) to any terminal such as computers, audiovisual (AV) devices, portable information processing terminals, and Internet of Things (IoT) devices.

[0272] In this specification, a system is a set of multiple components (devices, modules (parts), etc.), and it does not matter whether all the components are in the same housing. Therefore, multiple devices housed in separate housings and connected via a network, as well as a single device in which multiple modules are housed in a single housing, are both considered "systems".

[0273] <Fields and Purposes Where This Technology Can Be Applied>

[0274] For example, systems, devices, and processing units utilizing this technology can be used in any field, such as transportation, medical care, security, agriculture, animal husbandry, mining, beauty, factories, home appliances, weather and nature monitoring. Applications of this technology can also be tailored to specific needs.

[0275] <Other>

[0276] The implementation of this technology is not limited to the above-described implementation, and various modifications can be made without departing from the scope and spirit of this technology.

[0277] For example, a configuration described as a single device (or processing unit) can be divided and configured as multiple devices (or processing units). Conversely, the configurations described above as multiple devices (or processing units) can be collectively configured as a single device (or processing unit). Furthermore, components other than those in the above configurations can be added to the configuration of each device (or each processing unit). Moreover, provided that the overall system configuration and operation are substantially the same, some of the components of one device (or processing unit) can be included in the configuration of another device (or another processing unit).

[0278] Furthermore, for example, the above-described program can be executed in any device. In this case, the device only needs to have the necessary functions (function blocks, etc.) and be able to obtain the necessary information.

[0279] For example, each step of a flowchart can be executed by a single device, or it can be shared and executed by multiple devices. Furthermore, when a step includes multiple types of processing, these multiple types of processing can be executed by a single device, or they can be shared and executed by multiple devices. In other words, multiple types of processing included in a single step can also be executed as processing of multiple steps. Conversely, processing described as multiple steps can be executed collectively as a single step.

[0280] For example, for a program executed by a computer, the processing of the steps describing the program can be performed chronologically in the order described in this specification, or it can be performed in parallel or separately at necessary timings, such as the time of a call. In other words, as long as no inconsistency occurs, the processing of each step can be performed in an order different from the above-described order. Furthermore, the processing of the steps describing the program can be performed in parallel with the processing of another program, or it can be performed in combination with the processing of another program.

[0281] For example, as long as there is no inconsistency, each of the multiple technologies related to the present technology can be implemented independently. Of course, any number of them can also be implemented together. For example, some or all of the technologies described in any embodiment can be combined with some or all of the technologies described in other embodiments. Furthermore, any part or all of the above-described technologies can also be implemented together with another technology not described above.

[0282] This technology can also be configured as follows:

[0283] (1) An information processing apparatus, comprising:

[0284] a coordinate transformation unit that transforms, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system;

[0285] a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point;

[0286] a prediction residual calculation unit that calculates a prediction residual, which is the difference between the attribute data of the processing target point and the predicted value calculated based on the reference relationship set by the reference relationship setting unit; and

[0287] a prediction residual coding unit that encodes the prediction residual calculated by the prediction residual calculation unit.

[0288] (2) The information processing apparatus according to (1), wherein the reference relationship setting unit sets the reference relationship based on the distance from the processing target point in the Cartesian coordinate system.

[0289] (3) The information processing apparatus according to (1) or (2), wherein the reference relationship setting unit selects a parent node and a grandparent node from points to be decoded before the processing target point during decoding.

[0290] (4) The information processing apparatus according to any one of (1) to (3), further comprising a geometric data encoding unit that encodes the geometric data in a prediction mode, wherein

[0291] the coordinate transformation unit transforms the coordinate system of the geometric data encoded by the geometric data encoding unit in the prediction mode from the polar coordinate system to the Cartesian coordinate system.

[0292] (5) The information processing apparatus according to (4), wherein the reference relationship setting unit sets the reference relationship for the attribute data corresponding to the processing target node of the geometric data encoding unit in the prediction tree of the prediction mode.

[0293] (6) The information processing apparatus according to (4) or (5), further comprising a prediction mode setting unit that sets a prediction mode for calculating the predicted value, wherein

[0294] the prediction residual calculation unit calculates the prediction residual using the predicted value calculated in the prediction mode set by the prediction mode setting unit.

[0295] (7) The information processing apparatus according to (6), wherein the prediction mode setting unit selects the prediction mode to be applied from a plurality of pre-prepared candidates.

[0296] (8) The information processing apparatus according to any one of (4) to (7), wherein, for multiple attribute data items with the same corresponding geometric information, the prediction residual calculation unit:

[0297] sorts the attribute data items by the magnitude of their values, and

[0298] calculates the difference between consecutive attribute data items in the sorted order.
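
As an illustration of configuration (8), the following is a minimal Python sketch and not the apparatus's actual implementation. The function names `delta_encode_duplicates` and `delta_decode_duplicates` are hypothetical, and emitting the smallest value unchanged as the first output is an assumption, since this passage does not state how the first item is predicted.

```python
from itertools import accumulate
from typing import List


def delta_encode_duplicates(attrs: List[int]) -> List[int]:
    """Sort the attribute values of co-located points and difference neighbours.

    Assumption: the first output is the smallest value itself; each later
    output is the gap to the previous value in sorted order.
    """
    ordered = sorted(attrs)
    if not ordered:
        return []
    return [ordered[0]] + [b - a for a, b in zip(ordered, ordered[1:])]


def delta_decode_duplicates(deltas: List[int]) -> List[int]:
    """Rebuild the sorted attribute values by accumulating the differences."""
    return list(accumulate(deltas))


# Four points share one position but carry different attribute values.
deltas = delta_encode_duplicates([40, 10, 25, 10])
restored = delta_decode_duplicates(deltas)
```

Because the values are sorted first, every difference after the first is non-negative, which tends to keep the numbers passed to the entropy coder small. The original scan order of the duplicate points is not preserved by this sketch.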

[0299] (9) The information processing apparatus according to any one of (4) to (8), wherein the prediction residual calculation unit applies wraparound to calculate the prediction residual.
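
This passage does not define the wraparound operation. One common interpretation, assumed here, is modular arithmetic over the attribute bit depth, so that a residual always fits in the same number of bits as the attribute itself. The Python sketch below uses the hypothetical names `wrap_residual` and `unwrap_value` and a hypothetical `bit_depth` parameter.

```python
def wrap_residual(value: int, predicted: int, bit_depth: int = 8) -> int:
    """Residual folded into [-2**(bit_depth-1), 2**(bit_depth-1) - 1].

    Assumption: attribute values are unsigned integers of `bit_depth` bits.
    """
    modulus = 1 << bit_depth
    half = modulus >> 1
    residual = (value - predicted) % modulus  # now in [0, modulus)
    if residual >= half:
        residual -= modulus                   # fold the upper half to negatives
    return residual


def unwrap_value(residual: int, predicted: int, bit_depth: int = 8) -> int:
    """Decoder side: add the residual to the prediction modulo 2**bit_depth."""
    return (predicted + residual) % (1 << bit_depth)


# A plain difference of 250 - 5 would be 245; the wrapped residual is -11.
r = wrap_residual(250, 5)
v = unwrap_value(r, 5)
```

Under this assumption the decoder recovers the exact attribute value as long as it applies the same modulus when adding the residual to the predicted value.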

[0300] (10) An information processing method, comprising:

[0301] transforming, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system;

[0302] setting a reference relationship using the generated geometric data in the Cartesian coordinate system, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point;

[0303] calculating a prediction residual, which is the difference between the attribute data of the processing target point and the predicted value calculated based on the set reference relationship; and

[0304] encoding the calculated prediction residual.
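
The steps of method (10) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the polar layout (radius, azimuth, elevation), the choice of the nearest already-processed point as the single reference target, the use of that point's attribute as the predicted value, and all function names are hypothetical. Entropy coding of the residuals is omitted.

```python
import math
from typing import List, Optional, Sequence, Tuple

Polar = Tuple[float, float, float]      # (r, phi, theta): assumed layout
Cartesian = Tuple[float, float, float]  # (x, y, z)


def polar_to_cartesian(p: Polar) -> Cartesian:
    """Convert (radius, azimuth, elevation) in radians to (x, y, z)."""
    r, phi, theta = p
    return (r * math.cos(theta) * math.cos(phi),
            r * math.cos(theta) * math.sin(phi),
            r * math.sin(theta))


def nearest_previous(xyz: Sequence[Cartesian], i: int) -> Optional[int]:
    """Index of the already-processed point closest to point i, if any."""
    if i == 0:
        return None
    return min(range(i), key=lambda j: math.dist(xyz[i], xyz[j]))


def encode_attributes(polar: Sequence[Polar], attrs: Sequence[int]) -> List[int]:
    """Return one prediction residual per point, in processing order."""
    xyz = [polar_to_cartesian(p) for p in polar]
    residuals = []
    for i, value in enumerate(attrs):
        ref = nearest_previous(xyz, i)
        # Lossless assumption: the original attribute equals the decoded one.
        predicted = 0 if ref is None else attrs[ref]
        residuals.append(value - predicted)
    return residuals


# The third point is adjacent in space to the first, not to the second.
points = [(10.0, 0.0, 0.0), (10.0, math.pi, 0.0), (10.5, 0.01, 0.0)]
reflectance = [100, 20, 104]
residuals = encode_attributes(points, reflectance)  # [100, -80, 4]
```

In this toy input the third point is predicted from the first point, its Cartesian neighbour, giving a residual of 4. Predicting it from its predecessor in processing order would give 84.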

[0305] (11) An information processing apparatus, comprising:

[0306] a coordinate transformation unit that transforms, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system;

[0307] a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point;

[0308] a prediction residual decoding unit that decodes encoded data to derive a prediction residual, the prediction residual being the difference between the attribute data and the predicted value; and

[0309] an attribute data generation unit that generates the attribute data by adding the prediction residual derived by the prediction residual decoding unit to the predicted value calculated based on the reference relationship set by the reference relationship setting unit.

[0310] (12) The information processing apparatus according to (11), wherein the reference relationship setting unit sets the reference relationship based on the distance from the processing target point in the Cartesian coordinate system.

[0311] (13) The information processing apparatus according to (11) or (12), wherein the reference relationship setting unit selects a parent node and a grandparent node from points to be decoded by the prediction residual decoding unit before the processing target point.

[0312] (14) The information processing apparatus according to any one of (11) to (13), further comprising a geometric data decoding unit that decodes the encoded data of the geometric data encoded in a prediction mode, wherein

[0313] the coordinate transformation unit transforms the coordinate system of the geometric data generated by the geometric data decoding unit from the polar coordinate system to the Cartesian coordinate system.

[0314] (15) The information processing apparatus according to (14), wherein the reference relationship setting unit sets a reference relationship in the prediction tree in the prediction mode for the attribute data corresponding to the processing target node of the geometric data decoding unit.

[0315] (16) The information processing apparatus according to (14) or (15), wherein

[0316] the prediction residual decoding unit also decodes encoded data of prediction mode information indicating the prediction mode used to calculate the predicted value, and

[0317] the attribute data generation unit generates the attribute data by applying the predicted value calculated in the prediction mode indicated by the prediction mode information generated by the prediction residual decoding unit.

[0318] (17) The information processing apparatus according to (16), wherein the attribute data generation unit calculates the predicted value by applying the prediction mode indicated by the prediction mode information generated by the prediction residual decoding unit, and generates the attribute data using the calculated predicted value.

[0319] (18) The information processing apparatus according to any one of (14) to (17), wherein, for a plurality of attribute data items with the same corresponding geometric information, the attribute data generation unit generates each of the attribute data items by adding the differences between consecutive attribute data items in the sorting order.

[0320] (19) The information processing apparatus according to any one of (14) to (18), wherein the attribute data generation unit applies wraparound to generate the attribute data.

[0321] (20) An information processing method, comprising:

[0322] transforming, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system;

[0323] setting a reference relationship using the generated geometric data in the Cartesian coordinate system, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point;

[0324] decoding encoded data to derive a prediction residual, the prediction residual being the difference between the attribute data and the predicted value; and

[0325] generating the attribute data by adding the derived prediction residual to the predicted value calculated based on the set reference relationship.
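
The decoding steps of method (20) can be sketched in Python as follows. This is a hedged illustration only: the polar layout (radius, azimuth, elevation), the rule that the reference target is the nearest previously decoded point in Cartesian space, the predicted value of 0 for the first point, and all function names are assumptions rather than details taken from this specification. Entropy decoding is omitted, so the residuals are passed in directly.

```python
import math
from typing import List, Optional, Sequence, Tuple

Polar = Tuple[float, float, float]      # (r, phi, theta): assumed layout
Cartesian = Tuple[float, float, float]  # (x, y, z)


def polar_to_cartesian(p: Polar) -> Cartesian:
    """Convert (radius, azimuth, elevation) in radians to (x, y, z)."""
    r, phi, theta = p
    return (r * math.cos(theta) * math.cos(phi),
            r * math.cos(theta) * math.sin(phi),
            r * math.sin(theta))


def nearest_previous(xyz: Sequence[Cartesian], i: int) -> Optional[int]:
    """Index of the previously decoded point closest to point i, if any."""
    if i == 0:
        return None
    return min(range(i), key=lambda j: math.dist(xyz[i], xyz[j]))


def decode_attributes(polar: Sequence[Polar], residuals: Sequence[int]) -> List[int]:
    """Rebuild attribute values as predicted value plus prediction residual."""
    xyz = [polar_to_cartesian(p) for p in polar]
    attrs: List[int] = []
    for i, residual in enumerate(residuals):
        ref = nearest_previous(xyz, i)
        # Only points decoded earlier are available as reference targets.
        predicted = 0 if ref is None else attrs[ref]
        attrs.append(predicted + residual)
    return attrs


# Decoded geometry (still in polar form) and decoded residuals.
points = [(10.0, 0.0, 0.0), (10.0, math.pi, 0.0), (10.5, 0.01, 0.0)]
attrs = decode_attributes(points, [100, -80, 4])  # [100, 20, 104]
```

The reference relationship depends only on geometry that the decoder already holds, so under these assumptions it can be rebuilt on the decoder side without being transmitted.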

[0326] List of reference numerals

[0327] 100 encoding device

[0328] 101 reference structure forming unit

[0329] 102 stack

[0330] 103 geometric data encoding unit

[0331] 104 coordinate transformation unit

[0332] 105 attribute data encoding unit

[0333] 106 child node processing unit

[0334] 141 prediction mode setting unit

[0335] 142 prediction residual calculation unit

[0336] 143 coding unit

[0337] 144 prediction point generation unit

[0338] 161 reference relationship setting unit

[0339] 162 prediction mode setting unit

[0340] 163 prediction residual calculation unit

[0341] 164 coding unit

[0342] 200 decoding device

[0343] 201 storage unit

[0344] 202 stack

[0345] 203 geometric data decoding unit

[0346] 204 coordinate transformation unit

[0347] 205 attribute data decoding unit

[0348] 206 child node processing unit

[0349] 241 decoding unit

[0350] 242 geometric data generation unit

[0351] 243 prediction point generation unit

[0352] 261 reference relationship setting unit

[0353] 262 decoding unit

[0354] 263 attribute data generation unit

[0355] 900 computer

Claims

1. An information processing apparatus, comprising: a coordinate transformation unit that transforms, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point; a prediction residual calculation unit that calculates a prediction residual, which is the difference between the attribute data of the processing target point and the predicted value calculated based on the reference relationship set by the reference relationship setting unit; and a prediction residual coding unit that encodes the prediction residual calculated by the prediction residual calculation unit.

2. The information processing apparatus according to claim 1, wherein the reference relationship setting unit sets the reference relationship based on the distance from the processing target point in the Cartesian coordinate system.

3. The information processing apparatus according to claim 1, wherein the reference relationship setting unit selects a parent node and a grandparent node from the points to be decoded before the processing target point during decoding.

4. The information processing apparatus according to claim 1, further comprising a geometric data encoding unit that encodes the geometric data in a prediction mode, wherein the coordinate transformation unit transforms the coordinate system of the geometric data encoded by the geometric data encoding unit in the prediction mode from the polar coordinate system to the Cartesian coordinate system.

5. The information processing apparatus according to claim 4, wherein the reference relationship setting unit sets the reference relationship for the attribute data corresponding to the processing target point of the geometric data encoding unit in the prediction tree of the prediction mode.

6. The information processing apparatus according to claim 4, further comprising a prediction mode setting unit that sets a prediction mode for calculating the predicted value, wherein the prediction residual calculation unit calculates the prediction residual using the predicted value calculated in the prediction mode set by the prediction mode setting unit.

7. The information processing apparatus according to claim 6, wherein the prediction mode setting unit selects the prediction mode to be applied from a plurality of candidates prepared in advance.

8. The information processing apparatus according to claim 4, wherein, for multiple attribute data items with the same corresponding geometric information, the prediction residual calculation unit sorts the attribute data items by the magnitude of their values, and calculates the difference between consecutive attribute data items in the sorted order.

9. The information processing apparatus according to claim 4, wherein the prediction residual calculation unit applies wraparound to calculate the prediction residual.

10. An information processing method, comprising: transforming, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; setting a reference relationship using the generated geometric data in the Cartesian coordinate system, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point; calculating a prediction residual, which is the difference between the attribute data of the processing target point and the predicted value calculated based on the set reference relationship; and encoding the calculated prediction residual.

11. An information processing apparatus, comprising: a coordinate transformation unit that transforms, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; a reference relationship setting unit that sets a reference relationship using the geometric data in the Cartesian coordinate system generated by the coordinate transformation unit, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point; a prediction residual decoding unit that decodes encoded data to derive a prediction residual, the prediction residual being the difference between the attribute data and the predicted value; and an attribute data generation unit that generates the attribute data by adding the prediction residual derived by the prediction residual decoding unit to the predicted value calculated based on the reference relationship set by the reference relationship setting unit.

12. The information processing apparatus according to claim 11, wherein the reference relationship setting unit sets the reference relationship based on the distance from the processing target point in the Cartesian coordinate system.

13. The information processing apparatus according to claim 11, wherein the reference relationship setting unit selects a parent node and a grandparent node from the points to be decoded by the prediction residual decoding unit before the processing target point.

14. The information processing apparatus according to claim 11, further comprising a geometric data decoding unit that decodes the encoded data of the geometric data encoded in a prediction mode, wherein the coordinate transformation unit transforms the coordinate system of the geometric data generated by the geometric data decoding unit from the polar coordinate system to the Cartesian coordinate system.

15. The information processing apparatus according to claim 14, wherein the reference relationship setting unit sets the reference relationship for the attribute data corresponding to the processing target point of the geometric data decoding unit in the prediction tree of the prediction mode.

16. The information processing apparatus according to claim 14, wherein the prediction residual decoding unit also decodes encoded data of prediction mode information indicating the prediction mode used to calculate the predicted value, and the attribute data generation unit generates the attribute data by applying the predicted value calculated in the prediction mode indicated by the prediction mode information generated by the prediction residual decoding unit.

17. The information processing apparatus according to claim 16, wherein the attribute data generation unit calculates the predicted value by applying the prediction mode indicated by the prediction mode information generated by the prediction residual decoding unit, and generates the attribute data using the calculated predicted value.

18. The information processing apparatus according to claim 14, wherein, for multiple attribute data items with the same corresponding geometric information, the attribute data generation unit generates each of the attribute data items by adding the differences between consecutive attribute data items in the sorted order.

19. The information processing apparatus according to claim 14, wherein the attribute data generation unit applies wraparound to generate the attribute data.

20. An information processing method, comprising: transforming, for a point cloud representing a three-dimensional object as a set of points, the coordinate system of geometric data from a polar coordinate system to a Cartesian coordinate system; setting a reference relationship using the generated geometric data in the Cartesian coordinate system, the reference relationship indicating a reference target for calculating a predicted value of attribute data of a processing target point; decoding encoded data to derive a prediction residual, the prediction residual being the difference between the attribute data and the predicted value; and generating the attribute data by adding the derived prediction residual to the predicted value calculated based on the set reference relationship.

21. A computer program product comprising a computer program or instructions which, when executed by a computer, implement the steps of the information processing method according to claim 10 or 20.

22. A computer-readable storage medium having a computer-executable program stored thereon, which, when executed, causes the computer to perform the information processing method according to claim 10 or 20.