Method for storing three-dimensional data, method for acquiring three-dimensional data, three-dimensional data storage device, and three-dimensional data acquisition device.
By storing encoding scheme information in the control information of the encoded stream, the method ensures correct decoding of three-dimensional data, addressing the challenge of multiple encoding schemes and enhancing encoding efficiency and data handling.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- PANASONIC INTELLECTUAL PROPERTY CORP OF AMERICA
- Filing Date
- 2026-04-16
- Publication Date
- 2026-07-02
AI Technical Summary
Existing methods for encoding and decoding three-dimensional data fail to support correct decoding when multiple encoding schemes are used, particularly in the context of point cloud data, leading to issues with multiplexing, transmission, and storage.
A method and device that store information indicating the encoding scheme used for three-dimensional data in the control information of the encoded stream, allowing correct decoding even when different encoding schemes are employed, by using a common format for units that includes information about the type of data contained, independent of the specific encoding scheme.
Enables correct decoding of three-dimensional data across multiple encoding schemes, improving encoding efficiency and facilitating multiplexing, transmission, and storage of point cloud data.
Description
Technical Field
[0001] The present disclosure relates to a three-dimensional data encoding method, a three-dimensional data decoding method, a three-dimensional data encoding apparatus, and a three-dimensional data decoding apparatus.
Background Art
[0002] Devices and services that utilize three-dimensional data are expected to spread in the future across a wide range of fields, such as computer vision, map information, monitoring, infrastructure inspection, or video distribution for autonomous operation of automobiles or robots. Three-dimensional data is acquired by various means, including distance sensors such as range finders, stereo cameras, or combinations of multiple monocular cameras.
[0003] As one method of representing three-dimensional data, there is a representation method called point cloud that represents the shape of a three-dimensional structure by a point group in a three-dimensional space. In a point cloud, the positions and colors of the point group are stored. Although point cloud is expected to become mainstream as a method of representing three-dimensional data, the amount of data in the point group is very large. Therefore, in the accumulation or transmission of three-dimensional data, as in the case of two-dimensional moving images (for example, MPEG-4 AVC or HEVC standardized by MPEG), compression of the amount of data by encoding is essential.
[0004] In addition, point cloud compression is partially supported by a publicly available library (Point Cloud Library) that performs point cloud-related processing.
[0005] In addition, a technique for searching for and displaying facilities located around a vehicle using three-dimensional map data is known (see, for example, Patent Document 1).
Prior Art Documents
Patent Documents
[0006]
Patent Document 1
[0007] Multiple encoding schemes may be used in the encoding and decoding of three-dimensional data.
[0008] The purpose of this disclosure is to provide a three-dimensional data encoding method or three-dimensional data encoding device that can generate an encoded stream from which three-dimensional data can be correctly decoded even when multiple encoding schemes are used, or a three-dimensional data decoding method or three-dimensional data decoding device that can correctly decode such three-dimensional data.
Means for Solving the Problem
[0009] A three-dimensional data storage method according to one aspect of the present disclosure includes acquiring an encoded stream of three-dimensional data including location information, generating one or more units that store the encoded stream, each unit including information indicating the encoding scheme used to encode the three-dimensional data, the information including first information and second information, and the first information and second information being stored at multiple locations within the unit.
[0010] A method for acquiring three-dimensional data according to one aspect of the present disclosure involves acquiring one or more units, and acquiring an encoded stream generated by encoding three-dimensional data including positional information from the one or more units, wherein each unit includes information indicating the encoding scheme used to encode the three-dimensional data, and the information includes first information and second information, and the first information and second information are stored at multiple locations within the unit.
[0011] A three-dimensional data encoding method according to one aspect of the present disclosure generates an encoded stream by encoding three-dimensional data, and stores information indicating the encoding method used for encoding, among a first encoding method and a second encoding method, in the control information of the encoded stream.
[0012] A three-dimensional data decoding method according to one aspect of the present disclosure determines the encoding method used to encode the encoded stream based on information indicating the encoding method used to encode the three-dimensional data, among a first encoding method and a second encoding method, which is included in the control information of the encoded stream generated by encoding the three-dimensional data, and decodes the encoded stream using the determined encoding method.
Effects of the Invention
[0013] This disclosure provides a three-dimensional data encoding method or three-dimensional data encoding device that can generate an encoded stream from which three-dimensional data can be correctly decoded even when multiple encoding schemes are used, or a three-dimensional data decoding method or three-dimensional data decoding device that can correctly decode such three-dimensional data.
Brief Description of the Drawings
[0014]
[Figure 1] Figure 1 shows the configuration of a three-dimensional data encoding and decoding system according to Embodiment 1.
[Figure 2] Figure 2 shows an example of the configuration of point cloud data according to Embodiment 1.
[Figure 3] Figure 3 shows an example of the configuration of a data file containing point cloud data information according to Embodiment 1.
[Figure 4] Figure 4 is a diagram showing the types of point cloud data according to Embodiment 1.
[Figure 5] Figure 5 shows the configuration of the first encoding unit according to Embodiment 1.
[Figure 6] Figure 6 is a block diagram of the first encoding unit according to Embodiment 1.
[Figure 7] Figure 7 shows the configuration of the first decoding unit according to Embodiment 1.
[Figure 8] Figure 8 is a block diagram of the first decoding unit according to Embodiment 1.
[Figure 9] Figure 9 shows the configuration of the second encoding unit according to Embodiment 1.
[Figure 10] Figure 10 is a block diagram of a second encoding unit according to Embodiment 1.
[Figure 11] Figure 11 is a diagram showing the configuration of a second decoding unit according to Embodiment 1.
[Figure 12] Figure 12 is a block diagram of a second decoding unit according to Embodiment 1.
[Figure 13] Figure 13 is a diagram showing a protocol stack related to PCC encoded data according to Embodiment 1.
[Figure 14] Figure 14 is a diagram showing a protocol stack according to Embodiment 1.
[Figure 15] Figure 15 is a diagram showing a syntax example of a NAL unit according to Embodiment 1.
[Figure 16] Figure 16 is a diagram showing a syntax example of a NAL unit header according to Embodiment 1.
[Figure 17] Figure 17 is a diagram showing a semantics example of pcc_codec_type according to Embodiment 1.
[Figure 18] Figure 18 is a diagram showing a semantics example of pcc_nal_unit_type according to Embodiment 1.
[Figure 19] Figure 19 is a flowchart of an encoding process according to Embodiment 1.
[Figure 20] Figure 20 is a flowchart of a decoding process by a second decoding unit according to Embodiment 1.
[Figure 21] Figure 21 is a flowchart of a decoding process by a first decoding unit according to Embodiment 1.
[Figure 22] Figure 22 is a diagram showing a protocol stack according to Embodiment 2.
[Figure 23] Figure 23 is a diagram showing a syntax example of a NAL unit for codec 2 according to Embodiment 2.
[Figure 24] Figure 24 is a diagram showing a syntax example of a NAL unit header for codec 2 according to Embodiment 2.
[Figure 25] Figure 25 shows an example of the semantics of codec2_nal_unit_type according to Embodiment 2.
[Figure 26] Figure 26 shows an example of the syntax of the NAL unit for codec 1 according to Embodiment 2.
[Figure 27] Figure 27 shows an example of the syntax of the NAL unit header for codec 1 according to Embodiment 2.
[Figure 28] Figure 28 shows an example of the semantics of codec1_nal_unit_type according to Embodiment 2.
[Figure 29] Figure 29 is a flowchart of the encoding process according to Embodiment 2.
[Figure 30] Figure 30 is a flowchart of the decoding process according to Embodiment 2.
[Figure 31] Figure 31 is a diagram showing the protocol stack according to Embodiment 3.
[Figure 32] Figure 32 shows an example of the syntax of a NAL unit according to Embodiment 3.
[Figure 33] Figure 33 shows an example of the syntax of a NAL unit header according to Embodiment 3.
[Figure 34] Figure 34 shows an example of the semantics of pcc_nal_unit_type according to Embodiment 3.
[Figure 35] Figure 35 is a flowchart of the encoding process according to Embodiment 3.
[Figure 36] Figure 36 is a flowchart of the decoding process according to Embodiment 3.
[Figure 37] Figure 37 is a flowchart of the encoding process according to a modified example of the embodiment.
[Figure 38] Figure 38 is a flowchart of the decoding process according to a modified example of the embodiment.
[Figure 39] Figure 39 is a block diagram of the encoding unit according to Embodiment 4.
[Figure 40] Figure 40 is a block diagram of the decoding unit according to Embodiment 4.
[Figure 41] Figure 41 is a flowchart of the encoding process according to Embodiment 4.
[Figure 42] Figure 42 is a flowchart of the decoding process according to Embodiment 4.
Modes for Carrying Out the Invention
[0015] A three-dimensional data encoding method according to one aspect of the present disclosure generates an encoded stream by encoding three-dimensional data, and stores information indicating the encoding method used for encoding, among a first encoding method and a second encoding method, in the control information of the encoded stream.
[0016] According to this, when a three-dimensional data decoding device decodes an encoded stream generated by the three-dimensional data encoding method, it can determine the encoding scheme used for encoding using information stored in the control information. Therefore, the three-dimensional data decoding device can correctly decode the encoded stream even when multiple encoding schemes are used.
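The signalling described in paragraphs [0015] and [0016] can be illustrated with a minimal Python sketch. It is not the disclosed bitstream format: the one-byte control field, the identifier values, and the two stand-in codecs (compact JSON and zlib-compressed JSON) are all assumptions chosen only so that the example runs.

```python
import json
import zlib

# Hypothetical identifiers for the two schemes; the actual values are not given here.
FIRST_SCHEME = 0
SECOND_SCHEME = 1


def _encode_first(points):
    # Stand-in for the first encoding scheme: compact JSON text.
    return json.dumps(points, separators=(",", ":")).encode("utf-8")


def _encode_second(points):
    # Stand-in for the second encoding scheme: zlib-compressed JSON text.
    return zlib.compress(json.dumps(points).encode("utf-8"))


def _decode_first(payload):
    return json.loads(payload.decode("utf-8"))


def _decode_second(payload):
    return json.loads(zlib.decompress(payload).decode("utf-8"))


ENCODERS = {FIRST_SCHEME: _encode_first, SECOND_SCHEME: _encode_second}
DECODERS = {FIRST_SCHEME: _decode_first, SECOND_SCHEME: _decode_second}


def encode_stream(points, scheme):
    """Encode points and record the scheme in a one-byte control field (assumed layout)."""
    control_info = bytes([scheme])
    return control_info + ENCODERS[scheme](points)


def decode_stream(stream):
    """Read the scheme from the control field, then decode with the matching decoder."""
    scheme = stream[0]
    if scheme not in DECODERS:
        raise ValueError(f"unknown encoding scheme: {scheme}")
    return DECODERS[scheme](stream[1:])
```

Because the decoder reads the scheme from the stream itself, a call such as `decode_stream(encode_stream(points, SECOND_SCHEME))` needs no out-of-band knowledge of which scheme the encoder chose.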
[0017] For example, the three-dimensional data may include location information, the encoding may encode the location information, and the storing may store, in the control information of the location information, information indicating which of the first and second encoding methods was used to encode the location information.
[0018] For example, the three-dimensional data may include location information and attribute information, the encoding may encode the location information and the attribute information respectively, and the storing may store, in the control information of the location information, information indicating which of the first encoding method and the second encoding method was used to encode the location information, and may store, in the control information of the attribute information, information indicating which of the first encoding method and the second encoding method was used to encode the attribute information.
[0019] According to this method, different encoding schemes can be used for location information and attribute information, thereby improving encoding efficiency.
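One way to see why separate signalling for location information and attribute information can help efficiency is the following sketch. It is only an illustration under stated assumptions: the two "schemes" are stand-ins (pass-through and zlib), the identifiers are hypothetical, and choosing the smaller output per component is a policy invented for this example, not one stated in the disclosure.

```python
import struct
import zlib
from dataclasses import dataclass

FIRST_SCHEME, SECOND_SCHEME = 0, 1  # hypothetical identifiers

# Stand-in codecs as (encode, decode) pairs: pass-through and zlib.
CODECS = {
    FIRST_SCHEME: (lambda raw: raw, lambda data: data),
    SECOND_SCHEME: (zlib.compress, zlib.decompress),
}


@dataclass(frozen=True)
class Component:
    scheme: int     # control information: which scheme encoded this component
    payload: bytes  # encoded data of the component


def encode_component(raw):
    """Try every scheme and keep the smallest result together with its scheme id."""
    candidates = [Component(s, enc(raw)) for s, (enc, _) in CODECS.items()]
    return min(candidates, key=lambda c: len(c.payload))


def decode_component(component):
    """Decode using the scheme recorded in the component's control information."""
    _, decode = CODECS[component.scheme]
    return decode(component.payload)


def encode_point_cloud(positions, colors):
    """Encode location and attribute information independently of each other."""
    geometry_raw = b"".join(struct.pack("<3f", *p) for p in positions)
    attribute_raw = bytes(channel for rgb in colors for channel in rgb)
    return encode_component(geometry_raw), encode_component(attribute_raw)
```

With highly redundant positions and a very short colour list, the two components end up with different scheme identifiers, and each is still decoded correctly because the identifier travels with the component.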
[0020] For example, the three-dimensional data encoding method may further store the encoded stream in one or more units.
[0021] For example, the unit may have a format common to both the first and second encoding schemes, and may include information indicating the type of data contained in the unit, which has definitions independent of the first and second encoding schemes.
[0022] For example, the unit may have a format independent of the first encoding scheme and the second encoding scheme, and may include information indicating the type of data contained in the unit, which has a definition independent of the first encoding scheme and the second encoding scheme.
[0023] For example, the unit may have a format common to both the first and second encoding schemes, and may include information indicating the type of data contained in the unit, which has a definition common to both the first and second encoding schemes.
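A unit of the kind described in paragraph [0023], with a format and type definitions shared by both schemes, might look like the sketch below. The field names `pcc_codec_type` and `pcc_nal_unit_type` appear in the figure list of this document, but the bit widths (2 and 6 bits), the enumeration values, and the 4-byte length field are assumptions made for this example.

```python
from enum import IntEnum


class CodecType(IntEnum):
    # Hypothetical values for pcc_codec_type.
    CODEC_1 = 0
    CODEC_2 = 1


class UnitType(IntEnum):
    # Hypothetical values for pcc_nal_unit_type, shared by both codecs.
    GEOMETRY = 0
    ATTRIBUTE = 1
    METADATA = 2


def pack_unit(codec, unit_type, payload):
    """Build one unit: 1-byte header, 4-byte big-endian payload length, payload."""
    header = (int(codec) << 6) | int(unit_type)  # 2-bit codec, 6-bit unit type
    return bytes([header]) + len(payload).to_bytes(4, "big") + payload


def parse_units(buffer):
    """Split a buffer into (codec, unit_type, payload) tuples."""
    units = []
    offset = 0
    while offset < len(buffer):
        header = buffer[offset]
        size = int.from_bytes(buffer[offset + 1:offset + 5], "big")
        payload = buffer[offset + 5:offset + 5 + size]
        if len(payload) != size:
            raise ValueError("truncated unit")
        units.append((CodecType(header >> 6), UnitType(header & 0x3F), payload))
        offset += 5 + size
    return units
```

Since the header layout does not depend on the codec, a receiver can route or skip units by type without invoking either decoder.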
[0024] A three-dimensional data decoding method according to one aspect of the present disclosure determines the encoding method used to encode the encoded stream based on information indicating the encoding method used to encode the three-dimensional data, among a first encoding method and a second encoding method, which is included in the control information of the encoded stream generated by encoding the three-dimensional data, and decodes the encoded stream using the determined encoding method.
[0025] According to this, the three-dimensional data decoding method can determine the encoding scheme used for encoding by using the information stored in the control information when decoding the encoded stream. Therefore, the three-dimensional data decoding method can correctly decode the encoded stream even when multiple encoding schemes are used.
[0026] For example, the unit may have a format common to both the first and second encoding schemes, and may include information indicating the type of data contained in the unit, which has a definition common to both the first and second encoding schemes.
[0027] For example, the three-dimensional data includes location information and attribute information, and the encoded stream includes encoded data of the location information and encoded data of the attribute information. The determination determines the encoding method used to encode the location information based on information, included in the control information of the location information in the encoded stream, indicating which of the first and second encoding methods was used to encode the location information, and determines the encoding method used to encode the attribute information based on information, included in the control information of the attribute information in the encoded stream, indicating which of the first and second encoding methods was used to encode the attribute information. The decoding decodes the encoded data of the location information using the encoding method determined for the location information, and decodes the encoded data of the attribute information using the encoding method determined for the attribute information.
[0028] According to this method, different encoding schemes can be used for location information and attribute information, thereby improving encoding efficiency.
[0029] For example, the encoded stream is stored in one or more units, and the three-dimensional data decoding method may further acquire the encoded stream from the one or more units.
[0030] For example, the unit may have a format common to both the first and second encoding schemes, and may include information indicating the type of data contained in the unit, which has definitions independent of the first and second encoding schemes.
[0031] For example, the unit may have a format independent of the first encoding scheme and the second encoding scheme, and may include information indicating the type of data contained in the unit, which has a definition independent of the first encoding scheme and the second encoding scheme.
[0032] For example, the unit may have a format common to both the first and second encoding schemes, and may include information indicating the type of data contained in the unit, which has a definition common to both the first and second encoding schemes.
[0033] Furthermore, a three-dimensional data encoding device according to one aspect of the present disclosure is a three-dimensional data encoding device for encoding a plurality of three-dimensional points having attribute information, comprising a processor and a memory, wherein the processor generates an encoded stream by encoding three-dimensional data using the memory, and stores information indicating the encoding method used for the encoding, among a first encoding method and a second encoding method, in the control information of the encoded stream.
[0034] According to this, when decoding an encoded stream generated by a three-dimensional data encoding device, the three-dimensional data decoding device can determine the encoding method used for encoding using information stored in the control information. Therefore, the three-dimensional data decoding device can correctly decode the encoded stream even when multiple encoding methods are used.
[0035] Furthermore, a three-dimensional data decoding device according to one aspect of the present disclosure is a three-dimensional data decoding device for decoding a plurality of three-dimensional points having attribute information, comprising a processor and a memory, wherein the processor uses the memory to determine the encoding method used to encode the encoded stream based on information indicating the encoding method used to encode the three-dimensional data, among a first encoding method and a second encoding method, which is included in the control information of the encoded stream generated by encoding the three-dimensional data, and decodes the encoded stream using the determined encoding method.
[0036] According to this, when decoding an encoded stream, the three-dimensional data decoding device can determine the encoding scheme used by using the information stored in the control information. Therefore, the three-dimensional data decoding device can correctly decode an encoded stream even when multiple encoding schemes are used.
[0037] These comprehensive or specific embodiments may be implemented as a system, method, integrated circuit, computer program, or recording medium such as a computer-readable CD-ROM, or as any combination of a system, method, integrated circuit, computer program, and recording medium.
[0038] The embodiments will be described in detail below with reference to the drawings. Note that the embodiments described below are all specific examples of this disclosure. The numerical values, shapes, materials, components, arrangement and connection configurations of components, steps, and the order of steps shown in the following embodiments are examples only and are not intended to limit this disclosure. Furthermore, among the components in the following embodiments, those not described in the independent claim representing the highest-level concept will be described as optional components.
[0039] (Embodiment 1) When using encoded point cloud data in actual devices or services, it is desirable to send and receive necessary information depending on the application in order to reduce network bandwidth. However, until now, such functionality has not existed in the encoded structure of three-dimensional data, nor has there been an encoding method for that purpose.
[0040] This embodiment describes a three-dimensional data encoding method and a three-dimensional data encoding device for providing a function to send and receive information necessary for use in encoded data of a three-dimensional point cloud, a three-dimensional data decoding method and a three-dimensional data decoding device for decoding the encoded data, a three-dimensional data multiplexing method for multiplexing the encoded data, and a three-dimensional data transmission method for transmitting the encoded data.
[0041] In particular, while two encoding methods are currently being considered for encoding point cloud data, the structure of the encoded data and the method for storing the encoded data in a system format have not been defined. As a result, there is a problem in that MUX processing (multiplexing), transmission, or storage cannot be performed in the encoding unit.
[0042] Furthermore, there has been no existing method to support formats like PCC (Point Cloud Compression) that use a mixture of two codecs, a first encoding method and a second encoding method.
[0043] This embodiment describes the structure of PCC encoded data in which two codecs, a first encoding method and a second encoding method, coexist, and a method for storing the encoded data in a system format.
[0044] First, the configuration of the three-dimensional data (point cloud data) encoding and decoding system according to this embodiment will be described. Figure 1 is a diagram showing an example of the configuration of the three-dimensional data encoding and decoding system according to this embodiment. As shown in Figure 1, the three-dimensional data encoding and decoding system includes a three-dimensional data encoding system 4601, a three-dimensional data decoding system 4602, a sensor terminal 4603, and an external connection unit 4604.
[0045] The three-dimensional data encoding system 4601 generates encoded data or multiplexed data by encoding point cloud data, which is three-dimensional data. The three-dimensional data encoding system 4601 may be a three-dimensional data encoding device implemented by a single device, or it may be a system implemented by multiple devices. Furthermore, the three-dimensional data encoding device may include some of the multiple processing units included in the three-dimensional data encoding system 4601.
[0046] The three-dimensional data encoding system 4601 includes a point cloud data generation system 4611, a presentation unit 4612, an encoding unit 4613, a multiplexing unit 4614, an input / output unit 4615, and a control unit 4616. The point cloud data generation system 4611 includes a sensor information acquisition unit 4617 and a point cloud data generation unit 4618.
[0047] The sensor information acquisition unit 4617 acquires sensor information from the sensor terminal 4603 and outputs the sensor information to the point cloud data generation unit 4618. The point cloud data generation unit 4618 generates point cloud data from the sensor information and outputs the point cloud data to the encoding unit 4613.
[0048] The presentation unit 4612 presents sensor information or point cloud data to the user. For example, the presentation unit 4612 displays information or images based on sensor information or point cloud data.
[0049] The encoding unit 4613 encodes (compresses) the point cloud data and outputs the resulting encoded data, control information obtained during the encoding process, and other additional information to the multiplexing unit 4614. The additional information includes, for example, sensor information.
[0050] The multiplexing unit 4614 generates multiplexed data by multiplexing the encoded data input from the encoding unit 4613, control information, and additional information. The format of the multiplexed data is, for example, a file format for storage or a packet format for transmission.
[0051] The input / output unit 4615 (for example, the communication unit or interface) outputs the multiplexed data to the outside. Alternatively, the multiplexed data is stored in a storage unit such as internal memory. The control unit 4616 (or application execution unit) controls each processing unit. In other words, the control unit 4616 performs control such as encoding and multiplexing.
[0052] The sensor information may also be input to the encoding unit 4613 or the multiplexing unit 4614. Furthermore, the input / output unit 4615 may output the point cloud data or encoded data directly to the outside.
[0053] The transmission signal (multiplexed data) output from the three-dimensional data encoding system 4601 is input to the three-dimensional data decoding system 4602 via the external connection unit 4604.
[0054] The three-dimensional data decoding system 4602 generates point cloud data, which is three-dimensional data, by decoding encoded data or multiplexed data. The three-dimensional data decoding system 4602 may be a three-dimensional data decoding device implemented by a single device, or it may be a system implemented by multiple devices. Furthermore, the three-dimensional data decoding device may include some of the multiple processing units included in the three-dimensional data decoding system 4602.
[0055] The three-dimensional data decoding system 4602 includes a sensor information acquisition unit 4621, an input / output unit 4622, a demultiplexing unit 4623, a decoding unit 4624, a presentation unit 4625, a user interface 4626, and a control unit 4627.
[0056] The sensor information acquisition unit 4621 acquires sensor information from the sensor terminal 4603.
[0057] The input / output unit 4622 acquires the transmission signal, decodes the multiplexed data (file format or packet) from the transmission signal, and outputs the multiplexed data to the demultiplexing unit 4623.
[0058] The demultiplexing unit 4623 acquires encoded data, control information, and additional information from the multiplexed data, and outputs the encoded data, control information, and additional information to the decoding unit 4624.
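The relationship between the multiplexing unit 4614 and the demultiplexing unit 4623 can be sketched as a round trip over three byte strings. The length-prefixed container used here is a hypothetical stand-in chosen for brevity; it is not one of the file or packet formats the document names.

```python
import struct


def multiplex(encoded_data, control_info, additional_info):
    """Concatenate the three inputs, each preceded by a 4-byte big-endian length."""
    parts = (encoded_data, control_info, additional_info)
    return b"".join(struct.pack(">I", len(part)) + part for part in parts)


def demultiplex(muxed):
    """Recover (encoded_data, control_info, additional_info) from multiplexed data."""
    parts = []
    offset = 0
    for _ in range(3):
        (size,) = struct.unpack_from(">I", muxed, offset)
        offset += 4
        part = muxed[offset:offset + size]
        if len(part) != size:
            raise ValueError("truncated multiplexed data")
        parts.append(part)
        offset += size
    return tuple(parts)
```

For instance, `demultiplex(multiplex(b"geom", b"\x01", b""))` yields the original three byte strings, which the demultiplexing side would then hand to the decoding unit.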
[0059] The decoding unit 4624 reconstructs the point cloud data by decoding the encoded data.
[0060] The presentation unit 4625 presents point cloud data to the user. For example, the presentation unit 4625 displays information or images based on the point cloud data. The user interface 4626 acquires instructions based on user operations. The control unit 4627 (or application execution unit) controls each processing unit. In other words, the control unit 4627 performs control such as demultiplexing, decoding, and presentation.
[0061] The input / output unit 4622 may acquire point cloud data or encoded data directly from an external source. The presentation unit 4625 may acquire additional information such as sensor information and present information based on that additional information. The presentation unit 4625 may also make presentations based on user instructions acquired through the user interface 4626.
[0062] The sensor terminal 4603 generates sensor information, which is information obtained from the sensor. The sensor terminal 4603 is a terminal equipped with a sensor or camera, and may be, for example, a mobile object such as an automobile, an aerial object such as an airplane, a mobile terminal, or a camera.
[0063] The sensor information that can be acquired by the sensor terminal 4603 includes, for example, (1) the distance between the sensor terminal 4603 and the object, or the reflectivity of the object, obtained from a LiDAR, millimeter-wave radar, or infrared sensor, and (2) the distance between the camera and the object, or the reflectivity of the object, obtained from multiple monocular camera images or stereo camera images. The sensor information may also include the sensor's attitude, orientation, gyroscope (angular velocity), position (GPS information or altitude), speed, or acceleration. The sensor information may also include temperature, atmospheric pressure, humidity, or magnetism.
[0064] The external connection unit 4604 is implemented by an integrated circuit (LSI or IC), an external storage unit, communication with a cloud server via the internet, or broadcasting, etc.
[0065] Next, we will explain point cloud data. Figure 2 shows the structure of point cloud data. Figure 3 shows an example of the structure of a data file containing information about point cloud data.
[0066] Point cloud data contains data for multiple points. Each point's data includes location information (three-dimensional coordinates) and attribute information related to that location. A collection of these points is called a point cloud. For example, a point cloud represents the three-dimensional shape of an object.
[0067] Position information, such as three-dimensional coordinates, is sometimes referred to as geometry. Furthermore, the data for each point may include attribute information of multiple attribute types. Attribute types include, for example, color or reflectance.
[0068] One piece of location information may be associated with one piece of attribute information, or with multiple pieces of attribute information of different attribute types. Furthermore, multiple pieces of attribute information of the same attribute type may be associated with one piece of location information.
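These associations can be modelled with a small data structure. The sketch below is one possible in-memory representation, assuming a dictionary that maps each attribute type to a list of values; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    position: tuple  # (x, y, z) location information
    # Attribute type -> list of values, so that one position can carry several
    # attribute types and several values of the same type.
    attributes: dict = field(default_factory=dict)

    def add_attribute(self, attribute_type, value):
        """Attach one more attribute value of the given type to this point."""
        self.attributes.setdefault(attribute_type, []).append(value)


# Usage: one position with two colours and one reflectance value.
point = Point((1.0, 2.0, 3.0))
point.add_attribute("color", (255, 0, 0))
point.add_attribute("color", (0, 255, 0))
point.add_attribute("reflectance", 0.5)
```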
[0069] The example data file structure shown in Figure 3 represents a case where location information and attribute information correspond one-to-one, and it shows the location information and attribute information of the N points that make up the point cloud data.
[0070] Location information includes, for example, information for the three axes: x, y, and z. Attribute information includes, for example, RGB color information. A typical data file is a ply file.
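As an illustration of such a file, the following sketch serializes points with x, y, z coordinates and RGB colour into ASCII PLY text. The function name and the input layout (a list of position and colour tuples) are assumptions for this example; the header follows the commonly used ASCII PLY vertex layout.

```python
def to_ascii_ply(points):
    """Serialize [((x, y, z), (r, g, b)), ...] as ASCII PLY text."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    # One line per point: location information followed by attribute information.
    body = [f"{x} {y} {z} {r} {g} {b}" for (x, y, z), (r, g, b) in points]
    return "\n".join(header + body) + "\n"
```

Each body line pairs one position with one colour, which matches the one-to-one case described for Figure 3.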
[0071] Next, we will explain the types of point cloud data. Figure 4 is a diagram illustrating the types of point cloud data. As shown in Figure 4, point cloud data includes static objects and dynamic objects.
[0072] A static object is three-dimensional point cloud data for any given time (a specific moment). A dynamic object is three-dimensional point cloud data that changes over time. Hereafter, three-dimensional point cloud data for a given time will be referred to as a PCC frame, or simply a frame.
[0073] The object can be a point cloud with a somewhat limited area, like regular video data, or it can be a large-scale point cloud with no area limitations, like map information.
[0074] Furthermore, point cloud data of various densities may exist, including both sparse and dense point cloud data.
[0075] The details of each processing unit are described below. Sensor information is acquired by various methods, such as distance sensors like LiDAR or rangefinders, stereo cameras, or combinations of multiple monocular cameras. The point cloud data generation unit 4618 generates point cloud data based on the sensor information obtained by the sensor information acquisition unit 4617. The point cloud data generation unit 4618 generates position information as point cloud data and adds attribute information to the position information.
[0076] The point cloud data generation unit 4618 may process the point cloud data when generating position information or adding attribute information. For example, the point cloud data generation unit 4618 may reduce the amount of data by deleting points with overlapping positions. The point cloud data generation unit 4618 may also transform the position information (such as position shifting, rotation, or normalization) or render the attribute information.
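Two of the processing steps mentioned here, removing points with overlapping positions and normalizing positions, can be sketched as follows. The policy of keeping the first point at each position and the choice to scale by the largest axis extent are assumptions made for this example.

```python
def remove_duplicate_positions(points):
    """Keep only the first (position, attribute) pair seen for each position."""
    seen = set()
    unique = []
    for position, attribute in points:
        if position not in seen:
            seen.add(position)
            unique.append((position, attribute))
    return unique


def normalize_positions(points):
    """Shift the minimum corner to the origin and scale the largest extent to 1."""
    if not points:
        return []
    mins = [min(position[axis] for position, _ in points) for axis in range(3)]
    maxs = [max(position[axis] for position, _ in points) for axis in range(3)]
    # Fall back to 1.0 when all points coincide, to avoid dividing by zero.
    extent = max(hi - lo for lo, hi in zip(mins, maxs)) or 1.0
    return [
        (tuple((c - lo) / extent for c, lo in zip(position, mins)), attribute)
        for position, attribute in points
    ]
```

Scaling all three axes by the same factor preserves the shape of the point cloud, while the shift makes the coordinates non-negative.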
[0077] In Figure 1, the point cloud data generation system 4611 is included in the three-dimensional data encoding system 4601, but it may also be provided independently outside of the three-dimensional data encoding system 4601.
[0078] The encoding unit 4613 generates encoded data by encoding the point cloud data based on a predetermined encoding scheme. There are two main types of encoding schemes. The first is an encoding scheme that uses positional information, which will be referred to as the first encoding scheme hereafter. The second is an encoding scheme that uses a video codec, which will be referred to as the second encoding scheme hereafter.
[0079] The decoding unit 4624 decodes the point cloud data by decoding the encoded data based on a predetermined encoding scheme.
[0080] The multiplexing unit 4614 generates multiplexed data by multiplexing the encoded data using an existing multiplexing method. The generated multiplexed data is transmitted or stored. In addition to PCC encoded data, the multiplexing unit 4614 multiplexes other media such as video, audio, subtitles, applications, files, or reference time information. Furthermore, the multiplexing unit 4614 may also multiplex attribute information related to sensor information or point cloud data.
[0081] Multiplexing methods or file formats include ISOBMFF, ISOBMFF-based transmission methods such as MPEG-DASH, MMT, MPEG-2 TS Systems, and RMP.
[0082] The demultiplexing unit 4623 extracts PCC encoded data, other media, and time information from the multiplexed data.
[0083] The input / output unit 4615 transmits the multiplexed data using a method appropriate to the transmission medium or storage medium, such as broadcasting or communication. The input / output unit 4615 may communicate with other devices via the internet or with storage units such as cloud servers.
[0084] Communication protocols such as HTTP, FTP, TCP, or UDP can be used. Either a pull-type or push-type communication method may be employed.
[0085] Either wired or wireless transmission may be used. Wired transmission methods include Ethernet®, USB, RS-232C, HDMI®, or coaxial cable. Wireless transmission methods include wireless LAN, Wi-Fi®, Bluetooth®, or millimeter wave.
[0086] Furthermore, broadcasting formats such as DVB-T2, DVB-S2, DVB-C2, ATSC3.0, or ISDB-S3 may be used.
[0087] Figure 5 shows the configuration of a first encoding unit 4630, which is an example of an encoding unit 4613 that performs encoding using the first encoding scheme. Figure 6 is a block diagram of the first encoding unit 4630. The first encoding unit 4630 generates encoded data (encoded stream) by encoding point cloud data using the first encoding scheme. This first encoding unit 4630 includes a location information encoding unit 4631, an attribute information encoding unit 4632, an additional information encoding unit 4633, and a multiplexing unit 4634.
[0088] The first encoding unit 4630 is characterized by performing encoding while being aware of the three-dimensional structure. Furthermore, the first encoding unit 4630 is characterized by the attribute information encoding unit 4632 performing encoding using information obtained from the location information encoding unit 4631. The first encoding scheme is also called GPCC (Geometry based PCC).
[0089] The point cloud data is PCC point cloud data such as a PLY file, or PCC point cloud data generated from sensor information, and includes position information, attribute information, and other additional information (metadata). The position information is input to the location information encoding unit 4631, the attribute information is input to the attribute information encoding unit 4632, and the additional information is input to the additional information encoding unit 4633.
[0090] The location information encoding unit 4631 generates encoded location information (Compressed Geometry), which is encoded data, by encoding location information. For example, the location information encoding unit 4631 encodes location information using an N-tree structure such as an octree. Specifically, in an octree, the target space is divided into 8 nodes (subspaces), and 8 bits of information (occupancy code) are generated to indicate whether or not each node contains a point cloud. Furthermore, each node containing a point cloud is further divided into 8 nodes, and 8 bits of information are generated to indicate whether or not a point cloud is contained in each of these 8 nodes. This process is repeated until a predetermined hierarchy level is reached or until the number of points contained in a node falls to or below a threshold.
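The octree subdivision described above can be sketched as follows. This is a simplified illustration under stated assumptions: points have non-negative integer coordinates inside a cube of side 2**depth, subdivision stops only at a fixed depth (the threshold-based stop is omitted), codes are emitted in breadth-first order, and the bit assignment within an occupancy code (x in bit 2, y in bit 1, z in bit 0 of the child index) is chosen for this example only. The function name `encode_octree` is hypothetical.

```python
def encode_octree(points, depth):
    """Return breadth-first 8-bit occupancy codes for integer points.

    Points must lie in a cube of side 2**depth whose corner is at the origin.
    """
    codes = []
    # Each entry is (origin of the node, points inside the node).
    level = [((0, 0, 0), list(set(points)))]
    for d in range(depth):
        half = 1 << (depth - d - 1)  # side length of the child nodes
        next_level = []
        for (ox, oy, oz), pts in level:
            children = [[] for _ in range(8)]
            for x, y, z in pts:
                index = (
                    (int(x >= ox + half) << 2)
                    | (int(y >= oy + half) << 1)
                    | int(z >= oz + half)
                )
                children[index].append((x, y, z))
            code = 0
            for index, child_pts in enumerate(children):
                if child_pts:
                    code |= 1 << index  # mark this child as occupied
                    origin = (
                        ox + half * ((index >> 2) & 1),
                        oy + half * ((index >> 1) & 1),
                        oz + half * (index & 1),
                    )
                    next_level.append((origin, child_pts))
            codes.append(code)
        level = next_level
    return codes
```

For two points at opposite corners of a 4×4×4 cube, `encode_octree([(0, 0, 0), (3, 3, 3)], 2)` yields one root code with two bits set, followed by one code for each occupied child.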
[0091] The attribute information encoding unit 4632 generates encoded attribute information (Compressed Attribute), which is encoded data, by encoding the attribute information using the configuration information generated by the location information encoding unit 4631. For example, the attribute information encoding unit 4632 determines the reference point (reference node) to be referenced in encoding the target point (target node) to be processed, based on the octree structure generated by the location information encoding unit 4631. For example, the attribute information encoding unit 4632 references a surrounding node or adjacent node whose parent node in the octree is the same as that of the target node. Note that the method for determining the reference relationship is not limited to this.
[0092] Furthermore, the attribute information encoding process may include at least one of the following: quantization, prediction, and arithmetic encoding. In this case, "reference" means using a reference node to calculate the predicted value of the attribute information, or using the state of a reference node (for example, occupancy information indicating whether or not the reference node contains a point cloud) to determine the encoding parameters. For example, encoding parameters may be quantization parameters in the quantization process, or context in arithmetic encoding.
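As an illustration of prediction combined with quantization, the following sketch predicts an attribute value from the values of the reference nodes and quantizes the prediction residual. It is an assumption-laden example, not the method of the disclosure: the predictor (integer mean of the reference values), the uniform quantization step `qstep`, and both function names are hypothetical, and arithmetic encoding is omitted.

```python
def predict(reference_values):
    """Predict an attribute as the integer mean of the reference node values."""
    if not reference_values:
        return 0  # no occupied reference node: fall back to a fixed prediction
    return sum(reference_values) // len(reference_values)


def encode_attribute(value, reference_values, qstep):
    """Return the quantized prediction residual for one attribute value."""
    residual = value - predict(reference_values)
    return round(residual / qstep)


def decode_attribute(level, reference_values, qstep):
    """Invert encode_attribute up to the quantization error."""
    return predict(reference_values) + level * qstep
```

Because the decoder derives the same prediction from the same reference nodes, only the quantized residual needs to be carried in the stream; the reconstruction error is bounded by the quantization step.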
[0093] The additional information encoding unit 4633 generates encoded additional information (Compressed MetaData), which is encoded data, by encoding the compressible data among the additional information.
[0094] The multiplexing unit 4634 generates a compressed stream, which is encoded data, by multiplexing encoded position information, encoded attribute information, encoded additional information, and other additional information. The generated compressed stream is output to a processing unit of the system layer (not shown).
[0095] Next, a first decoding unit 4640, which is an example of a decoding unit 4624 that performs decoding for the first encoding scheme, will be described. Figure 7 is a diagram showing the configuration of the first decoding unit 4640. Figure 8 is a block diagram of the first decoding unit 4640. The first decoding unit 4640 generates point cloud data by decoding, in accordance with the first encoding scheme, encoded data (an encoded stream) that was encoded with the first encoding scheme. This first decoding unit 4640 includes a demultiplexing unit 4641, a location information decoding unit 4642, an attribute information decoding unit 4643, and an additional information decoding unit 4644.
[0096] A compressed stream, which is encoded data, is input to the first decoding unit 4640 from a processing unit of the system layer (not shown).
[0097] The demultiplexing unit 4641 separates encoded location information (Compressed Geometry), encoded attribute information (Compressed Attribute), encoded additional information (Compressed MetaData), and other additional information from the encoded data.
[0098] The location information decoding unit 4642 generates location information by decoding the encoded location information. For example, the location information decoding unit 4642 reconstructs the location information of a point cloud represented by three-dimensional coordinates from encoded location information represented by an N-tree structure such as an octree.
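The reconstruction of three-dimensional coordinates from an octree representation can be sketched as follows. The sketch assumes breadth-first occupancy codes, a fixed tree depth, and a bit assignment in which x, y, and z occupy bits 2, 1, and 0 of the child index; these conventions and the function name `decode_octree` are assumptions made for this example.

```python
def decode_octree(codes, depth):
    """Rebuild integer point coordinates from breadth-first occupancy codes."""
    level = [(0, 0, 0)]  # origins of the occupied nodes at the current depth
    position = 0         # index of the next occupancy code to read
    for d in range(depth):
        half = 1 << (depth - d - 1)  # side length of the child nodes
        next_level = []
        for ox, oy, oz in level:
            code = codes[position]
            position += 1
            for index in range(8):
                if code & (1 << index):  # child `index` is occupied
                    next_level.append((
                        ox + half * ((index >> 2) & 1),
                        oy + half * ((index >> 1) & 1),
                        oz + half * (index & 1),
                    ))
        level = next_level
    return level
```

At the final depth each occupied node has side length 1, so its origin is the coordinate of a point.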
[0099] The attribute information decoding unit 4643 decodes the encoded attribute information based on the configuration information generated by the location information decoding unit 4642. For example, the attribute information decoding unit 4643 determines the reference point (reference node) to be referenced in the decoding of the target point (target node) to be processed, based on the octree structure obtained by the location information decoding unit 4642. For example, the attribute information decoding unit 4643 references a surrounding node or adjacent node whose parent node in the octree is the same as that of the target node. Note that the method for determining the reference relationship is not limited to this.
[0100] Furthermore, the attribute information decoding process may include at least one of the following: inverse quantization, prediction, and arithmetic decoding. In this case, "reference" means using a reference node to calculate the predicted value of the attribute information, or using the state of the reference node (for example, occupancy information indicating whether or not the reference node contains a point cloud) to determine the decoding parameters. For example, decoding parameters may be quantization parameters in the inverse quantization process, or context in arithmetic decoding.
[0101] The additional information decoding unit 4644 generates additional information by decoding the encoded additional information. The first decoding unit 4640 uses the additional information necessary for decoding location information and attribute information during decoding and outputs the additional information necessary for the application to the outside.
[0102] Next, a second encoding unit 4650, which is an example of an encoding unit 4613 that performs encoding using the second encoding scheme, will be described. Figure 9 is a diagram showing the configuration of the second encoding unit 4650. Figure 10 is a block diagram of the second encoding unit 4650.
[0103] The second encoding unit 4650 generates encoded data (encoded stream) by encoding the point cloud data using a second encoding method. This second encoding unit 4650 includes an additional information generation unit 4651, a position image generation unit 4652, an attribute image generation unit 4653, a video encoding unit 4654, an additional information encoding unit 4655, and a multiplexing unit 4656.
[0104] The second encoding unit 4650 generates a position image and an attribute image by projecting a three-dimensional structure onto a two-dimensional image, and then encodes the generated position image and attribute image using an existing video encoding scheme. The second encoding scheme is also called VPCC (Video based PCC).
[0105] Point cloud data is PCC point cloud data such as a PLY file, or PCC point cloud data generated from sensor information, and includes position information, attribute information, and other additional information (metadata).
[0106] The additional information generation unit 4651 generates map information for multiple two-dimensional images by projecting a three-dimensional structure onto a two-dimensional image.
[0107] The position image generation unit 4652 generates a position image (geometry image) based on position information and map information generated by the additional information generation unit 4651. This position image is, for example, a depth image in which the distance is indicated as a pixel value. This depth image may be an image of multiple point clouds viewed from one viewpoint (an image of multiple point clouds projected onto a single two-dimensional plane), or multiple images of multiple point clouds viewed from multiple viewpoints, or a single image formed by integrating these multiple images.
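A depth image of the kind described above can be sketched with a single orthographic projection onto the XY plane. This is a deliberately reduced illustration: the projection direction, the convention that the pixel value is z + 1 (so that 0 can mark an empty pixel), and the function name `make_depth_image` are assumptions of this example, and the patch segmentation carried by the map information is not modeled.

```python
def make_depth_image(points, width, height, empty=0):
    """Project integer points onto the XY plane, keeping the nearest depth.

    The pixel value is z + 1 so that `empty` (0) marks pixels with no point.
    """
    image = [[empty] * width for _ in range(height)]
    for x, y, z in points:
        depth = z + 1
        # When several points fall on one pixel, keep the one closest to the plane.
        if image[y][x] == empty or depth < image[y][x]:
            image[y][x] = depth
    return image
```

Points hidden behind a nearer point are lost in a single view, which is one reason multiple viewpoints or multiple images may be used.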
[0108] The attribute image generation unit 4653 generates an attribute image based on attribute information and map information generated by the additional information generation unit 4651. This attribute image is, for example, an image in which attribute information (e.g., color (RGB)) is shown as pixel values. This image may be an image of multiple point clouds viewed from one viewpoint (an image of multiple point clouds projected onto a single two-dimensional plane), or multiple images of multiple point clouds viewed from multiple viewpoints, or a single image formed by integrating these multiple images.
[0109] The video encoding unit 4654 generates encoded data, namely an encoded position image (Compressed Geometry Image) and an encoded attribute image (Compressed Attribute Image), by encoding the position image and attribute image using a video encoding scheme. Any known encoding scheme may be used as the video encoding scheme. For example, the video encoding scheme may be AVC or HEVC.
[0110] The additional information encoding unit 4655 generates encoded additional information (Compressed MetaData) by encoding additional information and map information included in the point cloud data.
[0111] The multiplexing unit 4656 generates a compressed stream, which is encoded data, by multiplexing the encoded position image, encoded attribute image, encoded additional information, and other additional information. The generated compressed stream is output to a processing unit of the system layer (not shown).
[0112] Next, a second decoding unit 4660, which is an example of a decoding unit 4624 that performs decoding for the second encoding scheme, will be described. Figure 11 is a diagram showing the configuration of the second decoding unit 4660. Figure 12 is a block diagram of the second decoding unit 4660. The second decoding unit 4660 generates point cloud data by decoding, in accordance with the second encoding scheme, encoded data (an encoded stream) that was encoded with the second encoding scheme. This second decoding unit 4660 includes a demultiplexing unit 4661, a video decoding unit 4662, an additional information decoding unit 4663, a location information generation unit 4664, and an attribute information generation unit 4665.
[0113] A compressed stream, which is encoded data, is input to the second decoding unit 4660 from a processing unit of the system layer (not shown).
[0114] The demultiplexing unit 4661 separates the encoded location image (Compressed Geometry Image), encoded attribute image (Compressed Attribute Image), encoded additional information (Compressed MetaData), and other additional information from the encoded data.
[0115] The video decoding unit 4662 generates a position image and an attribute image by decoding the encoded position image and the encoded attribute image using a video encoding scheme. Any known encoding scheme may be used as the video encoding scheme. For example, the video encoding scheme may be AVC or HEVC.
[0116] The additional information decoding unit 4663 generates additional information, including map information, by decoding the encoded additional information.
[0117] The location information generation unit 4664 generates location information using the location image and map information. The attribute information generation unit 4665 generates attribute information using the attribute image and map information.
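The generation of location information from a position image and map information can be sketched as follows. The sketch assumes a depth image whose pixel value is z + 1 with 0 marking an empty pixel, and reduces the map information to a single three-dimensional offset of the projected region; both conventions and the function name `depth_image_to_points` are hypothetical simplifications.

```python
def depth_image_to_points(image, offset=(0, 0, 0), empty=0):
    """Rebuild 3D points from a depth image.

    `offset` stands in for the map information: it gives the 3D position
    of the image's pixel (0, 0) at depth 0 (an assumption of this sketch).
    """
    ox, oy, oz = offset
    points = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value != empty:
                points.append((x + ox, y + oy, value - 1 + oz))
    return points
```

Attribute information could be recovered in the same way by reading the attribute image at the same pixel coordinates.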
[0118] The second decoding unit 4660 uses the additional information necessary for decoding during the decoding process and outputs the additional information necessary for the application to the outside.
[0119] The following describes the challenges in the PCC encoding scheme. Figure 13 is a diagram showing the protocol stack involved in PCC encoded data. Figure 13 shows an example in which data from other media, such as video (e.g., HEVC) or audio, is multiplexed onto PCC encoded data and then transmitted or stored.
[0120] Multiplexing schemes and file formats have the function of multiplexing, transmitting, or storing various encoded data. In order to transmit or store encoded data, the encoded data must be converted into the format of the multiplexing scheme. For example, HEVC specifies a technique in which encoded data is stored in a data structure called a NAL unit, and the NAL unit is stored in ISOBMFF.
[0121] On the other hand, while two encoding methods, the first encoding method (Codec1) and the second encoding method (Codec2), are currently being considered for encoding point cloud data, neither the structure of the encoded data nor the method for storing the encoded data in a system format has been defined. As a result, there is a problem in that MUX processing (multiplexing) in the encoding unit, transmission, and storage cannot be performed as things stand.
[0122] In the following text, unless a specific encoding scheme is named, the description applies to either the first encoding scheme or the second encoding scheme.
[0123] The following describes the method for defining NAL units according to this embodiment. For example, in conventional codecs such as HEVC, one NAL unit format is defined for one codec. However, there has been no method to support a format in which two codecs (hereinafter referred to as PCC codecs), namely the first encoding method and the second encoding method, coexist, as in PCC.
[0124] In this embodiment, a common format for PCC codecs is defined as a NAL unit, and further, identifiers for NAL units that depend on the PCC codec are defined. Figure 14 shows the protocol stack in this case. Figures 15 to 17 show examples of codec-common NAL unit formats. Figure 15 shows an example of the syntax of a Common PCC NAL Unit. Figure 16 shows an example of the syntax of a Common PCC NAL Unit Header. Figure 17 shows an example of the semantics of pcc_codec_type. Figure 18 is a diagram showing an example of a codec-dependent NAL unit type definition, and shows an example of the semantics of pcc_nal_unit_type.
[0125] A common NAL unit format is defined for PCC codecs. A NAL unit (pcc_nal_unit) includes a header (pcc_nal_unit_header), a payload (pcc_nal_unit_payload), and trailing bits (trailing_bits). The same format is used regardless of whether the data is from the first or second encoding method codec.
[0126] The NAL unit header (pcc_nal_unit_header) stores the codec type (pcc_codec_type) and the NAL unit type (pcc_nal_unit_type). The codec type indicates whether the PCC codec of the encoded data stored in the NAL unit is the first encoding method or the second encoding method.
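A common NAL unit of this kind can be sketched as a byte string with a small header in front of the payload. The field widths (one byte each), the numeric values assigned to the two codecs, and the function names are assumptions made for this example, since the syntax of Figures 15 to 17 is not reproduced here; trailing bits are omitted because the sketch is already byte-aligned.

```python
import struct

PCC_CODEC_1 = 0  # assumed value for the first encoding method
PCC_CODEC_2 = 1  # assumed value for the second encoding method


def pack_pcc_nal_unit(pcc_codec_type, pcc_nal_unit_type, payload):
    """Build a NAL unit: 2-byte header (assumed layout) followed by the payload."""
    header = struct.pack("BB", pcc_codec_type, pcc_nal_unit_type)
    return header + payload


def parse_pcc_nal_unit(data):
    """Split a NAL unit into (pcc_codec_type, pcc_nal_unit_type, payload)."""
    pcc_codec_type, pcc_nal_unit_type = struct.unpack_from("BB", data, 0)
    return pcc_codec_type, pcc_nal_unit_type, data[2:]
```

Because the header layout is the same for both codecs, a system-layer component can read the codec type without understanding the payload.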
[0127] The NAL unit type indicates the type of NAL unit that depends on the codec, and a type is defined for each codec. If the codec type is the first encoding method, the NAL unit type indicates the NAL unit type defined for the first encoding method. If the codec type is the second encoding method, the NAL unit type indicates the NAL unit type defined for the second encoding method. In other words, the same value is associated with different meanings for the NAL unit type defined for the first encoding method and the NAL unit type defined for the second encoding method.
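The codec-dependent interpretation of the NAL unit type can be sketched as a two-level lookup. The numeric values and the type names in the table below are purely hypothetical placeholders (the semantics of Figure 18 are not reproduced here); the point of the sketch is only that the same pcc_nal_unit_type value resolves to different meanings depending on pcc_codec_type.

```python
# Hypothetical tables: {pcc_codec_type: {pcc_nal_unit_type: meaning}}.
NAL_UNIT_TYPE_NAMES = {
    0: {0: "geometry", 1: "attribute", 2: "metadata"},              # first encoding method
    1: {0: "geometry image", 1: "attribute image", 2: "metadata"},  # second encoding method
}


def describe_nal_unit(pcc_codec_type, pcc_nal_unit_type):
    """Resolve the NAL unit type within the table of the indicated codec."""
    return NAL_UNIT_TYPE_NAMES[pcc_codec_type][pcc_nal_unit_type]
```

With these placeholder tables, type value 0 resolves to "geometry" for the first encoding method and to "geometry image" for the second.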
[0128] Furthermore, the functionality of the codec type may be merged into the NAL unit type in the header. For example, some of the information in the NAL unit type may be used to indicate the codec type.
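One way to merge the codec type into the NAL unit type is to reserve part of the type field for it. The following sketch assumes an 8-bit field whose most significant bit carries the codec type and whose remaining 7 bits carry the codec-dependent type; this split and the function names are assumptions for illustration only.

```python
def merge_codec_into_type(codec_type, nal_unit_type):
    """Pack a 1-bit codec type and a 7-bit NAL unit type into one byte value."""
    return ((codec_type & 0x01) << 7) | (nal_unit_type & 0x7F)


def split_merged_type(value):
    """Recover (codec_type, nal_unit_type) from the merged byte value."""
    return value >> 7, value & 0x7F
```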
[0129] Next, the encoding process according to this embodiment will be described. Figure 19 is a flowchart of the encoding process according to this embodiment. The process in this figure shows the processing of the first encoding unit 4630 or the second encoding unit 4650 when using the above definition. In the following, the first encoding unit 4630 and the second encoding unit 4650 will not be distinguished and will also be referred to as the encoding unit 4613. Furthermore, the processing in this figure is mainly performed by the multiplexing unit 4634 shown in Figure 6 or the multiplexing unit 4656 shown in Figure 10.
[0130] The processing shown in the figure illustrates an example of encoding PCC data using either the first or second encoding method, and it is assumed that the choice of which PCC codec to use is known. For example, the choice of which PCC codec to use may be specified by the user or an external device.
[0131] First, the encoding unit 4613 encodes the PCC data using either the first encoding method or the second encoding method codec (S4601).
[0132] If the codec used is the second encoding method (second encoding method in S4602), the encoding unit 4613 sets the pcc_codec_type in the NAL unit header to a value indicating that the data in the NAL unit payload is data encoded using the second encoding method (S4603). The encoding unit 4613 also sets the pcc_nal_unit_type in the NAL unit header to the identifier of the NAL unit for the second encoding method (S4604). Then, the encoding unit 4613 generates a NAL unit having the set NAL unit header and containing encoded data in the payload. Finally, the encoding unit 4613 transmits the generated NAL unit (S4605).
[0133] On the other hand, if the codec used is the first encoding method (first encoding method in S4602), the encoding unit 4613 sets the pcc_codec_type in the NAL unit header to a value indicating that the data in the NAL unit payload is data encoded using the first encoding method (S4606). The encoding unit 4613 also sets the pcc_nal_unit_type in the NAL unit header to the identifier of the NAL unit for the first encoding method (S4607). Then, the encoding unit 4613 generates a NAL unit having the set NAL unit header and containing encoded data in the payload. Finally, the encoding unit 4613 transmits the generated NAL unit (S4605).
[0134] Furthermore, in steps S4603 and S4606, if the function of pcc_codec_type is included in pcc_nal_unit_type, the encoding unit 4613 may indicate in pcc_nal_unit_type whether the NAL unit is for the first encoding method or the second encoding method.
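The header-setting branch of this flow (S4602 to S4607) can be sketched as follows. The one-byte fields, the values 0 and 1 for the two codecs, the string labels "first" and "second", and the function name `build_nal_unit` are assumptions of this example; the encoding itself (S4601) and the transmission (S4605) are outside the sketch.

```python
def build_nal_unit(codec, nal_unit_type, encoded_data):
    """Wrap already-encoded PCC data in a NAL unit with a common header."""
    # S4602: branch on the codec that produced `encoded_data`.
    if codec == "second":
        pcc_codec_type = 1  # S4603 (assumed value)
    elif codec == "first":
        pcc_codec_type = 0  # S4606 (assumed value)
    else:
        raise ValueError(f"unknown codec: {codec!r}")
    # S4604 / S4607: the type identifier is the one defined for that codec.
    header = bytes([pcc_codec_type, nal_unit_type])
    return header + encoded_data  # the caller transmits this unit (S4605)
```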
[0135] Next, the decoding process by the first decoding unit 4640 and the second decoding unit 4660 according to this embodiment will be described. Figure 20 is a flowchart showing the decoding process by the second decoding unit 4660. The processing shown in the figure is mainly performed by the demultiplexing unit 4661 shown in Figure 12.
[0136] The processing shown in the figure illustrates an example in which the PCC data has been encoded using either the second encoding method or the first encoding method. In this method, the demultiplexing unit 4661 included in the second decoding unit 4660 can identify the codec type of the NAL unit by referring to the information contained in the NAL unit header. Therefore, the demultiplexing unit 4661 can output the necessary information to the video decoding unit 4662 according to the codec type.
[0137] First, the second decoding unit 4660 receives a NAL unit (S4611). For example, this NAL unit was generated by the processing in the encoding unit 4613 described above. In other words, the header of this NAL unit includes pcc_codec_type and pcc_nal_unit_type.
[0138] Next, the second decoding unit 4660 determines whether the pcc_codec_type included in the NAL unit header indicates the first encoding method or the second encoding method (S4612).
[0139] If pcc_codec_type indicates the second encoding method (second encoding method in S4612), the second decoding unit 4660 determines that the data contained in the NAL unit payload is data encoded using the second encoding method (S4613). The second decoding unit 4660 then identifies the data, treating the pcc_nal_unit_type contained in the NAL unit header as the NAL unit identifier for the second encoding method (S4614). The second decoding unit 4660 then decodes the PCC data using the decoding process of the second encoding method (S4615).
[0140] On the other hand, if pcc_codec_type indicates the first encoding method (first encoding method in S4612), the second decoding unit 4660 determines that the data contained in the payload of the NAL unit is data encoded using the first encoding method (S4616). In this case, the second decoding unit 4660 does not process the NAL unit (S4617).
[0141] In step S4612, if the functionality of pcc_codec_type is included in pcc_nal_unit_type, the second decoding unit 4660 may refer to pcc_nal_unit_type to determine whether the codec used for the data included in the NAL unit is the first encoding method or the second encoding method.
[0142] Figure 21 is a flowchart showing the decoding process performed by the first decoding unit 4640. The processing shown in this figure is mainly carried out by the demultiplexing unit 4641 shown in Figure 8.
[0143] The processing shown in the figure illustrates an example in which the PCC data has been encoded using either the first encoding method or the second encoding method. In this method, the demultiplexing unit 4641 included in the first decoding unit 4640 can identify the codec type of the NAL unit by referring to the information contained in the NAL unit header. Therefore, the demultiplexing unit 4641 can output the necessary information according to the codec type to the location information decoding unit 4642 and the attribute information decoding unit 4643.
[0144] First, the first decoding unit 4640 receives a NAL unit (S4621). For example, this NAL unit was generated by the processing in the encoding unit 4613 described above. In other words, the header of this NAL unit includes pcc_codec_type and pcc_nal_unit_type.
[0145] Next, the first decoding unit 4640 determines whether the pcc_codec_type included in the NAL unit header indicates the first encoding method or the second encoding method (S4622).
[0146] If pcc_codec_type indicates the second encoding method (second encoding method in S4622), the first decoding unit 4640 determines that the data contained in the NAL unit payload is data encoded using the second encoding method (S4623). In this case, the first decoding unit 4640 does not process the NAL unit (S4624).
[0147] On the other hand, if pcc_codec_type indicates the first encoding method (first encoding method in S4622), the first decoding unit 4640 determines that the data contained in the NAL unit payload is data encoded using the first encoding method (S4625). The first decoding unit 4640 then identifies the data, treating the pcc_nal_unit_type contained in the NAL unit header as the NAL unit identifier for the first encoding method (S4626). The first decoding unit 4640 then decodes the PCC data using the decoding process of the first encoding method (S4627).
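The decoding flows of Figures 20 and 21 are mirror images of each other: each decoding unit processes NAL units of its own codec and leaves the others unprocessed. A minimal sketch covering both is shown below. The two-byte header layout, the numeric codec values, and the names `handle_nal_unit`, `supported_codec_type`, and `decode` are assumptions made for this example.

```python
def handle_nal_unit(nal_unit, supported_codec_type, decode):
    """Decode a NAL unit only if its codec type matches this decoding unit.

    `decode` is a callable taking (pcc_nal_unit_type, payload); it stands in
    for the codec-specific decoding process.
    """
    pcc_codec_type = nal_unit[0]  # S4612 / S4622: read the codec type
    if pcc_codec_type != supported_codec_type:
        return None  # S4617 / S4624: the NAL unit is not processed
    pcc_nal_unit_type = nal_unit[1]  # S4614 / S4626: codec-specific type
    return decode(pcc_nal_unit_type, nal_unit[2:])  # S4615 / S4627
```

With an assumed codec value of 1 for the second encoding method, a second decoding unit would call `handle_nal_unit(unit, 1, decode)`, and a first decoding unit would pass 0 instead.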
[0148] (Embodiment 2) This embodiment describes an alternative method for defining NAL units. In this embodiment, different formats are defined for each PCC codec as NAL units. Furthermore, identifiers for NAL units are defined independently for each PCC codec.
[0149] Figure 22 shows the protocol stack in this case. Figure 23 shows an example syntax for a NAL unit (codec2_nal_unit) for Codec 2. Figure 24 shows an example syntax for a NAL unit header (codec2_nal_unit_header) for Codec 2. Figure 25 shows an example semantics for codec2_nal_unit_type.
[0150] Figure 26 shows an example of the syntax for a NAL unit (codec1_nal_unit) for codec 1. Figure 27 shows an example of the syntax for a NAL unit header (codec1_nal_unit_header) for codec 1. Figure 28 shows an example of the semantics for codec1_nal_unit_type.
[0151] A separate NAL unit format is defined for each PCC codec. A NAL unit (codec1_nal_unit, codec2_nal_unit) includes a header (codec1_nal_unit_header, codec2_nal_unit_header), a payload (codec1_nal_unit_payload, codec2_nal_unit_payload), and trailing bits (trailing_bits). The NAL unit for the first encoding method (codec1_nal_unit) and the NAL unit for the second encoding method (codec2_nal_unit) may have the same configuration or different configurations. The sizes of the NAL unit for the first encoding method and the NAL unit for the second encoding method may also differ.
[0152] Data encoded using the first encoding method is stored in the NAL unit for the first encoding method. Data encoded using the second encoding method is stored in the NAL unit for the second encoding method.
[0153] The NAL unit headers (codec1_nal_unit_header, codec2_nal_unit_header) store the NAL unit type (codec1_nal_unit_type, codec2_nal_unit_type). The NAL unit type is independent for each codec, and a type is defined for each codec. In other words, the NAL unit for the first encoding method contains the NAL unit type defined for the first encoding method. The NAL unit for the second encoding method contains the NAL unit type defined for the second encoding method.
[0154] By using this method, the first encoding method and the second encoding method can be treated as different codecs.
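Codec-specific NAL unit formats can be sketched as two independent packing functions. To illustrate that the two formats may differ in configuration and size, the sketch gives the first encoding method a one-byte header and the second encoding method a two-byte header; these widths and the function names are assumptions for illustration and do not reflect the syntax of Figures 23 to 28.

```python
import struct


def pack_codec1_nal_unit(codec1_nal_unit_type, payload):
    """NAL unit for the first encoding method: assumed 1-byte header."""
    return struct.pack("B", codec1_nal_unit_type) + payload


def pack_codec2_nal_unit(codec2_nal_unit_type, payload):
    """NAL unit for the second encoding method: assumed 2-byte big-endian header."""
    return struct.pack(">H", codec2_nal_unit_type) + payload
```

Neither header carries a codec type, so a receiver cannot tell the two formats apart from the NAL unit alone; the codec has to be known from elsewhere.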
[0155] Next, the encoding process according to this embodiment will be described. Figure 29 is a flowchart of the encoding process according to this embodiment. The process shown in the figure represents the processing of the first encoding unit 4630 or the second encoding unit 4650 when the above definition is used. Furthermore, the processing shown in the figure is mainly performed by the multiplexing unit 4634 shown in Figure 6 or the multiplexing unit 4656 shown in Figure 10.
[0156] The processing shown in the figure illustrates an example of encoding PCC data using either the first or second encoding method, and it is assumed that the choice of which PCC codec to use is known. For example, the choice of which PCC codec to use may be specified by the user or an external device.
[0157] First, the encoding unit 4613 encodes the PCC data using either the first encoding method or the second encoding method codec (S4631).
[0158] If the codec used is the second encoding method (second encoding method in S4632), the encoding unit 4613 generates a NAL unit in the NAL unit format for the second encoding method (S4633). Next, the encoding unit 4613 sets the identifier of the NAL unit for the second encoding method in the codec2_nal_unit_type included in the NAL unit header (S4634). Then, the encoding unit 4613 generates a NAL unit having the set NAL unit header and containing encoded data in the payload. Finally, the encoding unit 4613 transmits the generated NAL unit (S4635).
[0159] On the other hand, if the codec used is the first encoding method (first encoding method in S4632), the encoding unit 4613 generates a NAL unit in the NAL unit format for the first encoding method (S4636). Next, the encoding unit 4613 sets the identifier of the NAL unit for the first encoding method in the codec1_nal_unit_type of the NAL unit header (S4637). Then, the encoding unit 4613 generates a NAL unit having the set NAL unit header and containing encoded data in the payload. Finally, the encoding unit 4613 transmits the generated NAL unit (S4635).
[0160] Next, the decoding process according to this embodiment will be described. Figure 30 is a flowchart of the decoding process according to this embodiment. The process in this figure shows the processing of the first decoding unit 4640 or the second decoding unit 4660 when the above definition is used. In the following, the first decoding unit 4640 or the second decoding unit 4660 will not be distinguished and will also be referred to as the decoding unit 4624. Furthermore, the processing in this figure is mainly performed by the demultiplexing unit 4641 shown in Figure 8 or the demultiplexing unit 4661 shown in Figure 12.
[0161] The processing shown in the figure illustrates an example in which the PCC data has been encoded using either the first encoding method or the second encoding method, and it is assumed that the PCC codec used for encoding is already known. For example, information indicating the codec being used is included in the transmission signal, multiplexed data, or encoded data, and the decoding unit 4624 determines the codec being used by referring to this information. Alternatively, the decoding unit 4624 may determine the codec being used based on a signal acquired separately from these signals.
[0162] If the codec being used is the second encoding method (second encoding method in S4641), the decoding unit 4624 receives a NAL unit in the format for the second encoding method (S4642). Next, the decoding unit 4624 identifies the data using the NAL unit format and codec2_nal_unit_type for the second encoding method, assuming that the NAL unit is for the second encoding method (S4643). Next, the decoding unit 4624 decodes the PCC data using the decoding process for the second encoding method (S4644).
[0163] On the other hand, if the codec being used is the first encoding method (first encoding method in S4641), the decoding unit 4624 receives a NAL unit in the format for the first encoding method (S4645). Next, the decoding unit 4624 identifies the data using the NAL unit format and codec1_nal_unit_type for the first encoding method, assuming that the NAL unit is for the first encoding method (S4646). Next, the decoding unit 4624 decodes the PCC data using the decoding process for the first encoding method (S4647).
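Because the NAL unit itself does not identify its codec in this embodiment, the parser must be told which format to apply. The following sketch takes the codec as an argument supplied from outside the NAL unit. The header widths (one byte for the first encoding method, two bytes for the second), the string labels, and the function name `parse_nal_unit` are assumptions made for this example.

```python
def parse_nal_unit(data, codec):
    """Parse a NAL unit whose codec is known from outside the unit (S4641)."""
    if codec == "codec2":
        # S4642, S4643: apply the format for the second encoding method.
        return {
            "codec2_nal_unit_type": int.from_bytes(data[:2], "big"),
            "payload": data[2:],
        }
    if codec == "codec1":
        # S4645, S4646: apply the format for the first encoding method.
        return {"codec1_nal_unit_type": data[0], "payload": data[1:]}
    raise ValueError(f"unknown codec: {codec!r}")
```

The same bytes parse differently under the two formats, which is why the codec must be signaled by the transmission signal, the multiplexed data, or some other means.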
[0164] (Embodiment 3) This embodiment describes an alternative method for defining NAL units. In this embodiment, a common format for PCC codecs is defined as a NAL unit. Furthermore, an identifier for the common NAL unit for PCC codecs is defined.
[0165] Figure 31 shows the protocol stack in this case. Figures 32 to 34 show examples of the codec common NAL unit format. Figure 32 shows an example syntax of the Common PCC NAL Unit. Figure 33 shows an example syntax of the Common PCC NAL Unit Header. Figure 34 shows an example semantics of pcc_nal_unit_type.
[0166] A common NAL unit format is defined for PCC codecs. A NAL unit (pcc_nal_unit) includes a header (pcc_nal_unit_header), a payload (pcc_nal_unit_payload), and trailing bits (trailing_bits). The same format is used regardless of whether the data is from the first or second encoding method codec.
[0167] The NAL unit header (pcc_nal_unit_header) stores the NAL unit type (pcc_nal_unit_type). The NAL unit type is defined in common across all codecs. In other words, both the NAL units for the first encoding method and the NAL units for the second encoding method contain a commonly defined NAL unit type. In the example shown in Figure 34, PCC DataA is the encoded data for codec 1, PCC DataB is the encoded data for codec 2, PCC MetaDataA is the additional information for codec 1, and PCC MetaDataB is the additional information for codec 2.
[0168] By using this method, the first encoding method and the second encoding method can be treated as the same codec.
[0169] Next, the encoding process according to this embodiment will be described. Figure 35 is a flowchart of the encoding process according to this embodiment. The process shown in the figure represents the processing of the first encoding unit 4630 or the second encoding unit 4650 when the above definition is used. Furthermore, the processing shown in the figure is mainly performed by the multiplexing unit 4634 shown in Figure 6 or the multiplexing unit 4656 shown in Figure 10.
[0170] The processing shown in the figure illustrates an example of encoding PCC data using either the second encoding method or the first encoding method, and it is assumed that the choice of which PCC codec to use is known. For example, the choice of which PCC codec to use may be specified by the user or an external device.
[0171] First, the encoding unit 4613 encodes the PCC data using either the second encoding method or the first encoding method codec (S4651). Next, the encoding unit 4613 generates NAL units in the PCC common NAL unit format (S4652).
[0172] Next, the encoding unit 4613 sets the identifier of a PCC common NAL unit in the pcc_nal_unit_type included in the NAL unit header (S4653). Then, the encoding unit 4613 transmits a NAL unit having the set NAL unit header and containing encoded data in the payload (S4654).
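As a rough illustration of S4651 to S4653, the following Python sketch wraps encoded data in a common-format unit and sets a common type value. The numeric values of PCC_DATA_A and PCC_DATA_B, the PccNalUnit class, and the stand-in "encoding" (prefixing a tag) are all assumptions made for this example; the actual codecs and field widths are not specified here.

```python
from dataclasses import dataclass

# Hypothetical values for the common pcc_nal_unit_type.
PCC_DATA_A = 0  # encoded data for codec 1
PCC_DATA_B = 1  # encoded data for codec 2


@dataclass
class PccNalUnit:
    pcc_nal_unit_type: int
    payload: bytes


def encode_to_common_nal(points: bytes, codec: int) -> PccNalUnit:
    # S4651: encode with the chosen codec (stand-in: prefix a tag).
    if codec == 1:
        encoded, unit_type = b"C1" + points, PCC_DATA_A
    elif codec == 2:
        encoded, unit_type = b"C2" + points, PCC_DATA_B
    else:
        raise ValueError(f"unknown codec {codec}")
    # S4652 to S4653: build the common-format unit and set its type.
    return PccNalUnit(pcc_nal_unit_type=unit_type, payload=encoded)
```

Transmission (S4654) is left to the caller in this sketch.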
[0173] Next, the decoding process according to this embodiment will be described. Figure 36 is a flowchart of the decoding process according to this embodiment. The process shown in the figure represents the processing of the first decoding unit 4640 or the second decoding unit 4660 when the above definition is used. Furthermore, the processing shown in the figure is mainly performed by the demultiplexing unit 4641 shown in Figure 8 or the demultiplexing unit 4661 shown in Figure 12.
[0174] The processing shown in the figure illustrates an example in which the PCC data has been encoded using either the second encoding method or the first encoding method.
[0175] First, the decoding unit 4624 determines the codec used to encode the data contained in the NAL unit (S4661). For example, the decoding unit 4624 determines the codec being used by referring to the pcc_nal_unit_type contained in the NAL unit header.
[0176] If the codec being used is the second encoding method (second encoding method in S4661), the decoding unit 4624 receives NAL units in the PCC common format (S4662). Next, the decoding unit 4624 identifies the data using the common NAL unit format and the common pcc_nal_unit_type, assuming that the NAL units are common (S4663). Next, the decoding unit 4624 decodes the PCC data using the decoding process of the second encoding method (S4664).
[0177] On the other hand, if the codec being used is the first encoding method (first encoding method in S4661), the decoding unit 4624 receives NAL units in a common PCC format (S4665). Next, the decoding unit 4624 identifies the data using a common NAL unit format and a common pcc_nal_unit_type, assuming that the NAL units are common (S4666). Next, the decoding unit 4624 decodes the PCC data using the decoding process of the first encoding method (S4667).
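The decoding flow of S4661 to S4667 can be sketched as a lookup from the common type to a decoding routine. The type values and the two stand-in decoders below are hypothetical and only illustrate the dispatch; they do not implement either codec.

```python
# Hypothetical values of the common pcc_nal_unit_type (data types only).
PCC_DATA_A = 0
PCC_DATA_B = 1


def decode_codec1(payload: bytes) -> str:
    return "codec1:" + payload.hex()  # stand-in for the first decoding process


def decode_codec2(payload: bytes) -> str:
    return "codec2:" + payload.hex()  # stand-in for the second decoding process


DECODERS = {PCC_DATA_A: decode_codec1, PCC_DATA_B: decode_codec2}


def decode_common_nal(unit_type: int, payload: bytes) -> str:
    # S4661: the common type alone identifies the codec of the payload.
    try:
        decoder = DECODERS[unit_type]
    except KeyError:
        raise ValueError(f"unsupported pcc_nal_unit_type {unit_type}") from None
    return decoder(payload)
```

Because the type definition is shared, the header is parsed the same way on both branches and only the final decoding step differs.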
[0178] The following describes modifications of Embodiments 1 to 3 described above. The following methods may also be used as alternative methods for indicating the PCC codec type.
[0179] Embodiments 1, 2, and 3 described the case where two codecs, a first encoding method and a second encoding method, are mixed. However, the above method can also be applied when there are three or more PCC codecs.
[0180] Furthermore, in Embodiments 1 and 3, the identification information of the PCC codec (pcc_codec_type in Embodiment 1 and pcc_nal_unit_type in Embodiment 3) was included in the NAL unit header, but the identification information of the codec may be stored in another location.
[0181] Furthermore, the first and second encoding methods are not limited to the examples above and may be any codec. For example, the first and second encoding methods may be multiple codecs subdivided from GPCC, or multiple codecs subdivided from VPCC. For example, both the first and second encoding methods may be VPCC, but the video encoding schemes used may be different. The video encoding scheme may be, for example, AVC or HEVC. Also, either or both of the first and second encoding methods may be encoding methods that include other encoding schemes such as video, audio, and text applications.
[0182] For example, codec identification information may be included in the control information contained in the PCC encoded stream. Here, control information includes, for example, parameter sets or metadata such as SEI (Supplemental Enhancement Information).
[0183] Figure 37 is a flowchart of the encoding process performed by the encoding unit 4613 in this case. First, the encoding unit 4613 encodes the PCC data (S4671) and writes the identification information of the PCC codec at a predetermined location (e.g., parameter set) within the encoded data (S4672). Next, the encoding unit 4613 generates a NAL unit containing the encoded data and transmits the generated NAL unit (S4673).
[0184] Furthermore, the identification information of the PCC codec may be defined as "profile," and the identification information of the PCC codec may be indicated in the metadata. Also, if the same codec is used throughout the entire sequence, the PCC codec identification information may be included in the sequence parameter set. Also, if each PCC frame is encoded with a different codec, the PCC codec identification information may be included in the parameter set that describes the information for each frame. For example, if different codecs are used for each piece of PCC data, such as when location information and attribute information use different codecs, the PCC codec identification information may be included in the parameter set that describes the information for each piece of data. In other words, information indicating the codec for location information may be included in the control information (parameter set, etc.) for location information, and information indicating the codec for attribute information may be included in the control information (parameter set, etc.) for attribute information.
[0185] The codec identification information may be stored in any of the above locations, or in multiple locations. For example, the codec identification information may be stored both in the encoded stream and in the NAL unit header. Furthermore, if the codec identification information is stored in multiple locations, the same information may be stored in each location, or different information may be stored there. Different information could be, for example, information indicating GPCC or VPCC, and information indicating one of the multiple codecs that are subdivisions of GPCC or VPCC.
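The case where different information is stored in two locations can be pictured as a coarse identifier in the NAL unit header combined with a finer identifier in the encoded stream. The sketch below is only an assumption about how such a pair might be resolved; the numeric codes and the variant names are hypothetical, with the AVC and HEVC variants borrowed from the earlier example of VPCC using different video encoding schemes.

```python
# Hypothetical header codes for the coarse codec family.
FAMILIES = {0: "GPCC", 1: "VPCC"}

# Hypothetical parameter-set codes for subdivisions of each family.
VARIANTS = {
    "GPCC": {0: "GPCC-variant-0", 1: "GPCC-variant-1"},
    "VPCC": {0: "VPCC+AVC", 1: "VPCC+HEVC"},
}


def resolve_codec(header_code: int, parameter_set_code: int) -> str:
    """Combine the coarse id (header) with the fine id (parameter set)."""
    if header_code not in FAMILIES:
        raise ValueError(f"unknown family code {header_code}")
    family = FAMILIES[header_code]
    variants = VARIANTS[family]
    if parameter_set_code not in variants:
        raise ValueError(f"unknown {family} variant {parameter_set_code}")
    return variants[parameter_set_code]
```

With this split, a receiver that only needs the family can stop at the header, while a decoder reads the parameter set for the exact variant.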
[0186] The demultiplexing unit 4641 or 4661 included in the decoding unit 4624 can determine whether the data contained in the payload of a NAL unit is encoded using the first encoding method or the second encoding method by analyzing the description in the parameter set if the NAL unit includes a parameter set. This allows the decoding unit 4624 to quickly filter out NAL units that are not needed for decoding.
[0187] Figure 38 is a flowchart of the decoding process performed by the decoding unit 4624 in this case. First, the decoding unit 4624 receives the NAL unit (S4675) and identifies predetermined data (e.g., the parameter set) containing the identification information of the PCC codec using the pcc_nal_unit_type included in the NAL unit header (S4676). Next, the decoding unit 4624 identifies the PCC codec indicated in the predetermined data (e.g., the parameter set) by analyzing the predetermined data (S4677). Next, the decoding unit 4624 decodes the encoded data using the identified PCC codec (S4678).
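The flow of S4675 to S4678 can be sketched as follows. The sketch assumes that a parameter set arrives as its own NAL unit, that its first byte is the codec identifier, and that the two type values shown exist; none of these details is fixed by the text.

```python
# Hypothetical pcc_nal_unit_type values.
PARAMETER_SET = 0
CODED_DATA = 1


def decode_stream(nal_units, decoders):
    """Decode a sequence of (pcc_nal_unit_type, payload) pairs.

    decoders maps a codec identifier to a callable taking the payload.
    """
    codec = None
    decoded = []
    for unit_type, payload in nal_units:      # S4675: receive a NAL unit
        if unit_type == PARAMETER_SET:        # S4676: locate the parameter set
            if not payload:
                raise ValueError("empty parameter set")
            codec = payload[0]                # S4677: assumed codec id byte
        elif unit_type == CODED_DATA:
            if codec is None:
                raise ValueError("coded data before any parameter set")
            decoded.append(decoders[codec](payload))  # S4678
    return decoded
```

A later parameter set replaces the current codec, which matches the case where the codec changes from frame to frame.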
[0188] Furthermore, although the above example shows a case where the encoded stream is stored in a NAL unit, another predetermined type of unit may be used instead of a NAL unit.
[0189] (Embodiment 4) In this embodiment, an encoding unit 4670 having the functions of both the first encoding unit 4630 and the second encoding unit 4650 described above, and a decoding unit 4680 having the functions of both the first decoding unit 4640 and the second decoding unit 4660 will be described.
[0190] Figure 39 is a block diagram of the encoding unit 4670 according to this embodiment. This encoding unit 4670 includes the first encoding unit 4630 and the second encoding unit 4650 described above, and a multiplexing unit 4671. The multiplexing unit 4671 multiplexes the encoded data generated by the first encoding unit 4630 and the encoded data generated by the second encoding unit 4650, and outputs the obtained encoded data.
[0191] Figure 40 is a block diagram of the decoding unit 4680 according to this embodiment. This decoding unit 4680 includes the first decoding unit 4640 and the second decoding unit 4660 described above, and a demultiplexing unit 4681. The demultiplexing unit 4681 extracts encoded data using the first encoding method and encoded data using the second encoding method from the input encoded data. The demultiplexing unit 4681 outputs the encoded data using the first encoding method to the first decoding unit 4640 and outputs the encoded data using the second encoding method to the second decoding unit 4660.
[0192] With the above configuration, the encoding unit 4670 can encode point cloud data by selectively using the first encoding method and the second encoding method. Furthermore, the decoding unit 4680 can decode encoded data encoded using the first encoding method, encoded data encoded using the second encoding method, and encoded data encoded using both the first and second encoding methods.
[0193] For example, the encoding unit 4670 may switch the encoding method (first encoding method and second encoding method) on a point cloud data basis or on a frame basis. Alternatively, the encoding unit 4670 may switch the encoding method on an encodable unit basis.
[0194] The encoding unit 4670 generates encoded data (encoded stream) including PCC codec identification information, as described in Embodiment 1 or Embodiment 3 above.
[0195] The demultiplexing unit 4681 included in the decoding unit 4680 identifies the data using, for example, the identification information of the PCC codec described in Embodiment 1 or Embodiment 3. If the data is data encoded using the first encoding method, the demultiplexing unit 4681 outputs the data to the first decoding unit 4640, and if the data is data encoded using the second encoding method, it outputs the data to the second decoding unit 4660.
[0196] Furthermore, the encoding unit 4670 may also send control information indicating whether both encoding methods were used or only one of the encoding methods was used, in addition to the identification information of the PCC codec.
[0197] Next, the encoding process according to this embodiment will be described. Figure 41 is a flowchart of the encoding process according to this embodiment. By using the PCC codec identification information described in Embodiment 1, Embodiment 2, Embodiment 3, and the modified examples, encoding processing compatible with multiple codecs becomes possible. Although Figure 41 shows an example using the method of Embodiment 1, similar processing can be applied to other methods.
[0198] First, the encoding unit 4670 encodes the PCC data using either the first encoding method, the second encoding method, or both of these codecs (S4681).
[0199] If the codec used is the second encoding method (second encoding method in S4682), the encoding unit 4670 sets the pcc_codec_type in the NAL unit header to a value indicating that the data in the NAL unit payload is data encoded using the second encoding method (S4683). Next, the encoding unit 4670 sets the identifier of the NAL unit for the second encoding method in the pcc_nal_unit_type of the NAL unit header (S4684). Then, the encoding unit 4670 generates an NAL unit having the set NAL unit header and containing encoded data in the payload. Finally, the encoding unit 4670 transmits the generated NAL unit (S4685).
[0200] On the other hand, if the codec used is the first encoding method (first encoding method in S4682), the encoding unit 4670 sets the pcc_codec_type included in the NAL unit header to a value indicating that the data included in the NAL unit payload is data encoded using the first encoding method (S4686). Next, the encoding unit 4670 sets the identifier of the NAL unit for the first encoding method in the pcc_nal_unit_type included in the NAL unit header (S4687). Next, the encoding unit 4670 generates an NAL unit having the set NAL unit header and containing encoded data in the payload. Then, the encoding unit 4670 transmits the generated NAL unit (S4685).
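A minimal sketch of S4683 to S4687 follows: the header carries both pcc_codec_type and pcc_nal_unit_type, and the payload carries the encoded data. The two-byte header layout and the values of CODEC1 and CODEC2 are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical pcc_codec_type values.
CODEC1 = 0
CODEC2 = 1


@dataclass
class NalUnit:
    pcc_codec_type: int     # which codec produced the payload
    pcc_nal_unit_type: int  # data type, defined separately per codec
    payload: bytes


def make_nal_unit(codec: int, unit_type: int, encoded: bytes) -> NalUnit:
    if codec not in (CODEC1, CODEC2):
        raise ValueError(f"unknown codec {codec}")
    if not 0 <= unit_type <= 0xFF:
        raise ValueError("unit type does not fit the assumed one-byte field")
    return NalUnit(codec, unit_type, encoded)


def pack(unit: NalUnit) -> bytes:
    # Assumed layout: one byte codec type, one byte unit type, then payload.
    return bytes([unit.pcc_codec_type, unit.pcc_nal_unit_type]) + unit.payload
```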
[0201] Next, the decoding process according to this embodiment will be described. Figure 42 is a flowchart of the decoding process according to this embodiment. By using the PCC codec identification information described in Embodiment 1, Embodiment 2, Embodiment 3, and the modified examples, decoding processing compatible with multiple codecs becomes possible. Although Figure 42 shows an example using the method of Embodiment 1, similar processing can be applied to other methods.
[0202] First, the decoding unit 4680 receives a NAL unit (S4691). For example, this NAL unit is generated by the processing in the encoding unit 4670 described above.
[0203] Next, the decoding unit 4680 determines whether the pcc_codec_type included in the NAL unit header indicates the first encoding method or the second encoding method (S4692).
[0204] If pcc_codec_type indicates the second encoding method (second encoding method in S4692), the decoding unit 4680 determines that the data contained in the NAL unit payload is data encoded using the second encoding method (S4693). The second decoding unit 4660 then identifies the data, assuming that pcc_nal_unit_type contained in the NAL unit header is an identifier for the NAL unit used for the second encoding method (S4694). The decoding unit 4680 then decodes the PCC data using the decoding process of the second encoding method (S4695).
[0205] On the other hand, if pcc_codec_type indicates the first encoding method (first encoding method in S4692), the decoding unit 4680 determines that the data contained in the NAL unit payload is data encoded using the first encoding method (S4696). The decoding unit 4680 then identifies the data by assuming that pcc_nal_unit_type contained in the NAL unit header is the identifier for the NAL unit for the first encoding method (S4697). The decoding unit 4680 then decodes the PCC data using the decoding process of the first encoding method (S4698).
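The routing in S4691 to S4698 can be sketched as a parser that reads pcc_codec_type first and then interprets pcc_nal_unit_type with the table of the indicated codec. The two-byte header layout, the codec values, and the type tables are hypothetical.

```python
# Hypothetical pcc_codec_type values and per-codec type tables.
CODEC1 = 0
CODEC2 = 1
CODEC1_TYPES = {0: "location information", 1: "attribute information"}
CODEC2_TYPES = {0: "video data", 1: "additional information"}


def route(nal: bytes) -> tuple[str, str, bytes]:
    """Return (target decoding unit, data type, payload) for one NAL unit."""
    if len(nal) < 2:
        raise ValueError("NAL unit shorter than the assumed two-byte header")
    codec_type, unit_type, payload = nal[0], nal[1], nal[2:]  # S4692
    if codec_type == CODEC1:
        target, table = "first decoding unit", CODEC1_TYPES   # S4696 to S4697
    elif codec_type == CODEC2:
        target, table = "second decoding unit", CODEC2_TYPES  # S4693 to S4694
    else:
        raise ValueError(f"unknown pcc_codec_type {codec_type}")
    if unit_type not in table:
        raise ValueError(f"unknown pcc_nal_unit_type {unit_type}")
    return target, table[unit_type], payload
```

Unlike the case where the codec is known in advance, the codec here is read from each NAL unit header, so units from both codecs can be interleaved in one stream.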
[0206] As described above, a three-dimensional data encoding device according to one aspect of the present disclosure generates an encoded stream by encoding three-dimensional data (e.g., point cloud data) (for example, S4671 in Figure 37), and stores information indicating which encoding method was used for the encoding from among the first and second encoding methods (e.g., codec identification information) in the control information (e.g., parameter set) of the encoded stream (for example, S4672 in Figure 37).
[0207] According to this, when decoding an encoded stream generated by a three-dimensional data encoding device, the three-dimensional data decoding device can determine the encoding method used for encoding using information stored in the control information. Therefore, the three-dimensional data decoding device can correctly decode the encoded stream even when multiple encoding methods are used.
[0208] For example, the three-dimensional data includes location information. In the encoding, the three-dimensional data encoding device encodes the location information. In the storing, the three-dimensional data encoding device stores, in the control information for the location information, information indicating which of the first and second encoding methods was used to encode the location information.
[0209] For example, the three-dimensional data includes location information and attribute information. In the encoding, the three-dimensional data encoding device encodes the location information and the attribute information. In the storing, the three-dimensional data encoding device stores, in the control information for the location information, information indicating which of the first and second encoding methods was used to encode the location information, and stores, in the control information for the attribute information, information indicating which of the first and second encoding methods was used to encode the attribute information.
[0210] According to this method, different encoding schemes can be used for location information and attribute information, thereby improving encoding efficiency.
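To make the per-component signalling concrete, the following sketch keeps one codec identifier in the control information for the location information and another in the control information for the attribute information. The class names, the field name codec_id, and the decoder table are hypothetical and stand in for structures the text does not define.

```python
from dataclasses import dataclass


@dataclass
class ControlInfo:
    codec_id: int  # which encoding method was used for this component


@dataclass
class EncodedStream:
    location_control: ControlInfo
    location_data: bytes
    attribute_control: ControlInfo
    attribute_data: bytes


def decode(stream: EncodedStream, decoders: dict):
    """Decode each component with the codec named in its own control info."""
    location = decoders[stream.location_control.codec_id](stream.location_data)
    attribute = decoders[stream.attribute_control.codec_id](stream.attribute_data)
    return location, attribute
```

Each component is looked up independently, so the two codec identifiers may be equal or different.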
[0211] For example, the three-dimensional data encoding method further stores the encoded stream in one or more units (e.g., NAL units) (e.g., S4673 in Figure 37).
[0212] For example, as described in Figures 15 to 18 of Embodiment 1, the unit has a format common to both the first encoding scheme and the second encoding scheme, and includes information indicating the type of data contained in the unit, which has an independent definition for the first encoding scheme and the second encoding scheme (e.g., pcc_nal_unit_type).
[0213] For example, as described in Figures 23 to 28 of Embodiment 2, the unit has an independent format for each of the first encoding scheme and the second encoding scheme, and includes information indicating the type of data contained in the unit, which has an independent definition for each of the first encoding scheme and the second encoding scheme (for example, codec1_nal_unit_type or codec2_nal_unit_type).
[0214] For example, as described in Figures 32 to 34 of Embodiment 3, the unit has a format common to both the first and second encoding schemes and includes information indicating the type of data contained in the unit, which has a definition common to both the first and second encoding schemes (e.g., pcc_nal_unit_type).
[0215] For example, a three-dimensional data encoding device comprises a processor and memory, and the processor uses the memory to perform the above processing.
[0216] Furthermore, the three-dimensional data decoding device according to this embodiment determines the encoding method used to encode an encoded stream generated by encoding three-dimensional data (for example, S4677 in Figure 38). It makes this determination based on information (for example, codec identification information) that is included in the control information (for example, a parameter set) of the encoded stream and that indicates which of the first and second encoding methods was used to encode the three-dimensional data. The device then decodes the encoded stream using the determined encoding method (for example, S4678 in Figure 38).
[0217] According to this, when decoding an encoded stream, the three-dimensional data decoding device can determine the encoding scheme used by using the information stored in the control information. Therefore, the three-dimensional data decoding device can correctly decode an encoded stream even when multiple encoding schemes are used.
[0218] For example, the three-dimensional data includes location information, and the encoded stream includes encoded data of the location information. In the determination, the three-dimensional data decoding device determines the encoding method used to encode the location information based on information indicating which of the first and second encoding methods was used to encode the location information, which is included in the control information of the location information included in the encoded stream. In the decoding, the three-dimensional data decoding device decodes the encoded data of the location information using the determined encoding method used to encode the location information.
[0219] For example, the three-dimensional data includes location information and attribute information, and the encoded stream includes encoded data of the location information and encoded data of the attribute information. In the determination, the three-dimensional data decoding device determines the encoding method used to encode the location information based on information indicating which of the first and second encoding methods was used to encode the location information, which is included in the control information of the location information included in the encoded stream, and determines the encoding method used to encode the attribute information based on information indicating which of the first and second encoding methods was used to encode the attribute information, which is included in the control information of the attribute information included in the encoded stream. In the decoding, the three-dimensional data decoding device decodes the encoded data of the location information using the determined encoding method used to encode the location information, and decodes the encoded data of the attribute information using the determined encoding method used to encode the attribute information.
[0220] According to this method, different encoding schemes can be used for location information and attribute information, thereby improving encoding efficiency.
[0221] For example, the encoded stream is stored in one or more units (e.g., NAL units), and the three-dimensional data decoding device further acquires the encoded stream from the one or more units.
[0222] For example, as described in Figures 15 to 18 of Embodiment 1, the unit has a format common to both the first encoding scheme and the second encoding scheme, and includes information indicating the type of data contained in the unit, which has an independent definition for the first encoding scheme and the second encoding scheme (e.g., pcc_nal_unit_type).
[0223] For example, as described in Figures 23 to 28 of Embodiment 2, the unit has an independent format for each of the first encoding scheme and the second encoding scheme, and includes information indicating the type of data contained in the unit, which has an independent definition for each of the first encoding scheme and the second encoding scheme (for example, codec1_nal_unit_type or codec2_nal_unit_type).
[0224] For example, as described in Figures 32 to 34 of Embodiment 3, the unit has a format common to both the first and second encoding schemes and includes information indicating the type of data contained in the unit, which has a definition common to both the first and second encoding schemes (e.g., pcc_nal_unit_type).
[0225] For example, a three-dimensional data decoding device comprises a processor and memory, and the processor uses the memory to perform the above processing.
[0226] The three-dimensional data encoding device and three-dimensional data decoding device, etc., according to embodiments of the present disclosure have been described above, but the present disclosure is not limited to these embodiments.
[0227] Furthermore, each processing unit included in the three-dimensional data encoding device and the three-dimensional data decoding device, etc., according to the above embodiment is typically implemented as an integrated circuit (LSI). Each of these may be implemented as an individual chip, or some or all of them may be integrated into a single chip.
[0228] Furthermore, integrated circuit implementation is not limited to LSIs; it may also be achieved using dedicated circuits or general-purpose processors. Field-Programmable Gate Arrays (FPGAs), which can be programmed after LSI manufacturing, or reconfigurable processors, which allow for the reconfiguration of the connections and settings of circuit cells within the LSI, may also be used.
[0229] Furthermore, in each of the above embodiments, each component may be implemented by being composed of dedicated hardware or by executing a software program suitable for each component. Each component may also be implemented by a program execution unit such as a CPU or processor reading and executing a software program recorded on a recording medium such as a hard disk or semiconductor memory.
[0230] Furthermore, this disclosure may be implemented as a three-dimensional data encoding method or a three-dimensional data decoding method, etc., performed by a three-dimensional data encoding device and a three-dimensional data decoding device, etc.
[0231] Furthermore, the division of functional blocks in the block diagram is just one example; multiple functional blocks can be implemented as a single functional block, a single functional block can be divided into multiple parts, or some functions can be moved to other functional blocks. In addition, the functions of multiple functional blocks with similar functions can be processed in parallel or by time-sharing on a single piece of hardware or software.
[0232] Furthermore, the order in which each step in the flowchart is performed is illustrative for the purpose of specifically illustrating this disclosure, and may be in a different order. Also, some of the above steps may be performed simultaneously (in parallel) with other steps.
[0233] Although a three-dimensional data encoding device and a three-dimensional data decoding device, etc., relating to one or more embodiments have been described above based on embodiments, this disclosure is not limited to these embodiments. Without departing from the spirit of this disclosure, various modifications that a person skilled in the art can conceive of may be applied to these embodiments, and forms constructed by combining components from different embodiments may also be included within the scope of one or more embodiments.
Industrial Applicability
[0234] This disclosure is applicable to three-dimensional data encoding devices and three-dimensional data decoding devices.
Explanation of Symbols
[0235] 4601 Three-dimensional data encoding system
4602 Three-dimensional data decoding system
4603 Sensor terminal
4604 External connection unit
4611 Point cloud data generation system
4612 Presentation unit
4613 Encoding unit
4614 Multiplexing unit
4615 Input/output unit
4616 Control unit
4617 Sensor information acquisition unit
4618 Point cloud data generation unit
4621 Sensor information acquisition unit
4622 Input/output unit
4623 Demultiplexing unit
4624 Decoding unit
4625 Presentation unit
4626 User interface
4627 Control unit
4630 First encoding unit
4631 Location information encoding unit
4632 Attribute information encoding unit
4633 Additional information encoding unit
4634 Multiplexing unit
4640 First decoding unit
4641 Demultiplexing unit
4642 Location information decoding unit
4643 Attribute information decoding unit
4644 Additional information decoding unit
4650 Second encoding unit
4651 Additional information generation unit
4652 Location image generation unit
4653 Attribute image generation unit
4654 Video encoding unit
4655 Additional information encoding unit
4656 Multiplexing unit
4660 Second decoding unit
4661 Demultiplexing unit
4662 Video decoding unit
4663 Additional information decoding unit
4664 Location information generation unit
4665 Attribute information generation unit
4670 Encoding unit
4671 Multiplexing unit
4680 Decoding unit
4681 Demultiplexing unit
Claims
1. A method for generating three-dimensional data, comprising: obtaining an encoded stream containing three-dimensional data including location information; and generating one or more units storing the encoded stream, wherein the unit includes information indicating the encoding scheme used to encode the three-dimensional data, the information includes first information and second information, and the first information and the second information are stored in multiple locations within the unit.
2. The method for generating three-dimensional data according to claim 1, wherein the first information and the second information are the same information.
3. The method for generating three-dimensional data according to claim 1, wherein the first information and the second information are different information.
4. The method for generating three-dimensional data according to claim 3, wherein the first information is stored in the header of the unit, and the second information is stored within the encoded stream.
5. The method for generating three-dimensional data according to claim 4, wherein the second information is stored in the control information of the encoded stream.
6. The method for generating three-dimensional data according to any one of claims 1 to 5, wherein the information includes information indicating a first encoding scheme or a second encoding scheme, and the unit has a format common to the first encoding scheme and the second encoding scheme.
7. The method for generating three-dimensional data according to any one of claims 1 to 5, wherein the information includes information indicating a first encoding scheme or a second encoding scheme, and the unit has independent formats for the first encoding scheme and the second encoding scheme.
8. A method for acquiring three-dimensional data, comprising: obtaining one or more units; and obtaining, from the one or more units, an encoded stream generated by encoding three-dimensional data including location information, wherein the unit includes information indicating the encoding scheme used to encode the three-dimensional data, the information includes first information and second information, and the first information and the second information are stored in multiple locations within the unit.
9. The method for acquiring three-dimensional data according to claim 8, wherein the first information and the second information are the same information.
10. The method for acquiring three-dimensional data according to claim 8, wherein the first information and the second information are different information.
11. The method for acquiring three-dimensional data according to claim 10, wherein the first information is stored in the header of the unit, and the second information is stored within the encoded stream.
12. The method for acquiring three-dimensional data according to claim 11, wherein the second information is stored in the control information of the encoded stream.
13. The method for acquiring three-dimensional data according to any one of claims 8 to 12, wherein the information includes information indicating a first encoding scheme or a second encoding scheme, and the unit has a format common to the first encoding scheme and the second encoding scheme.
14. The method for acquiring three-dimensional data according to any one of claims 8 to 12, wherein the information includes information indicating a first encoding scheme or a second encoding scheme, and the unit has independent formats for the first encoding scheme and the second encoding scheme.
15. A device for generating three-dimensional data, comprising: a processor; and memory, wherein the processor, using the memory: obtains an encoded stream containing three-dimensional data including location information; and generates one or more units storing the encoded stream, wherein the unit includes information indicating the encoding scheme used to encode the three-dimensional data, the information includes first information and second information, and the first information and the second information are stored in multiple locations within the unit.
16. A three-dimensional data acquisition device, comprising: a processor; and memory, wherein the processor, using the memory: obtains one or more units; and obtains, from the one or more units, an encoded stream generated by encoding three-dimensional data including location information, wherein the unit includes information indicating the encoding scheme used to encode the three-dimensional data, the information includes first information and second information, and the first information and the second information are stored in multiple locations within the unit.