Point cloud data compression method, device, equipment and storage medium
By merging the probability distributions of point cloud data and images to compress the octree sequence, the problem of low compression rate of point cloud data is solved, and a more efficient data compression effect is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TSINGHUA UNIVERSITY
- Filing Date
- 2023-06-02
- Publication Date
- 2026-06-26
Figure CN116843774B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and in particular to a point cloud data compression method, apparatus, device, and storage medium.
Background Technology
[0002] Point cloud data refers to a massive collection of data points representing the surface characteristics of a target object, capable of reflecting the object's true state with high accuracy. However, the amount of point cloud data corresponding to a single target object is generally large, making point cloud data compression a crucial issue. Existing technology provides a compression method based on self-attention. This method first converts the point cloud data into an octree sequence, then uses a neural network incorporating self-attention to obtain the probability distribution of each octree node in the sequence. This probability distribution is then used to compress the octree sequence, resulting in a compressed file. However, if the probability prediction of octree nodes using a self-attention neural network is not accurate enough, the compression ratio of the point cloud data will be low, leading to a large compressed file size.
Summary of the Invention
[0003] This invention provides a point cloud data compression method, apparatus, device, and storage medium to address the low point cloud data compression rate of the prior art.
[0004] This invention provides a point cloud data compression method, comprising: acquiring an octree sequence corresponding to point cloud data of a target object, and acquiring an image of the target object; inputting the octree sequence into a preset first neural network model to acquire a first probability distribution output by the first neural network model, wherein the first neural network model is used to process the octree sequence to obtain the first probability distribution; inputting the image into a preset second neural network model to acquire a second probability distribution output by the second neural network model, wherein the second neural network model is used to process the image to obtain the second probability distribution; acquiring a merged probability distribution based on the first probability distribution and the second probability distribution; and compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data.
[0005] According to a point cloud data compression method provided by the present invention, after compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data, the method further includes: losslessly compressing the image to obtain compressed image data; losslessly decompressing the compressed image data to obtain the decompressed image; re-inputting the decompressed image into a second neural network model to re-obtain the second probability distribution output by the second neural network model; inputting a preset initial value of the sequence context into the first neural network model to obtain a first initial probability output by the first neural network model; decompressing the compressed point cloud data based on the first initial probability, the re-obtained second probability distribution, and the initial value of the sequence context to obtain the decompressed octree sequence; and converting the octree sequence into the corresponding point cloud data.
[0006] According to a point cloud data compression method provided by the present invention, the step of obtaining the octree sequence corresponding to the point cloud data of the target object includes: obtaining the point cloud data of the target object; dividing the point cloud data into at least one voxel based on a preset voxel size; and converting the point cloud data into the corresponding octree sequence based on the voxel in a breadth-first order, wherein the octree nodes in the octree sequence correspond one-to-one with the voxel.
[0007] According to a point cloud data compression method provided by the present invention, the step of inputting the octree sequence into a preset first neural network model to obtain a first probability distribution output by the first neural network model includes: sequentially obtaining at least one node subsequence in the octree sequence, wherein the node subsequence includes a consecutive preset number of octree nodes, and the octree nodes in any two node subsequences do not overlap; sequentially inputting the previous node subsequence into the first neural network model to obtain a first sub-probability distribution of the next node subsequence, wherein the first sub-probability distribution of the first node subsequence is obtained by inputting a preset initial value of the sequence context into the first neural network model; and obtaining the first probability distribution based on the first sub-probability distribution of each node subsequence.
[0008] According to a point cloud data compression method provided by the present invention, the merged probability distribution includes a merged sub-probability distribution for each node subsequence; the step of compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data includes: for each node subsequence in turn, performing entropy encoding on each octree node in the node subsequence based on the merged sub-probability distribution and the node subsequence, to obtain compressed sub-point cloud data corresponding to the node subsequence; and obtaining the compressed point cloud data based on the compressed sub-point cloud data corresponding to each node subsequence.
[0009] According to a point cloud data compression method provided by the present invention, the step of decompressing the compressed point cloud data based on the first initial probability, the re-obtained second probability distribution, and the initial value of the sequence context to obtain the decompressed octree sequence includes: obtaining a merged sub-probability distribution corresponding to the first compressed sub-point cloud data in the compressed point cloud data based on the first initial probability and the re-obtained second probability distribution; performing entropy decoding on the first compressed sub-point cloud data in the compressed point cloud data based on the merged sub-probability distribution of the first node subsequence to obtain the first node subsequence in the octree sequence, wherein the node subsequence includes a consecutive preset number of octree nodes, the octree nodes in any two node subsequences do not overlap, and the merged probability distribution includes the merged sub-probability distribution of each node subsequence; sequentially repeating the step of inputting the previous node subsequence into the first neural network model to obtain the first sub-probability distribution of the next node subsequence, until the entropy decoding of each compressed sub-point cloud data in the compressed point cloud data is completed; and obtaining the decompressed octree sequence based on each node subsequence.
[0010] According to a point cloud data compression method provided by the present invention, the step of acquiring the point cloud data of the target object includes: acquiring the point cloud data obtained by a lidar on the target object; the step of acquiring an image of the target object includes: acquiring the image obtained by a camera on the target object; wherein the spatial coordinate systems of the lidar and the camera are consistent.
[0011] The present invention also provides a point cloud data compression device, comprising: an acquisition module for acquiring an octree sequence corresponding to point cloud data of a target object, and acquiring an image of the target object; a sequence processing module for inputting the octree sequence into a preset first neural network model to acquire a first probability distribution output by the first neural network model; an image processing module for inputting the image into a preset second neural network model to acquire a second probability distribution output by the second neural network model; a probability processing module for acquiring a merged probability distribution based on the first probability distribution and the second probability distribution; and a compression module for compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data.
[0012] The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the point cloud data compression method as described above.
[0013] The present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the point cloud data compression method as described above.
[0014] The present invention provides a point cloud data compression method, apparatus, device, and storage medium that acquires an octree sequence corresponding to the point cloud data of a target object, and acquires an image of the target object; inputs the octree sequence into a preset first neural network model to obtain a first probability distribution output by the first neural network model; inputs the image into a preset second neural network model to obtain a second probability distribution output by the second neural network model; obtains a merged probability distribution based on the first and second probability distributions; and compresses the octree sequence based on the merged probability distribution to obtain compressed point cloud data. In the above process, the first and second probability distributions are obtained from the point cloud data and image of the same target object, respectively. Combining the first and second probability distributions yields a more accurate merged probability distribution. Compressing the octree sequence based on the more accurate merged probability distribution improves the overall compression rate of the point cloud data, thereby reducing the amount of compressed point cloud data.
Attached Figure Description
[0015] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0016] Figure 1 is a flowchart illustrating the point cloud data compression method provided by the present invention;
[0017] Figure 2 is a schematic diagram of the point cloud data compression and decompression process provided by the present invention;
[0018] Figure 3 is a schematic diagram illustrating the principle of image depth prediction provided by the present invention;
[0019] Figure 4 is a schematic diagram illustrating the principle of inferring the octree probability distribution from the voxel probability distribution provided by the present invention;
[0020] Figure 5 is a schematic diagram illustrating the principle of octree-based contextual feature extraction provided by the present invention;
[0021] Figure 6 is a schematic diagram illustrating the principle of merging the first probability distribution and the second probability distribution provided by the present invention;
[0022] Figure 7 is a schematic diagram of the point cloud data compression device provided by the present invention;
[0023] Figure 8 is a schematic diagram of the structure of the electronic device provided by the present invention.
Detailed Implementation
[0024] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without creative effort are within the scope of protection of this invention.
[0025] The point cloud data compression method, apparatus, device, and storage medium of the present invention are described below with reference to Figures 1 to 8.
[0026] In one embodiment, as shown in Figure 1, the process steps for implementing the point cloud data compression method are as follows:
[0027] Step 101: Obtain the octree sequence corresponding to the point cloud data of the target object, and obtain the image of the target object.
[0028] In this embodiment, point cloud data and images of the target object are acquired using appropriate methods. An octree is a tree-like data structure used to describe three-dimensional space. Each node in an octree represents a volume element of a cube, and each node has eight child nodes, corresponding to the eight equal-sized spaces of the cube. The sum of the volume elements represented by the eight child nodes equals the volume of the parent node. Representing the octree corresponding to the point cloud data as a sequence forms an octree sequence.
[0029] In one embodiment, the octree sequence corresponding to the point cloud data of the target object is obtained. The specific implementation process is as follows: obtain the point cloud data of the target object; divide the point cloud data into at least one voxel based on a preset voxel size; based on the voxel, convert the point cloud data into the corresponding octree sequence in breadth-first order, wherein the octree nodes in the octree sequence correspond one-to-one with the voxels.
[0030] In this embodiment, a voxel refers to a data structure that uses a fixed-size cube as the smallest unit to represent a three-dimensional object. By presetting the voxel size, the point cloud data is divided into at least one voxel, and then an octree is constructed based on each voxel formed. Specifically, there is a one-to-one correspondence between voxels and octree nodes.
[0031] Furthermore, based on voxels, the point cloud data is transformed into an octree. The octree is then flattened into an octree sequence in breadth-first order. The information of each octree node includes its level in the octree, its position within its parent node, and whether it has any child nodes. The information of any octree node is represented by an 8-bit binary number (denoted as OctValue). In the 8-bit binary number formed by the eight child nodes of an octree node, a 1 is recorded if the child node (i.e., the corresponding sub-cube in the voxel) has a point, and a 0 is recorded if it has no points.
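As a non-limiting illustration of the voxel division and breadth-first flattening described above, the following Python sketch quantizes points onto a voxel grid, builds the octree implicitly, and emits one 8-bit occupancy code (OctValue) per node in breadth-first order. The function name `octree_sequence`, the parameters `origin`, `size`, and `depth`, and the bit convention (bit i is set when child i is occupied, with i = 4*bx + 2*by + bz) are assumptions made for this example and are not specified by the embodiment. The preset voxel size here is `size / 2**depth`.

```python
from collections import deque

def octree_sequence(points, origin, size, depth):
    """Return the breadth-first list of 8-bit occupancy codes for a point set.

    points: iterable of (x, y, z); origin: corner of the bounding cube;
    size: edge length of the bounding cube; depth: number of octree levels.
    """
    n = 1 << depth                 # voxels per axis at the finest level
    voxel = size / n               # preset voxel edge length
    # Quantize every point to the integer index of the voxel that contains it.
    leaves = {
        tuple(min(n - 1, max(0, int((c - o) / voxel))) for c, o in zip(p, origin))
        for p in points
    }
    sequence = []
    queue = deque([(0, sorted(leaves))])   # (level, occupied voxels under node)
    while queue:
        level, cells = queue.popleft()
        shift = depth - 1 - level          # index bit that selects the child
        children = {}
        for ix, iy, iz in cells:
            child = ((((ix >> shift) & 1) << 2)
                     | (((iy >> shift) & 1) << 1)
                     | ((iz >> shift) & 1))
            children.setdefault(child, []).append((ix, iy, iz))
        # Assumed convention: bit i of the code is 1 when child i has a point.
        sequence.append(sum(1 << child for child in children))
        if level + 1 < depth:
            for child in sorted(children):
                queue.append((level + 1, children[child]))
    return sequence

codes = octree_sequence([(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)],
                        origin=(0.0, 0.0, 0.0), size=1.0, depth=2)
```

In this toy input the two points fall in opposite corners of the unit cube, so the root code has bits 0 and 7 set and each of the two occupied children contributes one further code.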
[0032] In one embodiment, the point cloud data of the target object is acquired, specifically through the following process: acquiring the point cloud data of the target object obtained by the LiDAR. The image of the target object is then acquired, specifically through the following process: acquiring the image of the target object obtained by the camera. The spatial coordinate systems of the LiDAR and the camera are consistent.
[0033] In this embodiment, point cloud data and images of the same target object are acquired using separate devices. Preferably, point cloud data is acquired using a LiDAR (Light Detection and Ranging) sensor, and images are acquired using a camera. Of course, LiDAR and camera are only one preferred implementation; other suitable methods can be used to acquire point cloud data or images depending on the actual situation and needs. To facilitate subsequent probability distribution merging operations, preferably, the LiDAR and camera need to be pre-calibrated to ensure that their spatial coordinate systems are consistent.
[0034] Step 102: Input the octree sequence into a preset first neural network model and obtain the first probability distribution output by the first neural network model. The first neural network model is used to process the octree sequence to obtain the first probability distribution.
[0035] In this embodiment, the first neural network model is mainly used to process the octree sequence to obtain the first probability distribution of each octree node in the octree sequence. The model framework of this first neural network model can be selected according to actual conditions and needs; for example, a neural network framework including self-attention can be used. Training is performed based on the selected framework to finally obtain the first neural network model required in practice.
[0036] In one embodiment, an octree sequence is input into a preset first neural network model to obtain a first probability distribution output by the first neural network model. The specific implementation process is as follows: at least one node subsequence in the octree sequence is obtained sequentially, wherein the node subsequence includes a preset number of consecutive octree nodes, and the octree nodes in any two node subsequences do not overlap; the previous node subsequence is input into the first neural network model sequentially to obtain the first sub-probability distribution of the next node subsequence, wherein the first sub-probability distribution of the first node subsequence is obtained by inputting a preset initial value of the sequence context into the first neural network model; based on the first sub-probability distributions of each node subsequence, a first probability distribution is obtained.
[0037] In this embodiment, the first probability distribution is obtained using the context of the octree. That is, the previous node subsequence is input into the first neural network model in turn to obtain the first sub-probability distribution of the next node subsequence, and then the first probability distribution is obtained based on each first sub-probability distribution.
[0038] In this embodiment, the preset number of octree nodes in each node subsequence can be set according to actual conditions and needs. For example, when the preset number is set to 1, each octree node is input into the first neural network model for processing in sequence. Alternatively, to reduce the number of runs of the first neural network model and shorten the processing time, the preset number can be set to a positive integer greater than 1.
[0039] In this embodiment, the preset initial value of the sequence context is used to predict the first sub-probability distribution of the first node sub-sequence. The initial value of the sequence context can be set according to the actual situation and needs. Preferably, the initial value of the sequence context is set to 0. The dimension of the initial value of the sequence context is the same as the dimension of the node sub-sequence.
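The block-wise prediction described above can be sketched as follows, purely as an illustration. The function `toy_context_model` is a hypothetical stand-in for the trained first neural network model (it merely favors repeating the code seen at the same position of the previous subsequence), and the names `first_probability_distribution`, `k`, and `NUM_CODES` are likewise assumptions. Only the control flow is meant to match the embodiment: non-overlapping subsequences of a preset length, a zero-valued initial context of the same dimension, and the previous subsequence used to predict the next one.

```python
NUM_CODES = 255  # non-empty 8-bit occupancy codes 1..255

def toy_context_model(context):
    """Hypothetical stand-in for the first neural network model.

    Returns one probability list (over codes 1..255) per node of the NEXT
    node subsequence, given the previous node subsequence as context.
    """
    pmfs = []
    for code in context:
        weights = [1.0] * NUM_CODES
        if 1 <= code <= NUM_CODES:          # a zero context entry adds no hint
            weights[code - 1] += NUM_CODES
        total = sum(weights)
        pmfs.append([w / total for w in weights])
    return pmfs

def first_probability_distribution(sequence, k, model=toy_context_model):
    """Return the first sub-probability distribution of every node subsequence."""
    context = [0] * k                        # preset initial sequence context
    sub_distributions = []
    for start in range(0, len(sequence), k):
        block = sequence[start:start + k]    # next non-overlapping subsequence
        sub_distributions.append(model(context)[:len(block)])
        # The subsequence just processed becomes the context for the next one.
        context = block + [0] * (k - len(block))
    return sub_distributions

subs = first_probability_distribution([129, 1, 128, 1, 128], k=2)
```

With `k=1` the model runs once per octree node; larger values of `k` reduce the number of model runs at the cost of a coarser context, which is the trade-off noted in this embodiment.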
[0040] Step 103: Input the image into a preset second neural network model and obtain the second probability distribution output by the second neural network model. The second neural network model is used to process the image to obtain the second probability distribution.
[0041] In this embodiment, the second neural network model is mainly used to process the image to obtain the second probability distribution corresponding to the image. The model framework of this second neural network model can be selected according to actual conditions and needs; for example, a 101-layer residual network (denoted as ResNet-101) can be used. Training is performed based on the selected framework to finally obtain the required second neural network model. It should be noted that the output dimension of the second probability distribution is set according to the spatial dimension of the octree. That is, the point cloud data and image of the target object are labeled based on the relationship between three-dimensional and two-dimensional dimensions. When the point cloud data is divided into at least one voxel, the image processing details in the second neural network model are adjusted accordingly to make the second probability distribution and the first probability distribution have the same dimension. That is, the second probability distribution includes the second probability (depth probability) of each octree node.
[0042] Step 104: Obtain the merged probability distribution based on the first probability distribution and the second probability distribution.
[0043] In this embodiment, after obtaining the first probability distribution from the point cloud data and the second probability distribution from the image, the first probability distribution and the second probability distribution are merged to obtain a new merged probability distribution. The merging method of the first probability distribution and the second probability distribution can be set according to the actual situation and needs. For example, the merging of the first probability distribution and the second probability distribution can be achieved by calculating the dot product of the first probability distribution and the second probability distribution.
[0044] In this embodiment, the point cloud data and the image are acquired from the same spatial angle. Alongside the first probability distribution corresponding to the point cloud data, image depth prediction is performed to obtain the second probability distribution corresponding to the image; that is, the probability that each voxel exists is predicted. Merging the first and second probability distributions is equivalent to correcting the first probability distribution with the second, which yields a more accurate merged probability distribution and thereby improves the compression rate of the point cloud data.
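One possible reading of the merging step is sketched below. Interpreting the "dot product" as an element-wise product of the two distributions for each octree node, followed by renormalization so that each merged row sums to 1, is an assumption of this example; the function name `merge_distributions` and the small constant `eps` (which keeps every code encodable) are also hypothetical.

```python
def merge_distributions(first, second, eps=1e-12):
    """Merge two per-node distributions of equal shape.

    first, second: lists of probability lists, one list per octree node.
    Returns the renormalized element-wise product for each node.
    """
    merged = []
    for p, q in zip(first, second):
        product = [a * b + eps for a, b in zip(p, q)]   # element-wise product
        total = sum(product)
        merged.append([v / total for v in product])     # renormalize to sum 1
    return merged

# Three-symbol toy example (a real node would have 255 occupancy codes).
merged = merge_distributions([[0.5, 0.3, 0.2]], [[0.2, 0.6, 0.2]])
```

In this toy case the second distribution shifts the most probable symbol from the first entry to the second, which is the kind of correction the embodiment attributes to the image-derived distribution.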
[0045] Step 105: Based on the merged probability distribution, compress the octree sequence to obtain compressed point cloud data.
[0046] In this embodiment, the key to improving the compression ratio lies in the accuracy of the probability distribution used during the compression process. Compressing the octree sequence corresponding to the point cloud data based on a more accurate merged probability distribution yields compressed point cloud data with a higher compression ratio.
[0047] In this embodiment, the compression method can be selected according to the actual situation and needs. Preferably, entropy coding is used to compress the octree sequence.
[0048] In one embodiment, the merged probability distribution includes a merged sub-probability distribution for each node subsequence. The octree sequence is compressed based on the merged probability distribution to obtain compressed point cloud data. The specific implementation process is as follows: for each node subsequence in turn, entropy encoding is performed on each octree node in the node subsequence based on the merged sub-probability distribution and the node subsequence, to obtain the compressed sub-point cloud data corresponding to the node subsequence; the compressed point cloud data is then obtained based on the compressed sub-point cloud data corresponding to each node subsequence.
[0049] In this embodiment, the context information of the octree is used to encode and compress each node subsequence using entropy coding, thereby obtaining the compressed point cloud data corresponding to the entire point cloud data.
[0050] In this embodiment, the merged sub-probability distribution corresponding to each node subsequence can be calculated in advance, before entropy encoding begins, or it can be calculated sequentially as entropy encoding progresses.
[0051] In this embodiment, the first neural network model predicts the first probability distribution from the octree sequence context (i.e., the first probability distribution of voxels existing in voxel space), and the second neural network model predicts the depth of the image, resulting in a second probability distribution over different depths. This second probability distribution corresponds to the voxel space of the point cloud data. The second probability distribution obtained through depth prediction is combined with the first probability distribution obtained from the octree context information to obtain a more accurate merged probability distribution, thereby improving the compression ratio. The compression ratio can be characterized by bits per point (bpp), which is the average number of bits occupied by each point after the point cloud data is compressed. The smaller the bpp, the higher the compression ratio.
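To make the link between prediction accuracy and bpp concrete, the following sketch computes the ideal entropy-coded length of an octree sequence, namely the sum of -log2 of the probability assigned to each actual occupancy code, and divides it by the number of points. The function names and the toy distributions are assumptions for illustration; a practical entropy coder adds a small overhead on top of this ideal length.

```python
import math

def ideal_code_length_bits(sequence, distributions):
    """Sum of -log2 p(code) over the sequence; codes are 1-based indices."""
    return sum(-math.log2(pmf[code - 1])
               for code, pmf in zip(sequence, distributions))

def bits_per_point(total_bits, num_points):
    """Average number of bits spent per point of the original point cloud."""
    return total_bits / num_points

# Two-symbol toy example: probabilities 0.5 and 0.25 cost 1 and 2 bits.
toy_bits = ideal_code_length_bits([1, 2], [[0.5, 0.5], [0.75, 0.25]])
toy_bpp = bits_per_point(toy_bits, num_points=6)

# A sharper (more accurate) distribution gives a shorter code than a uniform one.
uniform = [[1 / 255] * 255 for _ in range(3)]
sharp = [[0.5 if i == c - 1 else 0.5 / 254 for i in range(255)]
         for c in (129, 1, 128)]
bits_uniform = ideal_code_length_bits([129, 1, 128], uniform)
bits_sharp = ideal_code_length_bits([129, 1, 128], sharp)
```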
[0052] In one embodiment, point cloud data is compressed to obtain compressed point cloud data, which facilitates data transmission and storage. When the original point cloud data is needed, the compressed point cloud data can be decompressed through a decompression process that is the reverse of the compression process described above.
[0053] Specifically, after the octree sequence is compressed based on the merged probability distribution to obtain the compressed point cloud data, the image is losslessly compressed to obtain compressed image data. For decompression, the compressed image data is losslessly decompressed to obtain the decompressed image, and the decompressed image is re-input into the second neural network model to re-obtain the second probability distribution output by the second neural network model. The preset initial value of the sequence context is input into the first neural network model to obtain the first initial probability output by the first neural network model. Based on the first initial probability, the re-obtained second probability distribution, and the initial value of the sequence context, the compressed point cloud data is decompressed to obtain the decompressed octree sequence. The octree sequence is then converted into the corresponding point cloud data.
[0054] In this embodiment, decompressing the compressed point cloud data still requires the second probability distribution of the image. Therefore, to facilitate the decompression process, the image needs to be compressed and stored together with the compressed point cloud data. When decompressing the compressed point cloud data, the compressed image data is first decompressed, and the decompressed image is input into the second neural network model to re-obtain the second probability distribution. It should be noted that the image can be compressed with any commonly used image compression method. Preferably, a lossless compression method is used, so that the second probability distribution re-obtained from the decompressed image matches the one used during compression.
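As a minimal illustration of why lossless image compression is preferred, the sketch below round-trips raw pixel bytes through Python's standard `zlib` module and confirms that the restored bytes are identical. Using `zlib` (rather than a dedicated lossless image codec) and the helper names `compress_image` and `decompress_image` are assumptions of this example. Identical pixels mean the second neural network model receives the same input at the decoder as at the encoder.

```python
import zlib

def compress_image(pixels: bytes) -> bytes:
    """Losslessly compress raw pixel bytes (DEFLATE, highest level)."""
    return zlib.compress(pixels, 9)

def decompress_image(blob: bytes) -> bytes:
    """Recover the exact pixel bytes produced by compress_image."""
    return zlib.decompress(blob)

# Toy 32x32 single-channel image stored row by row.
pixels = bytes((x * y) % 256 for y in range(32) for x in range(32))
blob = compress_image(pixels)
restored = decompress_image(blob)
```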
[0055] In this embodiment, the point cloud compressed data is decompressed based on the context information of the octree sequence to obtain a complete octree sequence.
[0056] In one embodiment, based on the first initial probability, the re-obtained second probability distribution, and the initial value of the sequence context, the compressed point cloud data is decompressed to obtain a decompressed octree sequence. The specific implementation process is as follows: based on the first initial probability and the re-obtained second probability distribution, the merged sub-probability distribution corresponding to the first compressed sub-point cloud data in the compressed point cloud data is obtained; based on the merged sub-probability distribution of the first node subsequence, entropy decoding is performed on the first compressed sub-point cloud data in the compressed point cloud data to obtain the first node subsequence in the octree sequence, wherein the node subsequence includes a consecutive preset number of octree nodes, the octree nodes in any two node subsequences do not overlap, and the merged probability distribution includes the merged sub-probability distribution of each node subsequence; the step of inputting the previous node subsequence into the first neural network model to obtain the first sub-probability distribution of the next node subsequence is repeated sequentially until the entropy decoding of each compressed sub-point cloud data in the compressed point cloud data is completed; based on each node subsequence, the decompressed octree sequence is obtained.
[0057] In this embodiment, if entropy encoding is used for compression, then entropy decoding is used for decompression. During decoding, each node subsequence is decoded in turn, and each decoded node subsequence is then used as the context for predicting the next node subsequence. By decoding sequentially, lossless compression of the point cloud data can be achieved.
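The symmetry between encoding and decoding can be sketched with a toy arithmetic coder that uses exact fractions. This is an illustration only: `toy_first_model` is a hypothetical stand-in for the first neural network model, the merge is assumed to be a renormalized element-wise product, the block length `K` is an arbitrary choice, and the decoder is assumed to know the number of nodes `n`. A practical codec would use a finite-precision range coder instead of `fractions.Fraction`.

```python
import math
from fractions import Fraction

K = 2            # preset number of nodes per subsequence (hypothetical value)
SYMBOLS = 255    # occupancy codes 1..255

def toy_first_model(context):
    """Stand-in for the first neural network: one pmf per node of the next block."""
    pmfs = []
    for c in context:
        w = [1] * SYMBOLS
        if 1 <= c <= SYMBOLS:
            w[c - 1] += SYMBOLS
        pmfs.append([Fraction(x, sum(w)) for x in w])
    return pmfs

def merged(context, second):
    """Renormalized element-wise product of first and second pmfs (assumed merge)."""
    out = []
    for p, q in zip(toy_first_model(context), second):
        prod = [a * b for a, b in zip(p, q)]
        total = sum(prod)
        out.append([v / total for v in prod])
    return out

def encode(sequence, second):
    low, width = Fraction(0), Fraction(1)
    context = [0] * K                          # initial sequence context
    for i in range(0, len(sequence), K):
        block = sequence[i:i + K]
        for code, pmf in zip(block, merged(context, second[i:i + K])):
            low += width * sum(pmf[:code - 1])  # narrow to the symbol's slice
            width *= pmf[code - 1]
        context = block + [0] * (K - len(block))
    k = 1                                       # bits needed so 2**-k <= width / 2
    while Fraction(1, 2 ** k) > width / 2:
        k += 1
    return math.ceil(low * 2 ** k), k           # k-bit number inside the interval

def decode(value, k, n, second):
    point = Fraction(value, 2 ** k)
    low, width = Fraction(0), Fraction(1)
    context, sequence = [0] * K, []
    while len(sequence) < n:
        i, block = len(sequence), []
        for pmf in merged(context, second[i:i + K]):
            target, cum = (point - low) / width, Fraction(0)
            for code, p in enumerate(pmf, start=1):
                if target < cum + p:            # found the slice holding the point
                    break
                cum += p
            low += width * cum
            width *= p
            block.append(code)
        sequence.extend(block)
        context = block + [0] * (K - len(block))  # decoded block becomes context
    return sequence

seq = [129, 1, 128, 1, 128]
uniform = [[Fraction(1, SYMBOLS)] * SYMBOLS for _ in seq]
informed = [[Fraction(51 if i == c - 1 else 1, 305) for i in range(SYMBOLS)] for c in seq]
value_u, bits_u = encode(seq, uniform)
value_i, bits_i = encode(seq, informed)
```

Because the decoder rebuilds exactly the same merged distributions from the already decoded nodes and the same second distribution, it recovers the sequence exactly, and a better-informed second distribution shortens the code.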
[0058] In an overall embodiment, as shown in Figure 2, the point cloud data compression and decompression process is as follows:
[0059] During compression:
[0060] Step 201: First, convert the point cloud data into an octree and flatten it into an octree sequence according to breadth-first order.
[0061] Step 202: Input the octree sequence into the first neural network model to obtain the first probability distribution of the octree nodes;
[0062] Step 203: Input the image into the second neural network model to obtain the second probability distribution (depth probability distribution);
[0063] Step 204: Based on the first probability distribution and the second probability distribution, obtain the merged probability distribution, perform entropy encoding on the octree sequence based on the merged probability distribution, and simultaneously perform lossless compression on the image to obtain the final point cloud compressed data and image compressed data.
[0064] During the decompression process:
[0065] Step 205: Set the initial value of the sequence context to 0, and simultaneously decompress the image compressed data to obtain the image;
[0066] Step 206: Input the image into the second neural network model to re-acquire the second probability distribution;
[0067] Step 207: Using the decoded node subsequence as context, the first sub-probability distribution of the node subsequence to be decoded is obtained through the first neural network model;
[0068] Step 208: Merge each first sub-probability distribution with the re-acquired second probability distribution to obtain each merged sub-probability distribution. Perform entropy decoding on the compressed point cloud data based on the merged sub-probability distributions, thereby gradually obtaining the decompressed octree sequence from the node subsequences.
[0069] Step 209: The entire octree sequence is decoded to obtain the original octree, which is then converted into a point cloud structure to obtain the original point cloud data.
[0070] In a specific embodiment, the point cloud data compression method employs three main modules to implement the processing: an image depth prediction module, an octree context feature extraction module, and an entropy coding module. The image depth prediction module first obtains the depth prediction result from the image, and then obtains a second probability distribution through the correspondence between voxels and octree nodes. The context extraction module obtains a first probability distribution from the context of the octree sequence. The first and second probability distributions are fused to obtain a merged probability distribution. Finally, the entropy coding module performs entropy coding on the octree sequence based on the merged probability distribution.
[0071] In this embodiment, as shown in Figure 3, the implementation principle of the Image-based Depth Estimation Module is as follows:
[0072] The raw image, which has 3 channels and whose height (H) and width (W) are set as needed, is downsampled by a 101-layer residual network (ResNet-101). The result is then upsampled by Atrous Spatial Pyramid Pooling (ASPP), which uses 2048 channels, with the image height (H) and image width (W) set to 32.
[0073] The second neural network model, built from ResNet-101 and ASPP, is trained to perform image-based depth estimation, yielding a prediction result (denoted as D). During training, camera calibration and upsampling are required to ensure that the probabilities of the 8 child cubes within each voxel are calculated using the same octree node layer and coordinate system. It should be noted that the dimensionality of the prediction result in the above voxel-style depth estimation process can be configured based on the voxel partitioning rules of the point cloud data. For example, based on the voxel partitioning dimension, the prediction result can be configured with l dimensions or, alternatively, with (l-1) dimensions.
[0074] By rearranging the expression of the depth prediction results, the second probability distribution can be obtained, denoted as $G_\theta(x_i \mid D)$. This second probability distribution is based on the eight sub-cubes of each voxel, with each sub-cube taking a value of either 0 or 1. Thus, the second probability distribution is an n×255 probability distribution, where n is a positive integer.
[0075] In this embodiment, as shown in Figure 4, in the process of inferring the octree probability distribution from the voxel probability distribution (Child Cubes Distribution Infer Occupy), a voxel space consists of 8 subspaces (Child Cubes). Based on whether each subspace contains points, an 8-bit binary code (denoted as Occupancy Codes) is constructed for each voxel space. Specifically, a voxel is divided into eight sub-cubes. If a sub-cube contains points of the point cloud, it is denoted as 1; otherwise, it is denoted as 0, forming an 8-bit binary code. For example, 00000001 and 11111111 represent the binary codes of different voxels. The voxel space is then transformed into an octree, and the probability distribution of each octree node is obtained from the features of the 8 child cubes contained in the corresponding voxel. Because each child cube takes one of two values, 0 or 1, and a node that exists has at least one occupied child cube, there are 2^8 - 1 = 255 possible occupancy codes. That is, the probability distribution of the current node over the 255 occupancy codes is obtained from the 8 child cubes in the voxel representation.
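The occupancy-code construction and the step from child-cube probabilities to a 255-way distribution can be sketched as below. Two points are assumptions not stated in the source: the bit order (child 0 as the most significant bit) and the treatment of the 8 child cubes as independent when turning per-cube occupancy probabilities into a distribution over codes. The function names are hypothetical.

```python
from itertools import product

def occupancy_code(children_occupied):
    """Pack 8 child-cube flags into one integer; child 0 is the most significant bit (assumed)."""
    if len(children_occupied) != 8:
        raise ValueError("expected 8 flags")
    code = 0
    for flag in children_occupied:
        code = (code << 1) | (1 if flag else 0)
    return code

def code_distribution(child_probs):
    """Distribution over the 255 non-empty codes, assuming independent child cubes."""
    weights = {}
    for bits in product((0, 1), repeat=8):
        code = occupancy_code(bits)
        if code == 0:
            continue  # an existing node has at least one occupied child cube
        w = 1.0
        for bit, p in zip(bits, child_probs):
            w *= p if bit else (1.0 - p)
        weights[code] = w
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all child cubes have zero occupancy probability")
    return {code: w / total for code, w in weights.items()}

# Hypothetical per-cube occupancy probabilities derived from a depth prediction.
dist = code_distribution([0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
```

Stacking one such row per octree node gives a table of shape n×255, matching the shape stated for the second probability distribution.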
[0076] In this embodiment, as shown in Figure 5, the implementation principle of the octree-based Context Feature Extraction Module is as follows: In an octree sequence, the binary code of any octree node includes the level of the octree node, its position in the parent node, and whether it has any child nodes. For example, if the binary code of an octree node is 00010010 (decimal 18), this octree node is the 7th child node under its parent node and is located at the 2nd level of its parent node. Correspondingly, 132, 106, 197, 5, 7, 15, 13, 11, 6, and 15 represent the decimal values corresponding to the binary codes of different octree nodes.
[0077] In the octree sequence compression process, the n octree nodes preceding the position to be compressed are used as input to the first neural network model; these n octree nodes form a subsequence (Content window with length n), where n is a positive integer. This yields the first sub-probability distribution corresponding to the subsequence at the position to be compressed, which is essentially predicting the probability distribution of the following sequence based on the features of the preceding sequence. The features of each octree node can be represented in vector form. For example, Figure 5 shows the feature vectors of the (i-1)th octree node at level 0, level 1, level 2, and level 3, where i is a positive integer. Assuming the octree has only nodes at levels 0 to 3, these per-level feature vectors are used to construct the feature vector $f_{i-1}$. Multiple feature vectors $f_{i-1}$ result in an octree sequence containing (K-1) parent nodes, where each node subsequence contains n octree nodes (content window length n).
[0078] By sequentially inputting each node subsequence from the octree sequence into the first neural network model (assuming the model uses a Transformer Encoder framework), the first sub-probability distribution corresponding to each node subsequence can be obtained, thus yielding the first probability distribution, denoted as $F_\theta(x_i \mid X)$.
[0079] As shown in Figure 6, after the first probability distribution $F_\theta(x_i \mid X)$ and the second probability distribution $G_\theta(x_i \mid D)$ are obtained, the two probability distributions are merged by vector multiplication and vector addition to obtain the merged probability distribution, denoted as ProbabilityDistribution(n, 255).
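The merging step can be illustrated with a small sketch. The source states only that the two distributions are merged by vector multiplication and vector addition; the specific rule used here (element-wise product plus element-wise sum, followed by renormalisation) is an assumption chosen for illustration, and the function names are hypothetical.

```python
def merge_distributions(first, second):
    """Merge two distributions over the same codes: f*g + f + g, renormalised (assumed rule)."""
    if len(first) != len(second):
        raise ValueError("distributions must have the same length")
    combined = [f * g + f + g for f, g in zip(first, second)]
    total = sum(combined)
    return [c / total for c in combined]

def merge_rows(first_rows, second_rows):
    """Apply the merge row by row, e.g. to two n x 255 tables."""
    return [merge_distributions(f, g) for f, g in zip(first_rows, second_rows)]

# Toy example with 3 codes per node instead of 255.
first = [[0.5, 0.3, 0.2]]
second = [[0.6, 0.2, 0.2]]
merged = merge_rows(first, second)
```

Whatever rule is used, each merged row has to be renormalised to sum to 1 before it is handed to an entropy coder, and the decoder has to apply exactly the same rule.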
[0080] In the entropy coding module, the octree sequence is entropy-coded based on the merged probability distribution to complete the compression process and obtain point cloud compressed data.
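Why a more accurate distribution yields a smaller file can be seen from the ideal code length of an entropy coder, which is the sum of -log2 p over the coded symbols. The sketch below compares a uniform distribution with a hypothetical well-predicted one; the example codes and the 0.9 probability mass are illustrative values, not figures from the source.

```python
import math

def ideal_code_length_bits(codes, distributions):
    """Total of -log2 p(code): the size an ideal entropy coder approaches."""
    return sum(-math.log2(dist[code]) for code, dist in zip(codes, distributions))

codes = [129, 128, 1]  # example occupancy codes
uniform = {c: 1 / 255 for c in range(1, 256)}

def peaked(true_code, p=0.9):
    """Hypothetical well-predicted distribution: mass p on the true code."""
    rest = (1 - p) / 254
    return {c: (p if c == true_code else rest) for c in range(1, 256)}

uniform_bits = ideal_code_length_bits(codes, [uniform] * 3)
peaked_bits = ideal_code_length_bits(codes, [peaked(c) for c in codes])
```

With the uniform distribution every node costs just under 8 bits, whereas a distribution that concentrates probability on the correct code costs a fraction of a bit per node.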
[0081] The point cloud data compression method provided by this invention involves acquiring an octree sequence corresponding to the point cloud data of a target object, and acquiring an image of the target object; inputting the octree sequence into a preset first neural network model to obtain a first probability distribution output by the first neural network model; inputting the image into a preset second neural network model to obtain a second probability distribution output by the second neural network model; obtaining a merged probability distribution based on the first and second probability distributions; and compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data. In this process, the first and second probability distributions are obtained from the point cloud data and image of the same target object, respectively. Combining the first and second probability distributions yields a more accurate merged probability distribution. Compressing the octree sequence based on the more accurate merged probability distribution improves the overall compression rate of the point cloud data, thereby reducing the amount of compressed point cloud data.
[0082] In particular, multimodal point cloud data compression based on point cloud data and images, that is, using image information to assist the compression of point cloud data, is applicable in many environments such as autonomous driving, where images and point cloud data can be acquired synchronously by sensors. In practical applications, large numbers of paired images and point clouds are produced, which greatly facilitates joint compression of images and point cloud data, improves the compression rate, and gives the method broad application prospects.
[0083] The point cloud data compression apparatus provided by this invention is described below. The apparatus described below and the point cloud data compression method described above may be referred to in correspondence with each other. As shown in Figure 7, the point cloud data compression device includes:
[0084] The acquisition module 701 is used to acquire the octree sequence corresponding to the point cloud data of the target object, and to acquire the image of the target object;
[0085] The sequence processing module 702 is used to input the octree sequence into a preset first neural network model and obtain the first probability distribution output by the first neural network model, wherein the first neural network model is used to process the octree sequence to obtain the first probability distribution;
[0086] The image processing module 703 is used to input an image into a preset second neural network model and obtain a second probability distribution output by the second neural network model, wherein the second neural network model is used to process the image to obtain the second probability distribution;
[0087] The probability processing module 704 is used to obtain a combined probability distribution based on the first probability distribution and the second probability distribution;
[0088] The compression module 705 is used to compress the octree sequence based on the merged probability distribution to obtain compressed point cloud data.
[0089] In one embodiment, the point cloud data compression device further includes a decompression module, which is used to: after the octree sequence has been compressed based on the merged probability distribution to obtain compressed point cloud data, losslessly compress the image to obtain compressed image data; losslessly decompress the compressed image data to obtain a decompressed image; re-input the decompressed image into the second neural network model to re-obtain the second probability distribution output by the second neural network model; input a preset initial value of the sequence context into the first neural network model to obtain a first initial probability output by the first neural network model; decompress the compressed point cloud data based on the first initial probability, the re-obtained second probability distribution, and the initial value of the sequence context to obtain a decompressed octree sequence; and convert the octree sequence into corresponding point cloud data.
[0090] In one embodiment, the acquisition module 701 is used to acquire point cloud data of the target object; divide the point cloud data into at least one voxel based on a preset voxel size; and convert the point cloud data into a corresponding octree sequence based on the voxel in breadth-first order, wherein the octree nodes in the octree sequence correspond one-to-one with the voxels.
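The voxelisation and breadth-first conversion can be sketched as follows. This is a minimal illustration under stated assumptions: points lie in a cube of side voxel_size × 2^depth with its corner at the origin, each emitted value is the 8-bit occupancy code of one non-leaf node, and child 0 maps to the most significant bit. The function name and parameters are hypothetical.

```python
from collections import deque

def build_octree_sequence(points, voxel_size, depth):
    """Return occupancy codes in breadth-first order for points inside the bounding cube."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Quantise each point to integer voxel coordinates at the finest level.
    voxels = {tuple(int(c // voxel_size) for c in p) for p in points}
    side = 1 << depth
    for v in voxels:
        if any(c < 0 or c >= side for c in v):
            raise ValueError("point outside the octree bounding cube")
    sequence = []
    queue = deque([((0, 0, 0), depth, voxels)])  # (origin, remaining levels, voxels inside)
    while queue:
        (ox, oy, oz), level, inside = queue.popleft()
        half = 1 << (level - 1)
        children = [set() for _ in range(8)]
        for (x, y, z) in inside:
            idx = ((x - ox >= half) << 2) | ((y - oy >= half) << 1) | (z - oz >= half)
            children[idx].add((x, y, z))
        code = 0
        for idx, child in enumerate(children):
            code = (code << 1) | (1 if child else 0)
            if child and level > 1:
                origin = (ox + half * ((idx >> 2) & 1),
                          oy + half * ((idx >> 1) & 1),
                          oz + half * (idx & 1))
                queue.append((origin, level - 1, child))
        sequence.append(code)
    return sequence

sequence = build_octree_sequence([(0.5, 0.5, 0.5), (3.5, 3.5, 3.5)], voxel_size=1.0, depth=2)
```

Because a queue is used, all nodes of one level are emitted before any node of the next level, which is the breadth-first order the embodiment describes.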
[0091] In one embodiment, the sequence processing module 702 is used to sequentially obtain at least one node subsequence in an octree sequence, wherein the node subsequence includes a consecutive preset number of octree nodes, and the octree nodes in any two node subsequences do not overlap; sequentially input the previous node subsequence into a first neural network model to obtain the first sub-probability distribution of the next node subsequence, wherein the first sub-probability distribution of the first node subsequence is obtained by inputting a preset initial value of the sequence context into the first neural network model; and obtain a first probability distribution based on the first sub-probability distribution of each node subsequence.
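The pairing of each node subsequence with its context can be sketched as below. The helper names are hypothetical, and letting the last subsequence be shorter than n when the sequence length is not a multiple of n is an assumption; the source does not say how a trailing remainder is handled.

```python
def split_into_subsequences(sequence, n):
    """Cut the octree sequence into consecutive, non-overlapping chunks of n nodes."""
    if n < 1:
        raise ValueError("n must be positive")
    return [sequence[i:i + n] for i in range(0, len(sequence), n)]

def context_pairs(sequence, n, initial_context=0):
    """Pair each subsequence with its context: the previous subsequence, or the preset initial value."""
    context = [initial_context] * n
    pairs = []
    for chunk in split_into_subsequences(sequence, n):
        pairs.append((context, chunk))
        context = chunk  # the chunk just processed becomes the next context
    return pairs

pairs = context_pairs([132, 106, 197, 5, 7], n=2)
```

Each context would be fed to the first neural network model to obtain the first sub-probability distribution of the paired subsequence.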
[0092] In one embodiment, the merged probability distribution includes a merged sub-probability distribution for each node subsequence;
[0093] The compression module 705 is used to sequentially perform entropy encoding on each octree node in each node subsequence based on the merged subprobability distribution and the node subsequence to obtain the sub-point cloud compressed data corresponding to the node subsequence; and obtain the point cloud compressed data based on the sub-point cloud compressed data corresponding to each node subsequence.
[0094] In one embodiment, the decompression module is configured to: obtain a merged sub-probability distribution corresponding to the first sub-point cloud compressed data in the point cloud compressed data based on a first initial probability and a reacquired second probability distribution; perform entropy decoding on the first sub-point cloud compressed data in the point cloud compressed data based on the merged sub-probability distribution of the first node sub-sequence to obtain the first node sub-sequence in the octree sequence, wherein the node sub-sequence includes a consecutive preset number of octree nodes, the octree nodes in any two node sub-sequences do not overlap, and the merged probability distribution includes the merged sub-probability distribution of each node sub-sequence; repeat the steps of inputting the previous node sub-sequence into the first neural network model to obtain the first sub-probability distribution of the next node sub-sequence until the entropy decoding of each sub-point cloud compressed data in the point cloud compressed data is completed; and obtain the decompressed octree sequence based on each node sub-sequence.
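The subsequence-by-subsequence decoding loop can be illustrated with a toy exact arithmetic coder. Everything model-related here is a hypothetical stand-in: toy_model replaces the merged sub-probability distribution, one distribution is shared by all nodes of a subsequence for brevity (the source describes an n×255 table), and exact fractions replace a practical finite-precision coder.

```python
from fractions import Fraction

def toy_model(context, alphabet):
    """Hypothetical stand-in for the merged distribution: favours the last context symbol."""
    weights = {s: Fraction(1) for s in alphabet}
    if context and context[-1] in weights:
        weights[context[-1]] += 2
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def narrow(low, width, dist, symbol):
    """Shrink the interval [low, low + width) to the slice belonging to symbol."""
    for s in sorted(dist):
        if s == symbol:
            return low, width * dist[s]
        low += width * dist[s]
    raise KeyError(symbol)

def encode(sequence, n, alphabet):
    low, width = Fraction(0), Fraction(1)
    context = [0] * n  # preset initial value of the sequence context
    for start in range(0, len(sequence), n):
        chunk = sequence[start:start + n]
        dist = toy_model(context, alphabet)
        for symbol in chunk:
            low, width = narrow(low, width, dist, symbol)
        context = chunk
    return low + width / 2  # any number inside the final interval identifies the sequence

def decode(value, length, n, alphabet):
    low, width = Fraction(0), Fraction(1)
    context, out = [0] * n, []
    while len(out) < length:
        dist = toy_model(context, alphabet)  # same distribution the encoder used
        chunk = []
        while len(chunk) < n and len(out) + len(chunk) < length:
            cursor = low
            for s in sorted(dist):
                span = width * dist[s]
                if cursor <= value < cursor + span:
                    chunk.append(s)
                    low, width = cursor, span
                    break
                cursor += span
        out.extend(chunk)
        context = chunk  # decoded subsequence becomes the next context
    return out

seq = [3, 3, 1, 2, 2, 2, 1]
value = encode(seq, n=2, alphabet=[1, 2, 3])
decoded = decode(value, len(seq), n=2, alphabet=[1, 2, 3])
```

The round trip succeeds only because the decoder rebuilds exactly the same distribution for each subsequence from data it has already decoded, which is why decoding has to proceed one subsequence at a time.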
[0095] In one embodiment, the acquisition module 701 is used to acquire point cloud data obtained by the lidar from the target object; and to acquire images of the target object obtained by the camera; wherein the spatial coordinate systems of the lidar and the camera are consistent.
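One reason the coordinate systems need to be consistent is that a 3D location has to be matched to an image location. The sketch below shows a standard pinhole projection; it assumes the point is already expressed in the camera frame with z pointing forward, and the intrinsic parameters fx, fy, cx, cy are hypothetical values not given in the source.

```python
def project_to_pixel(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point in the camera frame (z forward) to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        return None  # behind the camera: not visible in the image
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical intrinsics and a voxel centre 4 units in front of the camera.
pixel = project_to_pixel((1.0, 2.0, 4.0), fx=100.0, fy=100.0, cx=50.0, cy=60.0)
```

If the lidar and camera frames differed, a rigid transform from calibration would have to be applied to the point before this projection.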
[0096] Figure 8 is a schematic diagram of the physical structure of an electronic device. As shown in Figure 8, the electronic device may include: a processor 801, a communication interface 802, a memory 803, and a communication bus 804, wherein the processor 801, the communication interface 802, and the memory 803 communicate with each other through the communication bus 804. The processor 801 can call logical instructions in the memory 803 to execute a point cloud data compression method. This method includes: acquiring an octree sequence corresponding to the point cloud data of a target object, and acquiring an image of the target object; inputting the octree sequence into a preset first neural network model to obtain a first probability distribution output by the first neural network model, wherein the first neural network model is used to process the octree sequence to obtain the first probability distribution; inputting the image into a preset second neural network model to obtain a second probability distribution output by the second neural network model, wherein the second neural network model is used to process the image to obtain the second probability distribution; obtaining a merged probability distribution based on the first and second probability distributions; and compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data.
[0097] Furthermore, the logical instructions in the aforementioned memory 803 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, essentially, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0098] On the other hand, the present invention also provides a computer program product, the computer program product including a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions, when the program instructions are executed by a computer, the computer is able to execute the point cloud data compression method provided in the above embodiments, the method including: acquiring an octree sequence corresponding to the point cloud data of a target object, and acquiring an image of the target object; inputting the octree sequence into a preset first neural network model, and acquiring a first probability distribution output by the first neural network model, wherein the first neural network model is used to process the octree sequence to obtain the first probability distribution; inputting the image into a preset second neural network model, and acquiring a second probability distribution output by the second neural network model, wherein the second neural network model is used to process the image to obtain the second probability distribution; acquiring a merged probability distribution based on the first probability distribution and the second probability distribution; and compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data.
[0099] In another aspect, the present invention also provides a non-transitory computer-readable storage medium storing a computer program thereon. When executed by a processor, the computer program implements the point cloud data compression method provided in the above embodiments. The method includes: acquiring an octree sequence corresponding to point cloud data of a target object, and acquiring an image of the target object; inputting the octree sequence into a preset first neural network model to acquire a first probability distribution output by the first neural network model, wherein the first neural network model is used to process the octree sequence to obtain the first probability distribution; inputting the image into a preset second neural network model to acquire a second probability distribution output by the second neural network model, wherein the second neural network model is used to process the image to obtain the second probability distribution; acquiring a merged probability distribution based on the first probability distribution and the second probability distribution; and compressing the octree sequence based on the merged probability distribution to obtain compressed point cloud data.
[0100] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.
[0101] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus necessary general-purpose hardware platforms, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.
[0102] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A point cloud data compression method, characterized in that it comprises: obtaining the octree sequence corresponding to the point cloud data of the target object, and obtaining the image of the target object; inputting the octree sequence into a preset first neural network model to obtain a first probability distribution output by the first neural network model, wherein the first neural network model is used to process the octree sequence to obtain the first probability distribution; inputting the image into a preset second neural network model to obtain a second probability distribution output by the second neural network model, wherein the second neural network model is used to process the image to obtain the second probability distribution; obtaining a merged probability distribution based on the first probability distribution and the second probability distribution; and compressing the octree sequence based on the merged probability distribution to obtain point cloud compressed data; wherein the merged probability distribution includes a merged sub-probability distribution for each node subsequence, and the step of compressing the octree sequence based on the merged probability distribution to obtain point cloud compressed data includes: for each node subsequence in sequence, performing entropy encoding on each octree node in the node subsequence based on the merged sub-probability distribution and the node subsequence, to obtain the sub-point cloud compressed data corresponding to the node subsequence; and obtaining the point cloud compressed data based on the sub-point cloud compressed data corresponding to each of the node subsequences.
2. The point cloud data compression method according to claim 1, characterized in that, after compressing the octree sequence based on the merged probability distribution to obtain point cloud compressed data, the method further includes: losslessly compressing the image to obtain image compressed data; losslessly decompressing the image compressed data to obtain the decompressed image; re-inputting the decompressed image into the second neural network model to re-obtain the second probability distribution output by the second neural network model; inputting the preset initial value of the sequence context into the first neural network model to obtain the first initial probability output by the first neural network model; decompressing the point cloud compressed data based on the first initial probability, the re-obtained second probability distribution, and the initial value of the sequence context, to obtain the decompressed octree sequence; and converting the octree sequence into the corresponding point cloud data.
3. The point cloud data compression method according to claim 1, characterized in that the step of obtaining the octree sequence corresponding to the point cloud data of the target object includes: obtaining the point cloud data of the target object; dividing the point cloud data into at least one voxel based on a preset voxel size; and converting the point cloud data into the corresponding octree sequence based on the voxels according to breadth-first order, wherein the octree nodes in the octree sequence correspond one-to-one with the voxels.
4. The point cloud data compression method according to claim 3, characterized in that the step of inputting the octree sequence into a preset first neural network model and obtaining the first probability distribution output by the first neural network model includes: sequentially obtaining at least one node subsequence in the octree sequence, wherein the node subsequence includes a consecutive preset number of octree nodes, and the octree nodes in any two node subsequences do not overlap; sequentially inputting the previous node subsequence into the first neural network model to obtain the first sub-probability distribution of the next node subsequence, wherein the first sub-probability distribution of the first node subsequence is obtained by inputting a preset initial value of the sequence context into the first neural network model; and obtaining the first probability distribution based on the first sub-probability distribution of each node subsequence.
5. The point cloud data compression method according to claim 2, characterized in that the step of decompressing the point cloud compressed data based on the first initial probability, the re-obtained second probability distribution, and the initial value of the sequence context to obtain the decompressed octree sequence includes: obtaining, based on the first initial probability and the re-obtained second probability distribution, the merged sub-probability distribution corresponding to the first sub-point cloud compressed data in the point cloud compressed data; performing entropy decoding on the first sub-point cloud compressed data in the point cloud compressed data based on the merged sub-probability distribution of the first node subsequence, to obtain the first node subsequence in the octree sequence, wherein the node subsequence includes a consecutive preset number of octree nodes, the octree nodes in any two node subsequences do not overlap, and the merged probability distribution includes the merged sub-probability distribution of each node subsequence; sequentially repeating the steps of inputting the previous node subsequence into the first neural network model and obtaining the first sub-probability distribution of the next node subsequence, until the entropy decoding of each sub-point cloud compressed data in the point cloud compressed data is completed; and obtaining the decompressed octree sequence based on each of the node subsequences.
6. The point cloud data compression method according to claim 3, characterized in that the step of obtaining the point cloud data of the target object includes: acquiring the point cloud data obtained by the lidar from the target object; and the step of acquiring the image of the target object includes: acquiring the image of the target object obtained by the camera; wherein the spatial coordinate systems of the lidar and the camera are consistent.
7. A point cloud data compression device, characterized in that it comprises: an acquisition module, used to acquire the octree sequence corresponding to the point cloud data of the target object, and to acquire the image of the target object; a sequence processing module, used to input the octree sequence into a preset first neural network model and obtain the first probability distribution output by the first neural network model; an image processing module, used to input the image into a preset second neural network model and obtain the second probability distribution output by the second neural network model; a probability processing module, used to obtain a merged probability distribution based on the first probability distribution and the second probability distribution; and a compression module, used to compress the octree sequence based on the merged probability distribution to obtain point cloud compressed data; wherein the merged probability distribution includes the merged sub-probability distribution of each node subsequence, and the compression module is further configured to, for each node subsequence in sequence, perform entropy encoding on each octree node in the node subsequence based on the merged sub-probability distribution and the node subsequence, to obtain the sub-point cloud compressed data corresponding to the node subsequence, and to obtain the point cloud compressed data based on the sub-point cloud compressed data corresponding to each of the node subsequences.
8. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the point cloud data compression method as described in any one of claims 1 to 6.
9. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the point cloud data compression method as described in any one of claims 1 to 6.