Method for in-memory accelerated calculation of the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array

By storing point cloud data in a 2T0C DRAM array and performing part of the calculation inside it, the method addresses the frequent external-memory accesses of point cloud accelerators, improving point cloud processing speed and energy efficiency and providing hardware acceleration for point cloud neural network algorithms.

CN120656045B · Active · Publication Date: 2026-06-23 · PEKING UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: PEKING UNIV
Filing Date: 2025-06-03
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing point cloud accelerators frequently access raw point cloud data in external memory when performing downsampling operations, resulting in low cache hit rate and high data access bandwidth consumption. Furthermore, existing in-memory computing technologies require converting data into an analog domain for computation, introducing additional overhead.

Method used

Point cloud data is stored in a 2T0C dynamic random access memory array, and the squared Euclidean distance calculation of the point cloud is accelerated in memory through the 2T0C DRAM cells. Each cell's write transistor and read transistor form an information storage node, and the sign bits, exponent bits, and mantissa bits of the point cloud data are stored separately. Part of the calculation is performed in the array, including 1-bit multiplication and shift-add operations.

Benefits of technology

It significantly improves point cloud processing speed and energy efficiency, reduces hardware overhead, provides a general and efficient hardware acceleration infrastructure, and supports a variety of point cloud neural network algorithms.



Abstract

The application discloses a method for in-memory accelerated calculation of the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array, and belongs to the fields of novel in-memory computing technology and three-dimensional point cloud recognition. The application forms a point cloud storage array from a 2T0C DRAM array. While the input point cloud data is stored in the 2T0C DRAM array, the array is used to obtain the dot product of vector M and vector N, the dot product of vector M with itself, and the dot product of vector N with itself in three-dimensional space; these results are summed to obtain the squared Euclidean distance between any two points in three-dimensional space. Compared with a CMOS accelerator under a traditional von Neumann architecture, the application greatly reduces hardware overhead and data transfer, can significantly improve processing speed and energy efficiency, and can provide a general and efficient hardware acceleration foundation for various point cloud neural network algorithms.

Description

Technical Field

[0001] This invention belongs to the fields of in-memory computing technology and 3D point cloud recognition, specifically involving a method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array.

Background Technology

[0002] 3D point cloud recognition technology, as a crucial means of 3D environmental perception, plays a vital role in numerous fields such as autonomous driving, robot navigation, virtual reality, augmented reality, industrial inspection, and smart cities, possessing irreplaceable application value. Compared to traditional 2D images, 3D point cloud data directly acquires spatial coordinate information through LiDAR or depth cameras, accurately representing the geometric structure and spatial relationships of targets. Its robustness in recognition, especially under complex lighting conditions and dynamic scenes, is significantly superior to visual solutions. Currently, various neural networks have been developed for efficient 3D point cloud recognition. Among them, point-based point cloud neural networks (such as PointNet and PointNet++) have become the mainstream in current research due to their direct and efficient processing of disordered point clouds and excellent network performance, omitting additional processing steps such as voxelization.

[0003] Existing point cloud accelerators frequently access raw point cloud data in external memory when performing downsampling operations. Since point cloud data is essentially a sparse, unstructured set of three-dimensional coordinates, it lacks good spatial or cache locality, resulting in low cache hit rates and high data access bandwidth consumption.

[0004] Currently, schemes have been proposed to accelerate matrix-vector multiplication using analog in-memory computing techniques (such as in-memory computing based on resistive random access memory) to address the high computational overhead of feature extraction. However, this requires converting the original point cloud from its original floating-point data type to the analog domain for computation, introducing additional data conversion overhead. Therefore, optimizing the design of efficient operators for key point cloud recognition steps such as downsampling and feature computation using in-memory computing techniques, and achieving data structure type matching, is of significant research importance.

Summary of the Invention

[0005] This invention provides a method that uses a 2T0C dynamic random access memory (DRAM) array to accelerate, in memory, the calculation of the squared Euclidean distance of point clouds. It can significantly improve processing speed and energy efficiency, and provides a general and efficient hardware acceleration foundation for various point cloud neural network algorithms.

[0006] To achieve the above objectives, the technical solution provided by the present invention is as follows:

[0007] A method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array, comprising the following steps:

[0008] 1) The 2T0C DRAM array is used to store point cloud data. Each 2T0C DRAM cell consists of a write transistor and a read transistor. The drain of the write transistor is connected to the gate of the read transistor to form an information storage node SN. Each node stores 1 bit of information.

[0009] 2) Store the sign bit, exponent bits, and mantissa bits of the point cloud data separately. The mantissa bits of the same point cloud data are stored across multiple columns of the 2T0C DRAM array, and all mantissa bit information constitutes the mantissa bit storage array. The sign bit and exponent bits of the same point cloud data are stored together in the same column of 2T0C DRAM cells in the 2T0C DRAM array.

[0010] 3) Apply a read signal to the column driver to read the mantissa bits of the coordinates of point M / N in a certain direction, and input them bit by bit into the RBL of the column where the coordinates of point N / M in that direction are located. The dot product result of the mantissa bit multiplication is obtained through the sense amplifier and shift adder circuit. Combined with the original sign bit and exponent bit in the floating-point numbers of point M and point N, the exponent shift alignment and sign adjustment are performed to obtain the intermediate product result.

[0011] 4) After all the mantissa bits of points M and N have been input over multiple time cycles, obtain all intermediate product results of vectors M and N in three-dimensional space, and sum them to obtain the squared Euclidean distance between points M and N in three-dimensional space.

[0012] Furthermore, in step 2), the mantissa bits of different point cloud data are stored sequentially column by column.

[0013] Furthermore, in step 3), when the information stored in SN is 0, the output of RWL is 0 when a 1-read voltage or a 0-read voltage is input through RBL; when the information stored in SN is 1, the output of RBL is 1 when a 1-read voltage is input through RBL, and the output of RBL is 0 when a 0-read voltage is input through RBL, thus realizing a 1-bit data multiplication operation.

[0014] Furthermore, the intermediate product result described in step 3) is temporarily stored in the register circuit after being output by the shift adder, waiting for the intermediate product result of other bits.

[0015] Furthermore, in step 4), a cache is set up. After obtaining the squared Euclidean distance between point M and point N, the result is output to the cache for storage, while waiting for the Euclidean distance result between point M and the other points.

[0016] Furthermore, a squared Euclidean distance comparator is added to compare the squared Euclidean distance results for point M, enabling farthest point sampling and nearest-neighbor sampling.

[0017] Compared with the prior art, the present invention has the following advantages:

[0018] This invention is based on a point cloud storage array constructed from a 2T0C DRAM array. While storing the input point cloud through the 2T0C DRAM array, partial in-memory Euclidean distance calculation can be performed in the storage array. Compared with the traditional CMOS accelerator under the von Neumann architecture, it greatly reduces hardware overhead and data transfer, and can significantly improve processing speed and energy efficiency. It can provide a general and efficient hardware acceleration basic configuration for various point cloud neural network algorithms, and is an effective in-memory acceleration solution for point cloud recognition.

Attached Figure Description

[0019] Figure 1 is a schematic diagram illustrating the implementation of a 1-bit multiplication operation in a 2T0C DRAM cell according to a specific embodiment of the present invention.

[0020] Figure 2 is a schematic diagram of the 2T0C DRAM array structure and related peripheral circuits in a specific embodiment of the present invention.

Detailed Implementation

[0021] The present invention will be further illustrated below with reference to the accompanying drawings and embodiments, but the scope of the invention is not limited in any way.

[0022] This invention provides a method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array, the specific steps of which include the following:

[0023] 1) A 2T0C DRAM array is used to store the input point cloud. As shown in Figure 1, each 2T0C DRAM cell consists of a write transistor and a read transistor. The drain of the write transistor is connected to the gate of the read transistor to form an information storage node SN, and each node stores 1 bit of information. When writing information, the write bit line (WBL) connected to the gate of the write transistor is turned on, and a 0 / 1 voltage is applied from the write word line (WWL) connected to the source of the write transistor. This changes the information at SN and completes the writing process of each bit of data.

[0024] This invention's 2T0C DRAM array performs a 1-bit multiplication operation simultaneously with a read operation. When the information stored in the SN is 0, the output of the RWL is 0 whether a 1-bit read voltage or a 0-bit read voltage is input through the RBL. When the information stored in the SN is 1, the RBL output is 1 when a 1-bit read voltage is input through the RBL, and 0 when a 0-bit read voltage is input through the RBL, thus achieving a 1-bit data multiplication operation.
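The read behaviour described in this paragraph is logically a 1-bit AND of the stored bit and the input bit. A minimal Python sketch of that truth table follows; the function name `cell_read_multiply` is hypothetical, and the sketch models only the logic, not voltages, currents, or sensing.

```python
def cell_read_multiply(stored_bit: int, input_bit: int) -> int:
    """Logical model of one cell read: the output is 1 only when SN holds 1
    and a 1-read voltage is applied; every other combination yields 0."""
    if stored_bit not in (0, 1) or input_bit not in (0, 1):
        raise ValueError("bits must be 0 or 1")
    return stored_bit & input_bit


# Enumerate all four (stored, input) combinations.
truth_table = {
    (stored, applied): cell_read_multiply(stored, applied)
    for stored in (0, 1)
    for applied in (0, 1)
}

if __name__ == "__main__":
    for (stored, applied), out in truth_table.items():
        print(f"SN={stored} input={applied} -> output={out}")
```

For single bits, AND and multiplication coincide, which is why the read can double as a 1-bit multiply.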

[0025] 2) As shown in Figure 2, the sign bit, exponent bits, and mantissa bits of the point cloud data are stored separately. The mantissa bits of the same point cloud data are stored across multiple columns of the 2T0C DRAM array, and the mantissa bits of different point cloud data are stored sequentially column by column. All mantissa bit information constitutes the mantissa bit storage array. The sign bit and exponent bits of the same point cloud data are stored together in the same column of the 2T0C DRAM array, which constitutes the storage array of the remaining information.
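To make the field separation concrete, the following Python sketch splits a half-precision value into sign, exponent, and mantissa fields and lists the mantissa bits one per cell. It assumes the standard IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 stored fraction bits), which the text does not spell out; the helper names `split_fp16` and `mantissa_bits` are hypothetical, and the mapping of bits to physical columns is not modeled.

```python
import struct


def split_fp16(value: float) -> tuple[int, int, int]:
    """Return the (sign, exponent, mantissa) fields of value encoded as binary16."""
    (raw,) = struct.unpack("<H", struct.pack("<e", value))
    sign = raw >> 15               # 1 bit
    exponent = (raw >> 10) & 0x1F  # 5 bits, biased by 15
    mantissa = raw & 0x3FF         # 10 stored fraction bits
    return sign, exponent, mantissa


def mantissa_bits(mantissa: int, width: int = 10) -> list[int]:
    """List the mantissa bits most significant first, one bit per storage cell."""
    return [(mantissa >> (width - 1 - i)) & 1 for i in range(width)]


if __name__ == "__main__":
    sign, exponent, mantissa = split_fp16(1.5)
    print(sign, exponent, mantissa_bits(mantissa))
```

In this layout the sign and exponent fields would go to the shared column and the mantissa bits to the mantissa storage array.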

[0026] 3) When the 2T0C DRAM array performs the squared Euclidean distance calculation, it first expands the distance between point M (vector) and point N (vector) in three-dimensional space using the squared Euclidean distance formula:

[0027] $$d^2(M, N) = \lVert M - N \rVert^2 = M \cdot M - 2\, M \cdot N + N \cdot N$$

[0028] For any of the X, Y, and Z directions, the distance between two points can be expressed by the following formula:

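[0029] $$(x_M - x_N)^2 = x_M^2 - 2\, x_M x_N + x_N^2$$ where $x_M$ and $x_N$ denote the coordinates of points M and N in the chosen direction.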

[0030] Read the mantissa bits of the coordinate of point M in a certain direction (e.g., the X direction), and input them bit by bit into the RBL of the column containing the coordinate of point N in that direction (e.g., the X direction). When each bit is input, the information of point N simultaneously completes the above 1-bit multiplication operation, resulting in multiple intermediate product results. The specific operation is as follows:

[0031] 3-1) First, a read signal is applied to the column driver to read the mantissa of the coordinates of point M. After passing through the sense amplifier, the output signal is input back to the column driver, and then input bit by bit to the RBL of the column containing the coordinates of point N. During each input, the information of each bit at point N simultaneously completes the aforementioned 1-bit multiplication operation. k represents the number of mantissa bits, which is determined by different floating-point formats. For example, in the half-precision floating-point FP16 format, k is 9.

[0032] m0·n0, m0·n1, m0·n2, …, m0·nk

[0033] 3-2) The result of multiplying one mantissa bit of point M by the mantissa of point N is obtained through the sense amplifier and shift adder circuit:

[0034]

[0035] The multiplication result is output through a shift adder and temporarily stored in a register circuit, awaiting the calculation results of other bits. By completing the input of all mantissa bits of point M over multiple time cycles, the multiplication results of each mantissa bit of point M with the mantissa bits of point N are obtained.

[0036]

[0037] 3-3) After the calculation is completed, all the temporarily stored results are read from the register and passed to the shift adder again, where they are shifted and added according to the bit position of each mantissa bit of point M. This step follows the same shift principle as the shift-add in the previous step, so the shift adder circuit can be reused. The result is the product of the mantissa of point M and the mantissa of point N, i.e.:

[0038]

[0039] The dot product of vectors M and N in the X direction is input into a register for temporary storage. Similarly, the products of vector M with itself and of vector N with itself in the X direction are obtained.
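Steps 3-1) to 3-3) amount to a bit-serial multiplication built from 1-bit products and two levels of shift-add. The Python sketch below illustrates that arithmetic on unsigned integers. The names `to_bits` and `bit_serial_multiply` are hypothetical, bits are taken most significant first as an assumption, and the register and sense-amplifier stages are collapsed into plain variables.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Unsigned integer to a list of bits, most significant first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def bit_serial_multiply(m_bits: list[int], n_bits: list[int]) -> int:
    """Multiply two unsigned values given as MSB-first bit lists,
    using only 1-bit ANDs and shift-add accumulation."""
    total = 0
    for m_bit in m_bits:  # one time cycle per input bit of M
        partial = 0
        for n_bit in n_bits:
            # 1-bit product of the input bit with each stored bit of N,
            # then shift-add to assemble m_bit * N.
            partial = (partial << 1) | (m_bit & n_bit)
        # Second shift-add, weighting each partial by M's bit position.
        total = (total << 1) + partial
    return total


if __name__ == "__main__":
    print(bit_serial_multiply(to_bits(5, 3), to_bits(3, 3)))
```

Both loops use the same shift-then-add pattern, which is consistent with the remark that the shift adder circuit can be reused.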

[0040] 3-4) Input the above dot product result into the exponent shifter, and simultaneously read the floating-point exponent bits of points M and N from the sign bit and exponent bit 2T0C DRAM array, aligning the floating-point numbers according to the exponent bit information; then input it into the signed adder / subtractor circuit, and process it together with the sign bit information of points M and N, along with the sign bit information in the square Euclidean distance calculation, to obtain the intermediate product result of vector M and vector N, and return it to the register for temporary storage:

[0041]
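The sign and exponent handling in step 3-4) can be sketched for a single product as follows. This Python example assumes IEEE 754 binary16 with normal (non-zero, non-subnormal, finite) inputs; the names `fp16_fields` and `fp16_product` are hypothetical. It models only how the separately stored fields recombine into one product, not the exponent alignment needed before different products are added.

```python
import struct


def fp16_fields(value: float) -> tuple[int, int, int]:
    """(sign, biased exponent, fraction) of value encoded as binary16."""
    (raw,) = struct.unpack("<H", struct.pack("<e", value))
    return raw >> 15, (raw >> 10) & 0x1F, raw & 0x3FF


def fp16_product(m: float, n: float) -> float:
    """Product of two normal binary16 values rebuilt from their stored fields."""
    s_m, e_m, f_m = fp16_fields(m)
    s_n, e_n, f_n = fp16_fields(n)
    if e_m in (0, 31) or e_n in (0, 31):
        raise ValueError("sketch handles normal numbers only")
    # Integer mantissa product, including the implicit leading 1 of each operand.
    mantissa_product = (1024 + f_m) * (1024 + f_n)
    # Unbiased exponents add; 2**-20 removes the two 10-bit fraction scalings.
    exponent = (e_m - 15) + (e_n - 15) - 20
    sign = -1.0 if s_m ^ s_n else 1.0  # signs combine by XOR
    return sign * mantissa_product * 2.0 ** exponent


if __name__ == "__main__":
    print(fp16_product(1.5, -2.0))
```

The integer `mantissa_product` corresponds to the output of the mantissa multiplication, while the sign and exponent come from the separately stored column.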

[0042] 4) Similarly, the dot product of vectors M and N in the Y and Z directions can be obtained. That is, after all the mantissa bits of points M and N have been input over multiple time cycles, the intermediate product results of vectors M and N, of vectors M and M, and of vectors N and N in three-dimensional space are obtained. These are then summed to obtain the squared Euclidean distance between points M and N in three-dimensional space.

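[0043] $$d^2(M, N) = \sum_{c \in \{X, Y, Z\}} \left( c_M^2 - 2\, c_M c_N + c_N^2 \right) = M \cdot M - 2\, M \cdot N + N \cdot N$$ where $c_M$ and $c_N$ denote the coordinates of points M and N in direction $c$.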
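As a numerical check of step 4), the Python sketch below computes the squared distance from the three dot products M·M, M·N, and N·N, with the cross term entering as minus twice M·N. The names `dot` and `squared_distance` are hypothetical, and ordinary Python floats stand in for the hardware datapath.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length coordinate vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def squared_distance(m: Sequence[float], n: Sequence[float]) -> float:
    """Squared Euclidean distance assembled from three dot products."""
    return dot(m, m) - 2.0 * dot(m, n) + dot(n, n)


if __name__ == "__main__":
    point_m = (1.0, 2.0, 3.0)
    point_n = (4.0, 6.0, 3.0)
    print(squared_distance(point_m, point_n))
```

Because M·M depends only on point M, it can be computed once and reused across all distances from M, which fits the buffered, one-point-against-many flow described in the embodiment.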

[0044] In a specific embodiment of the present invention, a buffer is further added. After the squared Euclidean distance between point M and point N is obtained, it can be output to the buffer for storage while waiting for the distance results between point M and the other points. Peripheral circuits such as a squared Euclidean distance comparator can then be used to implement algorithms such as farthest point sampling and nearest-neighbor sampling, completing the downsampling stage of the point cloud recognition task.
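For orientation, the Python sketch below shows how buffered squared distances and a comparator could drive farthest point sampling. It is the standard software form of the algorithm, not the patent's circuit; the names `sq_dist` and `farthest_point_sampling` are hypothetical, and the starting index is an arbitrary assumption.

```python
from typing import Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance; no square root is needed for comparisons."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_sampling(points: Sequence[Point], num_samples: int, start: int = 0) -> list[int]:
    """Return indices of num_samples points chosen by farthest point sampling."""
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be between 1 and len(points)")
    selected = [start]
    # Buffer: squared distance from every point to its nearest selected point.
    min_d2 = [sq_dist(p, points[start]) for p in points]
    while len(selected) < num_samples:
        # Comparator step: pick the point farthest from the selected set.
        nxt = max(range(len(points)), key=min_d2.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d2 = sq_dist(p, points[nxt])
            if d2 < min_d2[i]:
                min_d2[i] = d2
    return selected


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
    print(farthest_point_sampling(cloud, 3))
```

Only comparisons of squared distances are required, so the square root can be skipped entirely.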

[0045] The above embodiments are only some preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any changes made based on the design principles of the present invention, and any non-creative changes made on this basis, should fall within the scope of protection of the present invention.

Claims

1. A method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array, comprising the following steps: 1) The 2T0C DRAM array is used to store point cloud data. Each 2T0C DRAM cell consists of a write transistor and a read transistor. The drain of the write transistor is connected to the gate of the read transistor to form an information storage node SN. Each node stores 1 bit of information. 2) Store the sign bit, exponent bits, and mantissa bits of the point cloud data separately. The mantissa bits of the same point cloud data are stored across multiple columns of the 2T0C DRAM array. All mantissa bit information constitutes the mantissa bit storage array. The sign bit and exponent bits of the same point cloud data are stored together in the same column of 2T0C DRAM cells in the 2T0C DRAM array. 3) Apply a read signal to the column driver to read the mantissa bits of the coordinates of point M / N in a certain direction, and input them bit by bit into the RBL of the column where the coordinates of point N / M in that direction are located. The dot product result of the mantissa bit multiplication is obtained through the sense amplifier and shift adder circuit. Combined with the original sign bit and exponent bits in the floating-point numbers of point M and point N, the exponent shift alignment and sign adjustment are performed to obtain the intermediate product result. 4) After all the mantissa bits of points M and N have been input over multiple time cycles, obtain all intermediate product results of vectors M and N in three-dimensional space, and sum them to obtain the squared Euclidean distance between points M and N in three-dimensional space.

2. The method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array as described in claim 1, characterized in that, in step 2), the mantissa bits of different point cloud data are stored sequentially column by column.

3. The method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array as described in claim 1, characterized in that, in step 3), when the information stored in SN is 0, the output of RWL is 0 when a reading voltage of 1 or 0 is input through RBL; when the information stored in SN is 1, the output of RBL is 1 when a reading voltage of 1 is input through RBL, and the output of RBL is 0 when a reading voltage of 0 is input through RBL, thus obtaining the intermediate product result.

4. The method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array as described in claim 1, characterized in that the intermediate product result described in step 3) is temporarily stored in the register circuit after being output by the shift adder, waiting for the intermediate product results of other bits.

5. The method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array as described in claim 1, characterized in that, in step 4), a cache is provided; after the squared Euclidean distance between point M and point N is obtained, it is output to the cache for storage while waiting for the Euclidean distance results between point M and the other points.

6. The method for calculating the squared Euclidean distance of point clouds using a 2T0C dynamic random access memory array as described in claim 5, characterized in that a squared Euclidean distance comparator is added to compare all squared Euclidean distance results for point M, enabling farthest point sampling and nearest-neighbor sampling.