36 results about how to "Reduce the number of multiplications" patented technology

Convolution acceleration method, convolution calculation processing method, devices, electronic apparatus and storage medium

Active | CN108229645A | Take full advantage of reconfigurable | Take full advantage of parallel computing | Program control | Neural architectures | Parallel computing | Data store
The embodiments of the invention disclose a convolution acceleration method, a convolution calculation processing method, corresponding devices, an electronic apparatus and a storage medium. The acceleration method includes the following steps: to-be-processed data of a preset size in a to-be-processed task are sequentially read from the off-chip memory of an accelerator through a FIFO (first-in, first-out) port and stored in the input cache regions of a first cache region in the on-chip memory of the accelerator; once the to-be-processed data have been stored in the input cache regions, to-be-processed input window data are sequentially read from those regions and convolved with convolution kernel data, and the resulting output window data are stored in a third cache region; and the output window data in the third cache region are sequentially stored to the off-chip memory through the FIFO port. The embodiments avoid the problems of insufficient on-chip memory and bandwidth in the accelerator and of limited processor resources, and improve the efficiency of convolution calculation processing. The methods and devices can be applied to hardware platforms such as FPGAs and ASICs.
Owner:BEIJING SENSETIME TECH DEV CO LTD
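
The data flow in this abstract (stream input in, convolve window by window, stream output out) can be loosely illustrated in software. The sketch below is an assumption-laden simplification, not the patented design: the function name `stream_conv2d` is hypothetical, a `deque` stands in for the on-chip input cache, a plain list stands in for off-chip memory, and the "convolution" is the unflipped sliding-window form common in neural networks.

```python
from collections import deque

def stream_conv2d(rows, kernel):
    """Valid 2D sliding-window convolution over rows that arrive one at a time.

    Only the last k rows are held in the buffer, mimicking a small on-chip cache.
    """
    k = len(kernel)
    line_buf = deque(maxlen=k)  # hypothetical stand-in for the input cache region
    out = []                    # hypothetical stand-in for off-chip memory
    for row in rows:            # rows are consumed in FIFO order
        line_buf.append(row)
        if len(line_buf) < k:
            continue            # not enough rows buffered for a full window yet
        out_row = []
        for c in range(len(row) - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += line_buf[i][c + j] * kernel[i][j]
            out_row.append(acc)
        out.append(out_row)     # one finished output row is written back
    return out

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
result = stream_conv2d(image, kernel)
```

The point of the buffer is that memory use depends on the kernel height and row width, not on the full image size.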

Data processing method and device

The invention provides a data processing method and device. The data processing method includes: determining the non-zero elements in the vector to be calculated; obtaining a data processing instruction that carries a first base address and a column number, the first base address being the first address at which the sparse matrix to be multiplied by the vector is stored in memory, and the column number being the position of a target column in the sparse matrix; decoding the data processing instruction and, according to it, multiplying the non-zero element in the vector to be calculated by the target elements in the target column, wherein the value of each target element of each column in the sparse matrix and its position within that column are stored in the memory, and the target elements comprise the non-zero elements in the column and any zero elements meeting a preset condition; and constructing the product of the vector to be calculated and the sparse matrix from the multiplication results of the individual non-zero elements in the vector. This reduces the number of multiplications involving zero elements of the sparse matrix and improves the utilization of operation resources and memory resources.
Owner:LOONGSON TECH CORP
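
The column-wise idea in this abstract can be sketched with a standard compressed sparse column (CSC) layout. This is a minimal illustration under assumptions, not the patented instruction format: the function name `spmv_skip_zero` and the arrays `col_ptr`, `row_idx` and `values` are hypothetical, and the sketch stores only non-zero matrix entries (it ignores the "zero elements meeting a preset condition" mentioned in the abstract).

```python
def spmv_skip_zero(n_rows, col_ptr, row_idx, values, x):
    """Compute y = A @ x for a CSC matrix, visiting only columns where x[j] != 0.

    Returns the result vector and the number of multiplications performed.
    """
    y = [0.0] * n_rows
    mults = 0
    for j, xj in enumerate(x):
        if xj == 0:
            continue  # a zero vector element contributes nothing: skip the column
        # Entries of column j live in values[col_ptr[j]:col_ptr[j + 1]].
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[k]] += values[k] * xj
            mults += 1
    return y, mults

# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
col_ptr = [0, 2, 3, 5]
row_idx = [0, 2, 1, 0, 2]
values = [1.0, 4.0, 3.0, 2.0, 5.0]
x = [2.0, 0.0, 1.0]
y, mults = spmv_skip_zero(3, col_ptr, row_idx, values, x)
```

For this 3x3 example a dense product would take 9 multiplications; the sketch performs 4, because it touches neither the zeros of the matrix nor the column paired with the zero in `x`.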

Self-adaptive Cartesian grid generation method for three-dimensional streaming problem of any shape

Active | CN113505443A | Efficient and robust generation | Real-time display of feature structure | Geometric CAD | Design optimisation/simulation | Computational science | Algorithm
The invention discloses a self-adaptive Cartesian grid generation method for three-dimensional flow around bodies of any shape. The method comprises: generating an isotropic self-adaptive Cartesian grid suitable for an immersed boundary method based on the geometric information of the problem, calculating the flow field, and refining the areas that contain key flow features according to the flow field result. To handle computational fluid dynamics simulations with complex three-dimensional flow around bodies, a surface set composed of triangles is taken as input; grid cells are classified using a grid intersection test based on the separating axis theorem and an inside/outside test based on an improved ray algorithm; and grid cells are refined and coarsened with a cell-based subdivision method. A self-adaptive Cartesian grid meeting the requirements of the immersed boundary method and of the flow field resolution can thus be generated efficiently and robustly; regions containing feature structures are selectively refined according to the flow field parameters obtained subsequently, and the flow field feature structures in the current flow field area are displayed in real time.
Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

RUMSWF based low-complexity reduced rank balancing method in MIMO system

The invention provides a RUMSWF-based low-complexity reduced-rank equalization method for MIMO systems. It improves on the MSWF-based adaptive reduced-rank linear equalization method and is a reduced-rank adaptive MIMO linear equalization method that implements a multistage Wiener filter with a rectangular blocking matrix and a correlation-subtraction structure. The blocking matrix of the unitary multistage Wiener filter is modified: a rectangular sub-block of the square blocking matrix is selected as the blocking matrix, so the dimension of the received signal is reduced stage by stage in the forward recursive decomposition of the unitary multistage Wiener filter, which lowers the per-iteration complexity of the adaptive equalization and at the same time increases the convergence rate. Theoretical analysis and simulation results show that the method has low complexity and fast convergence. In a V-BLAST system with 4 transmitting antennas and 8 receiving antennas using BPSK modulation, the method needs only half the complexity of the multistage Wiener filter based equalization method, and its bit error performance is only 0.78 dB worse.
Owner:XI AN JIAOTONG UNIV

A fast ray tracing method and system

The invention discloses a rapid ray tracing method and system. The method comprises two stages of rejection testing followed by one stage of line-plane intersection calculation. To find the surface element intersected by a ray, the first-stage rejection test narrows the search to a preliminary range: a sphere Q is formed with any vertex Q0 of a triangular surface element K as its center and the maximum side length Lm of the element as its radius, and if the ray lies outside the sphere Q, it cannot intersect that triangular surface element. The second-stage rejection test narrows the range further, based on the fact that a surface element intersected by the ray must also intersect two non-parallel planes that both pass through the ray. Finally, the line-plane intersection calculation is performed: the line-plane intersection in three-dimensional space is converted by parallel projection into a point-plane calculation in two-dimensional space, which skips solving a three-dimensional matrix inverse. The method accelerates ray tracing without affecting precision.
Owner:HUAZHONG UNIV OF SCI & TECH
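
The first-stage rejection test described in this abstract can be sketched as a ray-versus-sphere distance check. This is a minimal sketch under assumptions: the function names are hypothetical, the ray is treated as a half-line starting at `origin`, and squared distances are compared so that no square root is needed. The second stage and the projected intersection step are not shown.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _dist2(a, b):
    d = _sub(a, b)
    return _dot(d, d)

def may_hit_triangle(origin, direction, tri):
    """Return False when the ray certainly misses the triangle.

    The triangle lies inside the sphere centered at vertex q0 with radius
    equal to its longest side, so a ray that misses the sphere misses it too.
    `direction` must be non-zero; it does not have to be normalized.
    """
    q0, q1, q2 = tri
    radius2 = max(_dist2(q0, q1), _dist2(q1, q2), _dist2(q2, q0))
    # Parameter of the point on the ray closest to q0, clamped to the ray start.
    t = max(_dot(_sub(q0, origin), direction) / _dot(direction, direction), 0.0)
    closest = (origin[0] + t * direction[0],
               origin[1] + t * direction[1],
               origin[2] + t * direction[2])
    return _dist2(closest, q0) <= radius2

tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
near = may_hit_triangle((0.2, 0.2, 1.0), (0.0, 0.0, -1.0), tri)  # candidate kept
far = may_hit_triangle((5.0, 5.0, 1.0), (0.0, 0.0, -1.0), tri)   # rejected early
```

A `True` result only means the triangle survives this filter; an exact intersection test is still required afterwards.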

An Adaptive Cartesian Mesh Generation Method for 3D Flow Around Arbitrary Shapes

Active | CN113505443B | Efficient and robust generation | Real-time display of feature structure | Geometric CAD | Design optimisation/simulation | Algorithm | Image resolution
The invention discloses a method for generating an adaptive Cartesian grid for three-dimensional flow around arbitrary shapes. Based on the geometric information of the three-dimensional flow problem, an isotropic adaptive Cartesian grid suitable for the immersed boundary method is generated, the flow field is calculated, and the areas containing key flow characteristics are refined based on the flow field results. For computational fluid dynamics simulations with complex three-dimensional flow around bodies, the invention takes a surface set composed of triangles as input, classifies grid cells using a grid intersection test based on the separating axis theorem and an inside/outside test based on an improved ray algorithm, and refines and coarsens the grid cells with a cell-based subdivision method. It can efficiently and robustly generate an adaptive Cartesian grid that meets the requirements of the immersed boundary method and the resolution of the flow field calculation, selectively refines the areas containing characteristic structures according to the flow field parameters obtained subsequently, and displays the flow field characteristic structures in the current flow field area in real time.
Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Dynamic electromagnetic spectrum posture method and system based on tensor and neural network

The invention discloses a tensor and neural network-based dynamic electromagnetic spectrum posture method and system. The method comprises: obtaining spectrum data for the area around the current flight path of an unmanned aerial vehicle; performing analog-to-digital conversion on the spectrum information, applying a fast Fourier transform, reading the result, and checking whether the spectrum data for one display is complete; constructing an original matrix and visualizing it; performing data completion on the visualized original matrix to obtain an optimized final completion matrix; and inputting the optimized final completion matrix into a BP neural network for fuzzification processing to obtain the final completion matrix values, which are used to draw a completed two-dimensional spectrum posture graph and a three-dimensional posture spatial resolution graph, and to display the spectrum posture of the dynamic signal as it changes with time and frequency. By monitoring radio signals and the electromagnetic spectrum posture, the invention eliminates electromagnetic interference and improves the spectrum utilization rate.
Owner:NORTHWESTERN POLYTECHNICAL UNIV

A Dynamically Reconfigurable Convolutional Neural Network Accelerator Architecture for the Internet of Things

The present invention is a dynamically reconfigurable convolutional neural network accelerator architecture for the Internet of Things field, including a cache architecture, a processing unit array, a calculation module and a controller. The cache architecture stores data from external storage or data generated during calculation, organizes and arranges them, and transmits them to the processing unit array in a data structure suitable for calculation. The processing unit array receives data from the cache architecture, performs the convolution operation, and stores the results back in the cache architecture. The calculation module receives data from the processing unit array, selectively performs one of three operations (pooling, normalization, or an activation function), and stores the output data in the cache architecture. The controller sends commands to the cache architecture, the processing unit array and the calculation module, and provides an external interface for communicating with outside systems. By designing a processing unit array with high parallelism and high utilization and a cache architecture that increases the data reuse rate, the invention improves the performance of the convolutional neural network accelerator and reduces power consumption.
Owner:XI AN JIAOTONG UNIV

Fast timing synchronizing method of full-duplex communication system

The invention relates to a fast timing synchronization method for a full-duplex communication system, which belongs to the technical field of timing synchronization in wireless communication. In the method, the transmitter's data frame uses two identical PN code sequences as pilot signals. The receiver delays the received signal by one PN code period so that the second PN code of the incoming signal is aligned with the first PN code of the delayed signal, multiplies the two sample by sample, and counts how many products within one PN code period exceed a set threshold. If this count is greater than a critical count, the delayed received signal is multiplied sample by sample with the local PN code sequence, and the number of products exceeding the set threshold within one PN code period is counted again. If the count still exceeds the critical count, the head of the delay unit is taken as the timing synchronization point. The method exploits the autocorrelation property of the PN code and the randomness of the noise, so the number of multiplications and the overall amount of computation are reduced; because threshold comparison is used, no accumulation operation is needed, and the timing synchronization point can be found quickly and accurately.
Owner:SHANDONG UNIV
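
The two-stage count-and-threshold procedure in this abstract can be sketched as follows. This is a simplified illustration under assumptions, not the patented implementation: the names `find_sync`, `threshold` and `min_count` are hypothetical, samples are real-valued with the PN code mapped to +1/-1, and the returned index is taken to correspond to the head of the delay unit.

```python
def count_above(a, b, threshold):
    """Count sample-wise products of a and b that exceed the threshold."""
    return sum(1 for x, y in zip(a, b) if x * y > threshold)

def find_sync(rx, pn, threshold, min_count):
    """Return the first index where both stages pass, or None if none does."""
    n = len(pn)
    for start in range(len(rx) - 2 * n + 1):
        delayed = rx[start:start + n]           # first PN period (delayed branch)
        incoming = rx[start + n:start + 2 * n]  # second PN period (direct branch)
        # Stage 1: the two repeated pilots should agree sample by sample.
        if count_above(delayed, incoming, threshold) <= min_count:
            continue
        # Stage 2: runs only for stage-1 candidates, against the local PN code.
        if count_above(delayed, pn, threshold) > min_count:
            return start
    return None

pn = [1, -1, 1, 1, -1, -1, 1, -1]
rx = [0.0] * 5 + pn + pn + [1, 1, -1]  # idle samples, two pilots, some payload
sync = find_sync(rx, pn, threshold=0.5, min_count=6)
```

In this noise-free example the offset one sample before the true start already passes stage 1 (7 of 8 products exceed the threshold), and it is the comparison with the local PN code in stage 2 that rejects it. Stage 2 multiplications are spent only on offsets that survive stage 1.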

Data processing method and equipment

The invention provides a data processing method and device. The data processing method includes: determining the non-zero elements in the vector to be calculated; obtaining a data processing instruction that carries a first base address and a column number, the first base address being the first address at which the sparse matrix to be multiplied by the vector is stored in memory, and the column number being the position of a target column in the sparse matrix; decoding the data processing instruction and, according to it, multiplying the non-zero element in the vector to be calculated by the target elements in the target column, wherein the value of each target element of each column in the sparse matrix and its position within that column are stored in the memory, and the target elements comprise the non-zero elements in the column and any zero elements meeting a preset condition; and constructing the product of the vector to be calculated and the sparse matrix from the multiplication results of the individual non-zero elements in the vector. This reduces the number of multiplications involving zero elements of the sparse matrix and improves the utilization of operation resources and memory resources.
Owner:LOONGSON TECH CORP