103 results about "Algorithm acceleration" patented technology

Algorithm acceleration uses code generation technology to produce fast executable code. Accelerated algorithms must comply with MATLAB® Coder™ code generation requirements and rules.

FPGA-based clustering algorithm acceleration system and design method thereof

The invention discloses an FPGA-based clustering algorithm acceleration system and a design method thereof. The method comprises the steps of: obtaining the key code of each algorithm through profiling; refining the key code of each algorithm and extracting shared function logic (common operators); restructuring the code with a blocking (tiling) technique to improve data locality and reduce the frequency of off-chip memory accesses; designing an extended-semantics instruction set, implementing the function-logic units corresponding to the instruction set, and completing the key-code functions through instruction fetch, decode, and execute operations; designing the accelerator framework and generating an IP core; and porting an operating system to a development board so that software and hardware cooperate within the operating system. The system can support various clustering algorithms, improving the flexibility and generality of the hardware accelerator; reconstructing each algorithm's code with the blocking technique reduces the off-chip access frequency and thus the impact of off-chip memory bandwidth on the accelerator's speedup.
Owner:SUZHOU INST FOR ADVANCED STUDY USTC
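
The blocking (tiling) step described in the abstract can be illustrated in software; a minimal sketch, assuming a k-means-style nearest-centroid assignment (the function name `tiled_assign` and the NumPy formulation are illustrative, not from the patent):

```python
import numpy as np

def tiled_assign(points, centroids, tile=256):
    """Assign each point to its nearest centroid, processing points in
    tiles so that each tile can stay resident in fast (on-chip) memory
    and off-chip accesses are amortized over the whole centroid set."""
    labels = np.empty(len(points), dtype=np.int64)
    for start in range(0, len(points), tile):
        block = points[start:start + tile]      # one on-chip tile
        # squared Euclidean distance from every point in the tile
        # to every centroid: shape (n_tile, k)
        d2 = ((block[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels[start:start + tile] = d2.argmin(axis=1)
    return labels
```

The same loop structure maps onto the accelerator: each tile is one burst of off-chip reads, reused for all centroid comparisons before the next tile is fetched.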

Method for accelerating RNA secondary structure prediction based on stochastic context-free grammar

The invention discloses a method for accelerating RNA secondary structure prediction based on stochastic context-free grammar (SCFG), aiming to speed up SCFG-based RNA secondary structure prediction. The method comprises the following steps: first establishing a heterogeneous computer system comprising a host computer and a reconfigurable algorithm accelerator; then transmitting a formatted CM model and an encoded RNA sequence from the host to the reconfigurable accelerator; and executing a non-backtracking CYK/inside algorithm on the accelerator's PE array. The calculation adopts task-division strategies of region-dependent segmentation and layered column-dependent parallel processing to realize fine-grained parallelism: n PEs simultaneously compute n data elements located in different columns of the matrix in SPMD fashion, while different computation orders are used according to the state type. The invention realizes application acceleration of SCFG-based RNA secondary structure prediction with a high speedup ratio at low cost.
Owner:NAT UNIV OF DEFENSE TECH
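
The CYK/inside recurrence that the PE array parallelizes fills a triangular matrix in O(n^3) time. As a greatly simplified stand-in (Nussinov base-pair maximization rather than the full probabilistic SCFG inside algorithm; names and the simplification are illustrative), the fill pattern looks like this:

```python
def nussinov(seq):
    """Maximum base-pair count via a CYK-like O(n^3) dynamic program.
    Simplified stand-in for the SCFG inside algorithm: the same
    triangular matrix fill, computed diagonal by diagonal, that the
    accelerator distributes column-wise across its PEs."""
    pair = {('A', 'U'), ('U', 'A'), ('G', 'C'),
            ('C', 'G'), ('G', 'U'), ('U', 'G')}
    n = len(seq)
    M = [[0] * n for _ in range(n)]
    for span in range(1, n):                 # anti-diagonals of the matrix
        for i in range(n - span):
            j = i + span
            best = M[i + 1][j]               # base i left unpaired
            if (seq[i], seq[j]) in pair:     # i pairs with j
                inner = M[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)
            for k in range(i + 1, j):        # bifurcation into two substructures
                best = max(best, M[i][k] + M[k + 1][j])
            M[i][j] = best
    return M[0][n - 1] if n else 0
```

Cells on the same anti-diagonal are independent, which is what makes the column-parallel SPMD mapping onto n PEs possible.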

A national cryptographic algorithm acceleration processing system based on an FPGA

The invention discloses an FPGA-based acceleration system for national (SM-series) cryptographic algorithms, used to process data packets sent to a server that require national cryptographic processing. The system comprises an FPGA connected to the server through a PCIe core interface. The FPGA transfers packets stored in server memory that require cryptographic processing to its high-capacity DDR cache at high speed via DMA read operations over the PCIe interface; processes the packets with the corresponding user-defined national cryptographic algorithm IP core; writes the processed packets back to the DDR; and transfers the processed packets from the DDR to server-side memory via DMA write operations over the PCIe interface. The acceleration system disclosed by the invention has good reusability and expandability, and strong value for popularization and application.
Owner:北京中科海网科技有限公司
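
The data path can be sketched as a three-stage pipeline; a schematic only, with a placeholder transform standing in for the user-defined cryptographic IP core (all names are illustrative, and real DMA is of course hardware, not Python):

```python
def dma_pipeline(server_memory, ip_core):
    """Schematic of the accelerator's data path: DMA-read packets from
    server memory into the FPGA's DDR cache, run the user-defined
    cryptographic IP core on each packet, then DMA-write the results
    back. `ip_core` is a placeholder for the SM-series algorithm core."""
    ddr = list(server_memory)              # DMA read: host memory -> FPGA DDR
    ddr = [ip_core(pkt) for pkt in ddr]    # IP core processes each packet
    return ddr                             # DMA write: FPGA DDR -> host memory
```

The point of the structure is that the host only ever sees two bulk DMA transfers per batch; all per-packet work stays on the FPGA side.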

Software/hardware co-design method for algorithm acceleration

Inactive · CN101493862A · Changing the status quo of secular stagnation · Improve compatibility · Special data processing applications · Analysis data · System requirements
The invention discloses a software/hardware co-design method for algorithm acceleration, comprising six steps: step 1, static analysis of the algorithm and software; step 2, dynamic measurement of the software with profiling tools to obtain a baseline data chart of software behavior; step 3, overall architecture and function design of a multi-core hardware system, combining system requirements with the algorithm analysis and the measured software data; step 4, describing the whole system with a suitable modeling tool (RML); step 5, constructing a function-process abstraction graph GCG (a function call graph annotated with run-time parameters) from the results of step 2, and using it to decide the distribution of software across the multi-core system; and step 6, implementing a prototype system in software and hardware according to the plan from step 5 and evaluating the results. The method has good compatibility, addresses the urgent demands of multi-core system-on-chip (SoC) design, and promotes the improvement of multi-core design tools. It has high practical value and promising application prospects.
Owner:BEIHANG UNIV
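
Step 5's use of the GCG to distribute functions can be illustrated with a toy greedy partitioner; a sketch under the assumption that profiling (step 2) yields per-function time shares and that each function has an estimated hardware cost (all names and numbers are illustrative, not the patent's method):

```python
def partition(gcg, hw_budget):
    """Toy hardware/software partitioner over a profiled call graph.
    `gcg` maps function name -> (time_share, hw_cost), both assumed to
    come from profiling and estimation. Greedily move the most
    time-consuming functions to hardware until the area budget is spent;
    everything else stays in software."""
    hw, sw, spent = [], [], 0
    for name, (t, cost) in sorted(gcg.items(), key=lambda kv: -kv[1][0]):
        if spent + cost <= hw_budget:
            hw.append(name)        # offload hot function to hardware
            spent += cost
        else:
            sw.append(name)        # keep on the software side
    return hw, sw
```

Real partitioners also weigh communication cost along GCG edges; the greedy time-share ordering is only the simplest starting point.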

Method, device and equipment for optimizing intelligent video analysis performance

The invention relates to a method, a device, and equipment for optimizing intelligent video analysis performance. The method comprises the steps of: (1) for offline video file acceleration, running a reference benchmark test on the video file and setting an optimal number of file slices; slicing the video file and issuing the slicing tasks to the GPU; calling the GPU to decode the slice files and passing the decoding results back to the algorithm directly through video-memory addresses, avoiding the video-memory-to-main-memory copy and its performance loss, wherein the video analysis algorithm takes the decoded video-memory address, calls the GPU for algorithm acceleration, and outputs an analysis result; and (2) for real-time video stream analysis, optimizing and expanding the number of concurrent streams; calling the GPU to decode each real-time video stream, passing the decoding results back to the algorithm directly through video-memory addresses, setting up a double cache at the algorithm end to store the decoded data of multiple streams, sending the decoded data to the algorithm for GPU batch processing, and swapping the roles of the two caches after each batch is completed to minimize system latency.
Owner:武汉众智数字技术有限公司
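
The dual-cache scheme in step (2) can be sketched as follows; a minimal illustration in which plain lists stand in for GPU video-memory buffers (names are illustrative):

```python
def double_buffered_batches(frames, batch):
    """Sketch of the dual-cache scheme: while one buffer is being filled
    with decoded frames, the other (full) buffer is handed to the GPU
    batch-processing step; the two buffers swap roles after each batch,
    so decoding never waits for processing."""
    buffers = [[], []]
    active = 0
    processed = []
    for f in frames:
        buffers[active].append(f)              # decoder fills the active cache
        if len(buffers[active]) == batch:
            processed.append(list(buffers[active]))  # hand full cache to GPU batch
            buffers[active].clear()
            active ^= 1                        # swap the two caches
    return processed
```

In the real system the "hand to GPU" step runs concurrently with decoding into the other buffer; the sequential version above only shows the buffer bookkeeping.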

Industrial camera

Inactive · CN108696727A · Independent processing capacity · Easy to achieve smooth expansion · Television system details · Color television details · Gate array · Graphics processing unit
The invention discloses an industrial camera. The industrial camera comprises an image sensor for acquiring image data; a field-programmable gate array (FPGA) that connects the image sensor, a plurality of 10-gigabit optical modules, an HDMI display interface, and a graphics processing unit (GPU), performs image processing, and executes system and data management; a plurality of 10-gigabit optical module interfaces, connected to the image sensor through the FPGA, for transmitting image data acquired by the image sensor or processed by the FPGA; the GPU, connected to the FPGA, for performing algorithm acceleration on image data transmitted by the FPGA; and the HDMI interface, connected to the FPGA, for displaying image data processed by the FPGA. The 10-gigabit optical interface design enables smooth multi-interface expansion, meets high-bandwidth requirements, and supports long-distance optical-fiber transmission without relays, reducing transmission cost, allowing multiple hosts to connect to the camera, and delivering high-bandwidth image data.
Owner:杭州言曼科技有限公司

Multi-level scene reconstruction and rapid segmentation method, system and device for narrow space

Active · CN112200874A · Improve scene reconstruction accuracy · Improve rebuild speed · Image enhancement · Image analysis · Pattern recognition · Color image
The invention belongs to the field of robot scene reconstruction and particularly relates to a multi-level scene reconstruction and rapid segmentation method, system, and device for narrow spaces, aiming to resolve the trade-off between reconstruction precision and real-time computation in robot scene reconstruction and segmentation within narrow spaces. The method comprises the following steps: acquiring a color image, a depth image, camera calibration data, and the robot's spatial position and attitude; converting the sensor data into a single-frame point cloud through coordinate transformation; dividing the single-frame point cloud into scales and performing ray tracing and probability updating to obtain a scale-fused multi-level scene map; and performing two downsamplings and one upsampling on the scene map, applying lossless scale transformation, and building multiple sub-octree maps from the space segmentation result, thereby realizing multi-level scene reconstruction and rapid segmentation. Dense reconstruction and algorithm acceleration are achieved without losing necessary scene detail, facilitating application in practical engineering settings.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI +2
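
The probability-updating step during ray tracing is commonly done in log-odds form; a minimal sketch of an occupancy update of that kind (the sensor-model constants 0.7 and 0.4 are illustrative assumptions, not values from the patent):

```python
import math

def logodds(p):
    """Convert a probability to log-odds, the additive form in which
    occupancy evidence accumulates."""
    return math.log(p / (1.0 - p))

def update_cell(l, hit, l_hit=logodds(0.7), l_miss=logodds(0.4)):
    """One ray-tracing observation on a map cell: a hit adds positive
    evidence, a ray passing through (miss) adds negative evidence.
    Sensor-model probabilities are illustrative."""
    return l + (l_hit if hit else l_miss)

def prob(l):
    """Recover occupancy probability from the accumulated log-odds."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))
```

Because updates are plain additions, the same cell can be refined at any resolution level of the multi-scale map and fused later by summing.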

Neural network model real-time automatic quantization method and real-time automatic quantization system

The invention discloses a real-time automatic quantization method for neural network models based on an embedded AI accelerator, comprising the following steps: training an embedded AI neural network on a PC, establishing a PC-side deep-learning network, and training the floating-point network model that serves as input to the embedded AI model; quantizing the floating-point network model into a fixed-point network model for the embedded side; preprocessing the data to be quantized and implementing all acceleration operators of each model-network layer in hardware; and deploying the embedded AI hardware and porting the neural network model onto the built AI hardware platform. The invention further discloses a real-time automatic quantization system for neural network models. By realizing algorithm acceleration in embedded AI accelerator hardware, the invention reduces the storage footprint of a neural network model, accelerates its operation, increases the computing power of embedded equipment, reduces operating power consumption, and enables effective deployment of embedded AI technology.
Owner:SENSLAB INC
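
The float-to-fixed-point step can be sketched as symmetric linear quantization; a minimal illustration (the scheme and names are assumptions for illustration — the patent does not specify its quantization formula):

```python
import numpy as np

def quantize(w, bits=8):
    """Symmetric linear quantization of a floating-point weight tensor:
    derive a scale from the maximum magnitude, round to integers, and
    keep the scale for dequantization on the embedded side."""
    qmax = 2 ** (bits - 1) - 1
    peak = float(np.abs(w).max())
    scale = peak / qmax if peak > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate floating-point weights from fixed point."""
    return q.astype(np.float32) * scale
```

The reconstruction error per weight is bounded by the scale (one quantization step), which is what makes 8-bit fixed-point inference viable for many networks.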

Hardware circuit design and method of a data loading device combined with main memory for accelerating deep convolutional neural network computation

Inactive · CN111783933A · Simplify connection complexity · Simplify space complexity · Neural architectures · Physical realisation · Computer hardware · High bandwidth
The invention relates to a hardware circuit design and method for a data loading device coupled with main memory, used to accelerate deep convolutional neural network computation. The device features a specially designed cache structure comprising: an input cache and its control logic, where a macro-block segmentation method is applied to input from main memory and/or other memories to achieve regional data sharing and tensor-data fusion and distribution; a parallel input register array that converts the data segments entering the cache; and a tensor-type data loading unit connected between the output of the input cache and the input of the parallel input register array. The design simplifies the address-decoding circuit, saving area and power consumption without sacrificing data bandwidth. The hardware device and data-processing method include a transformation method, a macro-block segmentation method, and an addressing method for the input data, meeting the requirement of algorithm acceleration with limited hardware resources while reducing address-management complexity.
Owner:北京芯启科技有限公司
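
The macro-block segmentation of an input feature map can be sketched as follows; a minimal illustration assuming a 2-D tensor whose sides are divisible by the block size (names and the 2-D restriction are illustrative):

```python
import numpy as np

def macro_blocks(tensor, bh, bw):
    """Macro-block segmentation of an (H, W) feature map: split it into
    (bh, bw) tiles so each tile can be streamed into the on-chip input
    cache and shared across the parallel input register array. Assumes
    H % bh == 0 and W % bw == 0 for simplicity."""
    H, W = tensor.shape
    return [tensor[i:i + bh, j:j + bw]
            for i in range(0, H, bh)
            for j in range(0, W, bw)]
```

Neighbouring blocks share address arithmetic (only the block origin changes), which is the property that lets the hardware simplify its address-decoding circuit.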

Virtual terrain rendering method for carrying out resource dynamic processing and caching based on GPU

The invention discloses a virtual terrain rendering method that dynamically processes and caches resources on the GPU. The method comprises the following steps: constructing a terrain grid according to a spatial quadtree algorithm; progressively subdividing the terrain and performing view-frustum culling according to the viewpoint position, and creating GPU-side caches for the different terrain resources located by logical coordinates to implement different shader programs; launching different rendering passes for different resources and storing the processed resources in caches allocated on the GPU; and writing a terrain-rendering shader to create drawing instructions for the nodes remaining after terrain culling, then submitting draw calls to the GPU to complete drawing. The method makes full use of GPU computing power, submitting both rendering work and resource processing to the GPU and thereby greatly accelerating resource processing. A complete GPU resource caching and access scheme further accelerates resource access and improves rendering performance, making real-time rendering and flexible editing of very-large-scale virtual terrains possible.
Owner:航天远景科技(南京)有限公司
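
The viewpoint-driven quadtree subdivision can be sketched as follows; a toy CPU-side illustration (the distance-based split criterion, the `threshold` constant, and all names are assumptions for illustration, not the patent's exact rule):

```python
def subdivide(node, viewpoint, max_depth, threshold=2.0):
    """View-dependent quadtree refinement: a node splits while the
    viewer is closer than `threshold` times its size, so terrain near
    the camera gets finer tiles. Nodes are (x, y, size, depth) squares;
    returns the list of leaf tiles to draw."""
    x, y, size, depth = node
    cx, cy = x + size / 2, y + size / 2
    dist = ((viewpoint[0] - cx) ** 2 + (viewpoint[1] - cy) ** 2) ** 0.5
    if depth >= max_depth or dist > threshold * size:
        return [node]                    # leaf: drawn at this level of detail
    half = size / 2
    leaves = []
    for nx, ny in ((x, y), (x + half, y), (x, y + half), (x + half, y + half)):
        leaves += subdivide((nx, ny, half, depth + 1),
                            viewpoint, max_depth, threshold)
    return leaves
```

Each returned leaf corresponds to one cached terrain resource and one draw call submitted to the GPU.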

Spectrum resource self-allocation method

The invention provides a spectrum resource self-allocation method comprising the following steps: constructing a cognitive-function-based game model from a Nash game model and a Stackelberg game model; combining the cognitive-function-based game model with the classical water-filling algorithm to construct a cognitive-function-based water-filling model; and solving the water-filling model through distributed free iteration to realize self-optimizing allocation of spectrum resources. During iteration, a user can effectively improve the system's convergence speed by using an acceleration scheme. The invention optimizes the utilization efficiency of channel resources: the achieved channel rate can reach the theoretical maximum spectrum utilization, and good system performance is obtained in different scenarios. Spectrum resources can be discovered and utilized intelligently, maximizing spectrum utilization efficiency. An algorithm acceleration scheme that exploits the cognitive function's ability to predict competition outcomes greatly improves computation speed without reducing performance.
Owner:BEIJING JIAOTONG UNIV
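
The classical water-filling allocation that the patent builds on can be sketched directly; a minimal illustration using bisection to find the water level (function and variable names are illustrative):

```python
def waterfill(noise, power, iters=100):
    """Classical water-filling power allocation: pour a total `power`
    budget over channels with per-channel noise floors `noise`, giving
    each channel p_i = max(mu - n_i, 0), where the water level mu is
    found by bisection so the whole budget is spent."""
    lo, hi = min(noise), max(noise) + power
    for _ in range(iters):
        mu = (lo + hi) / 2
        spent = sum(max(mu - n, 0.0) for n in noise)
        if spent > power:
            hi = mu        # water level too high: over budget
        else:
            lo = mu        # water level too low: budget unspent
    return [max(mu - n, 0.0) for n in noise]
```

Channels with a noise floor above the water level receive zero power, which is why allocation naturally concentrates on the cleanest channels.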