
108 results about "High speedup" patented technology

Fast motion estimation method based on GPU (graphics processing unit) parallelism

The invention discloses a fast motion estimation method implemented with GPU (graphics processing unit) parallelism. The method comprises the following steps: firstly, whether the current block belongs to the background region is judged by a maximum-probability criterion using a local full search; secondly, if the current block belongs to the background region, the full-pixel-precision search ends; otherwise the search step length is increased by reducing the search-region resolution, and the distribution range of the optimal motion vector is captured by a low-resolution local full-pixel full search, i.e., coarse positioning; thirdly, after a block belonging to the motion region has been coarsely positioned, the motion-vector distribution range is refined by a local full search to complete full-pixel-precision motion estimation, i.e., fine positioning; finally, a high-density, high-precision search template refines the motion vector, completing the quarter-pixel-precision search and ending motion estimation for the current block. A termination criterion is adopted during the search: if the minimum-distortion reference block reaches a set matching precision, the full-pixel search is terminated and the remaining search steps are skipped unchanged.
Owner:XIDIAN UNIV

Hadoop-based recognition method for fake-licensed car

The invention discloses a Hadoop-based recognition method for fake-licensed (cloned-plate) cars, whose input is a massive set of passing records. The method comprises the following steps: transferring the dimension-reduced valid passing records into the HBase of a Hadoop cluster; retrieving from HBase, through Hive, the passing records of any car whose license plate appears at two different monitoring points; grouping and sorting the records by license number and passing time; initializing a weighted graph whose vertices are the monitoring points and whose edge weights are the distances between them; computing the shortest path between every pair of monitoring points; combining the monitoring points pairwise and processing the pairs in blocks; creating multiple threads and concurrently submitting Hive tasks that recognize fake-licensed cars from the blocked point pairs according to the fake-plate principle (a real car cannot travel between two points faster than the shortest path allows); and obtaining the final set of suspected fake-licensed cars through correction factors. Compared with a non-optimized method in a traditional environment, the method improves running efficiency and speed-up ratio and recognizes fake-licensed cars effectively.
Owner:HANGZHOU DIANZI UNIV
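The core detection rule, stripped of the Hadoop/Hive machinery, fits in a few lines: a plate is suspect when consecutive sightings imply impossible travel speed. This is a single-process sketch with hypothetical names; `min_travel_time` stands in for the precomputed shortest-path times and `slack` for the patent's correction factors.

```python
from collections import defaultdict

def find_fake_plates(records, min_travel_time, slack=0.8):
    """records: (plate, point, timestamp) tuples.
    min_travel_time[(a, b)]: shortest-path travel time between points a and b.
    A plate is suspect if consecutive sightings at two different points are
    closer in time than the shortest path permits; `slack` loosens the bound
    (a stand-in for the correction factors in the method above)."""
    by_plate = defaultdict(list)
    for plate, point, t in records:
        by_plate[plate].append((t, point))
    suspects = set()
    for plate, hits in by_plate.items():
        hits.sort()                                 # group-and-sort step
        for (t1, p1), (t2, p2) in zip(hits, hits[1:]):
            if p1 != p2 and (t2 - t1) < slack * min_travel_time[(p1, p2)]:
                suspects.add(plate)
    return suspects
```

In the patented system this inner loop would be expressed as Hive queries over blocked point pairs and run concurrently.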

Thevenin equivalent overall modeling method of a modular multilevel converter (MMC)

The invention relates to a Thevenin equivalent overall modeling method of a modular multilevel converter (MMC), belonging to the technical field of power transmission and distribution. The efficiency of off-line electromagnetic transient simulation of an MMC is determined by the overall computational complexity of its converter part and its modulation and voltage-balancing part, and the overall modeling method optimizes both parts at three levels. The core technical scheme is as follows: firstly, a switch element in the Thevenin equivalent model of the MMC is treated as an infinite resistance (ROFF = infinity) when turned off; secondly, all submodule capacitors in the MMC are discretized with the backward Euler method, replacing the trapezoidal integration of the known model; and finally, a novel sorting and voltage-balancing algorithm is obtained by combining the infinite turn-off resistance with the backward Euler method, whose computational complexity is linear in the number of bridge-arm submodules. Under the premise of guaranteed simulation precision, the complexity of the electromagnetic transient simulation thus grows linearly with the number of submodules, so the overall modeling method is of reference value to researchers in the MMC field.
Owner:NORTH CHINA ELECTRIC POWER UNIV (BAODING)
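The backward Euler step that underlies such Thevenin-equivalent MMC models can be written out directly. This is a deliberately minimal sketch under simplifying assumptions (ideal bypass, fixed on-resistance `r_on`, single arm); the names and the aggregation rule are illustrative, not the patent's exact formulation.

```python
def arm_thevenin(v_caps, states, C, dt, r_on=1e-3):
    """Aggregate one MMC arm into a Thevenin equivalent (R_arm, V_arm).
    Backward Euler turns each submodule capacitor into a resistance
    R_c = dt/C in series with a source equal to its previous voltage.
    An inserted submodule contributes r_on + R_c and its capacitor
    voltage; a bypassed one contributes only r_on (sketch assumption)."""
    r_c = dt / C
    r_arm = sum(r_on + (r_c if s else 0.0) for s in states)
    v_arm = sum(v for v, s in zip(v_caps, states) if s)
    return r_arm, v_arm

def update_caps(v_caps, states, i_arm, C, dt):
    """Backward Euler update: v[k] = v[k-1] + (dt/C) * i for inserted modules."""
    return [v + (dt / C) * i_arm if s else v for v, s in zip(v_caps, states)]
```

Because each submodule reduces to one resistance and one source, building the arm equivalent and updating the capacitor voltages are both linear in the number of submodules, which is the complexity property the abstract claims.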

Real-time processing method for cancellation of direct wave and clutter of external radiation source radar based on graphic processing unit (GPU)

The invention provides a real-time processing method for direct-wave and clutter cancellation of an external radiation source (passive) radar on a graphics processing unit (GPU); a highly efficient parallel processing pipeline is built, improving the real-time capability of the algorithm. The method comprises the steps of: (1) sectioning the signal for parallel processing, splitting the T-second signal to be processed into sections; (2) applying the block LMS (BLMS) method to cancel each section produced in step (1); (3) using the GPU to process the sections of step (1) in parallel, computing each section's data with a multithreaded fast Fourier transform (FFT), removing the overlapping region from the output and splicing the results, while outputting the filter coefficients of the last block of data as the initial weights for subsequent processing; and (4) using the finally iterated filter coefficients w(n) as the initial filter weights for the next second of data, i.e., the data of the first section is iterated K times and the subsequent data is processed only once.
Owner:INST OF ELECTRONICS CHINESE ACAD OF SCI
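A time-domain block LMS canceller illustrates steps (2) and (4): weights are updated once per block, and the final weights of one segment seed the next. The patent's version runs the filtering in the frequency domain with overlap handling on a GPU; this short sketch keeps only the BLMS update itself, with illustrative names and parameters.

```python
import numpy as np

def block_lms(ref, d, n_taps=8, block=32, mu=0.05, w=None):
    """Time-domain block LMS: cancel the component of `d` that is
    correlated with the reference channel `ref`. Weights are updated
    once per block; passing `w` lets the final coefficients of one
    segment initialize the next, as in step (4) above."""
    if w is None:
        w = np.zeros(n_taps)
    out = np.zeros_like(d, dtype=float)
    for start in range(0, len(d) - block + 1, block):
        e_block = np.zeros(block)
        grad = np.zeros(n_taps)
        for i in range(block):
            n = start + i
            # most recent n_taps reference samples, newest first
            x = ref[max(0, n - n_taps + 1):n + 1][::-1]
            x = np.pad(x, (0, n_taps - len(x)))
            e_block[i] = d[n] - w @ x
            grad += e_block[i] * x
        w = w + mu * grad / block          # one weight update per block
        out[start:start + block] = e_block
    return out, w
```

The returned `out` is the cancelled residual (surveillance channel minus the adapted direct-wave estimate).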

A multi-objective optimization automatic mapping scheduling method for row-column parallel coarse-grained reconfigurable arrays

The invention discloses a method for multi-objective-optimization automatic mapping and scheduling on row-column-parallel coarse-grained reconfigurable arrays. Computing-intensive tasks are described in code such as C, which is translated by semantic parsing into an intermediate data-flow-graph representation, after which hardware and software are partitioned at the code level. Platform information such as the interconnection and scale constraints of the reconfigurable cell array, together with the reconfigurable data-flow task set, is input through the core loop tool software to initialize the ready task queue; ready cross-layer and misaligned tasks are then removed, the priorities of the operation nodes are calculated, and execution units are selected for mapping one by one. The scheme is based on the tightness of dependencies between task nodes, and a solution is given under task-node parallelism constraints, which effectively solves the problems of high communication cost between computing arrays and the failure of traditional methods to integrate execution-time extension with task scheduling, achieving a higher speed-up ratio, lower configuration cost, and higher utilization of reconfigurable units.
Owner:LANZHOU UNIVERSITY
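The ready-queue, priority, and unit-selection loop described above is a form of list scheduling, which can be sketched generically. The priority rule here (longest runtime first) and all names are illustrative stand-ins for the patent's dependency-tightness-based priorities.

```python
def list_schedule(tasks, deps, n_pes):
    """Priority list scheduling sketch. `tasks` maps task -> runtime,
    `deps` maps task -> set of predecessor tasks, `n_pes` is the number
    of execution units. Ready tasks (all predecessors done) are
    dispatched in priority order to the earliest-free unit."""
    pe_free = [0.0] * n_pes
    finish, done = {}, set()
    while len(done) < len(tasks):
        ready = [t for t in tasks if t not in done and deps.get(t, set()) <= done]
        ready.sort(key=lambda t: -tasks[t])        # simple priority rule
        for t in ready:
            pe = min(range(n_pes), key=lambda i: pe_free[i])
            start = max([pe_free[pe]] + [finish[p] for p in deps.get(t, set())])
            finish[t] = start + tasks[t]
            pe_free[pe] = finish[t]
            done.add(t)
    return finish
```

The makespan (the largest finish time) divided into the serial total runtime gives the speed-up ratio such a scheduler achieves.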

Synthetic aperture radar echo parallel simulation method based on depth cooperation

Active · CN105911532A · Optimize co-processing mode · Collaborative processing mode is obvious · Radio wave reradiation/reflection · Synthetic aperture radar · Radar
The invention discloses a synthetic aperture radar (SAR) echo parallel simulation method based on deep cooperation, belonging to the technical field of synthetic aperture radar applications. In the traditional CPU+GPU heterogeneous computing mode, the CPU generally handles operations with heavy control logic, while the GPU handles dense, data-parallel computation. On the basis of a thorough study of domestic and foreign work on fast SAR echo simulation, the invention takes a spaceborne SAR echo parallel simulation algorithm as its example and provides a fast spaceborne SAR echo simulation algorithm based on SIMD heterogeneous parallel processing: the SAR echo simulation process is accelerated through deep cooperation between a multi-core CPU with vector extensions and a many-core GPU, on top of which redundant computation is optimized and deep parallel optimization is applied to the irregular computations in the SAR echo process. Experimental results show that, compared with the traditional serial computing method, the optimized CPU/GPU heterogeneous cooperation mode improves computational efficiency by two to three orders of magnitude.
Owner:BEIJING UNIV OF CHEM TECH

Double parallel computing-based on-line Prony analysis method

Active · CN104504257A · Improve resource utilization and computing efficiency · Calculation speed · Special data processing applications · High performance computation · Power grid
The invention discloses a doubly parallel computing-based on-line Prony analysis method, and relates to the fields of electric power system dispatching automation and high-performance computing. Aiming at the low resource utilization and low computing speed of the conventional serial Prony algorithm when identifying oscillation parameters during low-frequency oscillation of a large-scale power grid, the method provides distributed parallel processing across multiple computing nodes, so that task scheduling and load balancing are performed effectively and the response time of the system is greatly reduced. Parallel design of the Prony mathematical model is realized for the first time, and multi-threaded parallel Prony computation is realized with a multi-thread parallel computing technology. The method can simultaneously identify on line such parameters as oscillation amplitude, frequency, initial phase, and attenuation factor for multiple branches and multiple electrical quantities of the power grid, effectively improves computation and analysis speed, and better meets the requirement of synchronous on-line computation for a large-scale power grid.
Owner:STATE GRID CORP OF CHINA +1
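The underlying serial computation is the classic Prony method: a linear-prediction fit, a polynomial root solve, and an amplitude/phase fit. The sketch below shows that baseline (not the patent's parallel decomposition) and returns exactly the parameters named above: amplitude, frequency, attenuation factor, and initial phase.

```python
import numpy as np

def prony(x, p, dt):
    """Classic Prony analysis: fit x[n] ~ sum_i h_i * z_i^n and return
    the amplitude, frequency (Hz), attenuation factor, and initial
    phase of each of the p fitted modes."""
    N = len(x)
    # 1) linear prediction: x[n] = a_1 x[n-1] + ... + a_p x[n-p]
    A = np.column_stack([x[p - 1 - k : N - 1 - k] for k in range(p)])
    b = x[p:N]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    # 2) roots of z^p - a_1 z^(p-1) - ... - a_p give the modes z_i
    z = np.roots(np.concatenate(([1.0], -a)))
    # 3) least-squares fit of complex amplitudes on the Vandermonde matrix
    V = np.vander(z, N, increasing=True).T
    h, *_ = np.linalg.lstsq(V, x.astype(complex), rcond=None)
    amp = np.abs(h)
    freq = np.angle(z) / (2 * np.pi * dt)
    damp = np.log(np.abs(z)) / dt
    phase = np.angle(h)
    return amp, freq, damp, phase
```

The two least-squares solves and the per-mode parameter extraction are the pieces a doubly parallel scheme would distribute across nodes and threads, one branch/electrical quantity per task.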

Multi-task runtime collaborative scheduling system under heterogeneous environment

The invention discloses a multi-task runtime collaborative scheduling system for a heterogeneous environment. The system comprises a task preprocessing module, a runtime dynamic task scheduling module, and a system resource monitoring and management module. The task preprocessing module statically analyzes and marks the code and generates task code for collaborative scheduling with the thread as the unit; the resource monitoring and management module monitors, collates, and records the use of system resources and, after processing, provides the data to the runtime scheduling module for runtime feature analysis; the runtime dynamic task scheduling module receives and manages the task code generated by the preprocessing module, and loads and executes the corresponding tasks according to the runtime information received from the monitoring module. With this system, existing programs can be quickly migrated to a multi-core/many-core heterogeneous environment while maintaining a high speed-up ratio, a high energy-efficiency ratio, and high system resource utilization.
Owner:HUAZHONG UNIV OF SCI & TECH

Carbon-doped graphite-phase carbon nitride nanotube and preparation method thereof

The invention discloses a carbon-doped graphite-phase carbon nitride nanotube and a preparation method thereof. The preparation process comprises the steps of: firstly, dispersing melamine into an ethanol solution of aminopropyltrimethoxysilane, carrying out a hydrothermal reaction, and then centrifuging and drying to obtain a solid powder; mixing (3-mercaptopropyl)trimethoxysilane with tetraethoxysilane and stirring evenly, adding a mixed solution of ethanol and water and stirring again, and carrying out centrifugal separation to obtain an MPS-modified SiO2 solution; adding the solid powder of the pretreated melamine into the MPS-modified SiO2 solution, stirring, centrifuging, drying, and calcining to obtain a product; and etching the product with a hydrogen fluoride (HF) solution to obtain the carbon-doped graphite-phase carbon nitride nanotube. The preparation process is novel, convenient, fast, and highly controllable; the obtained nanotube has a more uniform size, a thinner tube wall, better conductivity, and excellent photocatalytic properties; it can be used to construct multiple heterostructures, and has good application potential in the photocatalytic degradation of organic pollutants, water photolysis for hydrogen production, and the like.
Owner:UNIV OF JINAN

Deep learning framework transplanting and optimizing method and system based on target many-core

The invention relates to a deep learning framework transplanting and optimization method and system for a target many-core platform. In the transplanting stage, the source code of a deep learning framework is ported to the target many-core machine, and the framework is modified and compiled according to the machine's compiling instructions so that it satisfies the machine's operating conditions. The acceleration optimization stage comprises: running a deep-learning function model on the domestic many-core platform with the framework, analyzing the code with the target many-core performance analysis tool, and confirming and extracting the hotspot functions; analyzing and testing the features and function parameters of each hotspot function; accelerating the hotspot functions with a parallel acceleration library; and determining an optimization strategy, ultimately improving the speed-up ratio of the framework while guaranteeing its correctness, and modifying and testing the framework's compiled files according to the master-slave-core parallel code so as to realize hybrid compilation and execution of that code.
Owner:OCEAN UNIV OF CHINA +1

Image processing method and device based on full convolutional network, and computer equipment

The invention relates to an image processing method and device based on a fully convolutional network, and computer equipment. The image processing method comprises the steps of: reducing the resolution of a first input image to obtain a first low-resolution input image, and feeding the first low-resolution input image into the fully convolutional network to obtain a low-resolution output image; converting the first input image and the first low-resolution input image into a second input image and a second low-resolution input image, respectively; using an up-sampling method to obtain a processed image from the second low-resolution input image and the low-resolution output image, and computing a first linear relation between the processed image and the second low-resolution input image; and obtaining a second linear relation between the second input image and the output image from the first linear relation, then obtaining the output image from the second linear relation and the second input image. The image processing method can be connected seamlessly to an existing trained network framework, is general-purpose, and achieves an extremely high acceleration ratio without degrading image processing quality.
Owner:WUHAN TCL CORP RES CO LTD
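The idea of fitting a linear relation at low resolution and reusing it at full resolution can be shown in miniature. This sketch fits a single global pair of coefficients (a, b); a practical system, including presumably the patented one, would fit them per local window in a guided-filter style. All names are illustrative.

```python
import numpy as np

def fast_upsample_apply(lo_in, lo_out, hi_in, eps=1e-6):
    """Fit lo_out ~ a * lo_in + b by least squares on the cheap
    low-resolution pair, then apply the same linear relation to the
    full-resolution input. Only the low-resolution image ever goes
    through the (expensive) network."""
    x = lo_in.ravel().astype(float)
    y = lo_out.ravel().astype(float)
    var = x.var() + eps                       # eps guards against flat inputs
    a = ((x * y).mean() - x.mean() * y.mean()) / var
    b = y.mean() - a * x.mean()
    return a * hi_in.astype(float) + b
```

The speed-up comes from running the network only on the downscaled image; applying (a, b) at full resolution is a trivial per-pixel operation.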

Hardware acceleration implementation system and method for an RNN forward propagation model based on a transverse systolic array

The invention discloses a hardware acceleration implementation system and method for an RNN forward propagation model based on a transverse systolic array. The method comprises the steps of: firstly, configuring the network parameters and initializing the data and the transverse systolic array, adopting a blocked design for the weights in the calculation and partitioning the weight matrix of the hidden-layer calculation by rows; carrying out matrix-vector multiplication, vector summation, and activation-function operations to calculate the hidden-layer neurons; from the obtained hidden-layer neurons, performing matrix-vector multiplication, vector summation, and activation-function operations to generate the RNN output-layer result; and finally generating the output required by the RNN network according to the time-sequence-length configuration information. In the method, the hidden layer and the output layer are parallelized in multiple dimensions, improving the pipelining of the calculation; at the same time, exploiting the weight-parameter-sharing characteristic of the RNN, the blocked design further improves the parallelism of the calculation. The method achieves high flexibility, expandability, storage resource utilization, and acceleration ratio, and greatly reduces calculation.
Owner:NANJING UNIV
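The row-blocked hidden-layer computation can be modeled in software to show why the partitioning parallelizes. Each row block is an independent matrix-vector product, which is what a systolic array can stream through in parallel; this numpy sketch (illustrative names, simple tanh RNN) computes the same result block by block.

```python
import numpy as np

def rnn_step_blocked(x, h_prev, W_xh, W_hh, b, block=4):
    """One RNN forward step, h = tanh(W_xh x + W_hh h_prev + b), with
    the weight matrices partitioned by rows. Each row block is an
    independent matrix-vector product, so blocks can be assigned to
    separate processing elements."""
    H = len(b)
    h = np.empty(H)
    for r in range(0, H, block):               # one block per processing element
        sl = slice(r, r + block)
        h[sl] = np.tanh(W_xh[sl] @ x + W_hh[sl] @ h_prev + b[sl])
    return h
```

Across time steps the same weight blocks are reused, which is the parameter-sharing property the abstract points to: the blocks can stay resident in the array while successive inputs stream through.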

Water supply pipe network sensor arrangement optimization method based on multiple particle swarm optimization algorithm

The invention discloses a water supply pipe network sensor arrangement optimization method based on a multiple particle swarm optimization algorithm. The method includes: establishing the pipe network topology of a water supply network and obtaining the complexity of each network node; performing hydraulic and water-quality simulation of the topology to obtain the accessibility and pollutant concentration of each node; initializing the populations of the multiple particle swarm optimization algorithm at the main computing node and conducting global search in the Map stage; conducting local search in the Reduce stage to obtain the latest global best individual; and determining whether the fitness of the latest global best individual satisfies a preset convergence condition, and if not, returning to the task distribution step to continue the iterative evolution. The method effectively solves the problem of long optimization times for water supply network sensor placement in the prior art, maximizes the monitoring effect (such as detecting pollution events in the shortest time), and prevents the safety risks of drinking water pollution.
Owner:CHINA UNIV OF GEOSCIENCES (WUHAN)
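The particle swarm update at the heart of the method is standard and compact. This is a minimal single-swarm sketch on a toy objective; the patented method runs several swarms in parallel (global search in Map, local search in Reduce) with a simulation-based fitness such as pollution detection time, and all parameter values here are conventional defaults, not the patent's.

```python
import numpy as np

def pso(objective, dim, n_particles=20, iters=100, seed=0,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Minimal particle swarm optimization: each particle tracks its
    personal best, the swarm tracks a global best, and velocities are
    pulled toward both. Returns the best position and its fitness."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros((n_particles, dim))
    pbest = x.copy()
    pbest_f = np.array([objective(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()
```

For sensor placement, `objective` would run the hydraulic/water-quality simulation for a candidate sensor layout and return, for example, the expected pollution detection time.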

Switched network system structure with adjustable throughput rate

The invention discloses a switched network architecture with adjustable throughput, aiming to provide an expandable switched network architecture whose throughput is adjustable and whose reliability, flexibility, and controllability are improved. The technical scheme is as follows: the architecture is an improved three-stage indirect interconnection network with a complete switching capacity mode and a binary switching capacity mode. In the complete switching capacity mode, the first stage is formed by P 2*3 crossbar switches, the second stage by four Q*Q (Q=3P/4) crossbar switches, and the third stage by P 3*2 crossbar switches. Each line card carries one 2*3 crossbar switch and one 3*2 crossbar switch; the Q*Q crossbar switches of the second stage are placed on the core switching board of the router, which is connected to the line cards through the backplane. When only the even-numbered or the odd-numbered part is configured, the binary switching capacity mode is adopted. The structure of the switched network is simplified, its scale is easy to adjust, and the reliability of data switching is improved.
Owner:NAT UNIV OF DEFENSE TECH