62 results about "How to Improve Parallel Computing Efficiency" patented technology

A GPU cluster deep learning edge computing system oriented to sensing information processing

The invention relates to a GPU cluster deep learning edge computing system oriented to sensing information processing. Pre-feature extraction is carried out on sensing information using the weak computing power of front-end intelligent sensing equipment, greatly compressing the amount of raw data; the remaining processing tasks are then transmitted to a GPU cluster for large-scale clustering of sensing-data features. Through task splitting, the system dynamically adapts to the computing power of the front-end intelligent sensing equipment, reducing the cost pressure of keeping front-end sensing equipment and hardware versions consistent. The communication pressure on the edge computing network is reduced, so the cost of constructing the edge computing network falls greatly, and transmitting only data features over the network hides user privacy. By clustering the data transmitted in the network against the stored core data features, the SPMD advantages of the GPU are brought into play, the parallel computing efficiency of edge computing is improved, and the large-scale parallel computing capacity, low cost, and high reliability of the GPU cluster are effectively exploited.
Owner:UNIV OF SHANGHAI FOR SCI & TECH
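The front-end compression step described above can be sketched as follows. This is a minimal illustration, not the patented method: the block-mean feature, the block size, and the function name are all illustrative assumptions standing in for whatever pre-feature extraction the front-end device actually performs.

```python
# Sketch of front-end pre-feature extraction (hypothetical; block-mean
# features and the block size of 8 are illustrative assumptions).

def pre_extract_features(samples, block=8):
    """Compress a raw 1-D sensor stream by replacing each block of
    `block` samples with its mean, shrinking the data volume ~block x."""
    return [sum(samples[i:i + block]) / len(samples[i:i + block])
            for i in range(0, len(samples), block)]

raw = [float(i % 16) for i in range(1024)]   # stand-in for raw sensing data
features = pre_extract_features(raw, block=8)
compression_ratio = len(raw) / len(features)
```

Only `features` would be shipped to the GPU cluster, which is where the communication-load and privacy benefits claimed above come from.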

Integrated circuit electromagnetic response calculation method and device based on multi-level parallel strategy

The embodiment of the invention discloses an integrated circuit electromagnetic response calculation method and device based on a multi-level parallel strategy. The method divides the calculation of the electromagnetic response characteristics of each frequency point in the frequency domain simulation of a multi-layer very-large-scale integrated circuit into a plurality of calculation subtasks; a plurality of parallel subparticles carry out the subtasks of one frequency point in parallel, while a plurality of parallel coarse particles independently execute the calculation tasks corresponding to the different frequency points, completing multi-process parallel computation over all frequency points. The method improves the calculation efficiency of each part of the electromagnetic response calculation at a single frequency point, improves the parallel calculation efficiency across a large number of frequency points of the integrated circuits, and meets the requirement for efficient calculation.
Owner:北京智芯仿真科技有限公司
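The two-level strategy above — coarse-grained parallelism over independent frequency points, fine-grained parallelism over the subtasks of one point — can be sketched with nested thread pools. The toy "solve" functions are stand-ins; the real electromagnetic solves are not shown in the source.

```python
from concurrent.futures import ThreadPoolExecutor

# Two-level parallel sketch: coarse-grained parallelism over frequency
# points, fine-grained parallelism over per-point subtasks. The arithmetic
# is a placeholder for the actual electromagnetic response solve.

def solve_subtask(freq, part):
    return freq * part            # stand-in for one fine-grained subtask

def solve_frequency_point(freq, n_subtasks=4):
    # fine level: the subtasks of one frequency point run in parallel
    with ThreadPoolExecutor(max_workers=n_subtasks) as fine:
        parts = fine.map(lambda p: solve_subtask(freq, p), range(n_subtasks))
        return sum(parts)

def solve_all(freqs):
    # coarse level: frequency points are independent of one another
    with ThreadPoolExecutor(max_workers=len(freqs)) as coarse:
        return list(coarse.map(solve_frequency_point, freqs))

responses = solve_all([1.0, 2.0, 3.0])
```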

Rapid parallelization method for a fully distributed watershed eco-hydrology model

Active · CN103164190A · Break through singleness · Reduce parallel granularity · Concurrent instruction execution · Terrain analysis · Landform
The invention relates to a rapid parallelization method for a fully distributed watershed eco-hydrology model. Grids serve as the basic computing units: a watershed grid flow diagram is obtained through digital elevation model (DEM) terrain analysis, and the computing dependency relationships among the grids are established. Eco-hydrology process simulations in the vertical direction of each grid unit serve as independent computing tasks; the grid unit computing tasks are decoupled according to the dependency relationships among the grid units, and a task tree is constructed and expressed as a directed acyclic graph (DAG). A task scheduling sequence is generated dynamically from the DAG using a marginal-elimination dynamic scheduling algorithm, and the grid computing tasks are distributed to different nodes through a portable batch system (PBS) dynamic scheduler, achieving parallelization of the fully distributed watershed eco-hydrology model. The method is easy to operate and highly practical; it greatly simplifies the parallel logic control of the parallelization algorithm, effectively improves parallel computing efficiency, and improves parallel stability.
Owner:CENT FOR EARTH OBSERVATION & DIGITAL EARTH CHINESE ACADEMY OF SCI
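The DAG-based scheduling above boils down to in-degree elimination: a grid cell becomes ready once every upstream cell it depends on has finished. A minimal sketch (the four-cell catchment is an invented toy example):

```python
from collections import deque

# In-degree elimination over a flow-direction DAG: deps maps each grid
# cell to the upstream cells it waits on; the function returns a valid
# execution order (a topological sort).

def schedule_dag(deps):
    indeg = {c: len(up) for c, up in deps.items()}
    downstream = {c: [] for c in deps}
    for c, up in deps.items():
        for u in up:
            downstream[u].append(c)
    ready = deque(c for c, d in indeg.items() if d == 0)
    order = []
    while ready:
        c = ready.popleft()
        order.append(c)
        for d in downstream[c]:       # finishing c releases its dependents
            indeg[d] -= 1
            if indeg[d] == 0:
                ready.append(d)
    return order

# toy 4-cell catchment: A and B drain into C, C drains into outlet D
order = schedule_dag({"A": [], "B": [], "C": ["A", "B"], "D": ["C"]})
```

A real scheduler would hand the ready set to PBS nodes rather than run it serially, but the dependency bookkeeping is the same.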

Dynamic load balancing method based on Linux parallel computing platform

The invention provides a dynamic load balancing method based on a Linux parallel computing platform, belonging to the field of parallel computing. The hardware architecture comprises multiple computers participating in the computation, each provided with a Linux operating system and an MPI software development kit. In the parallel computing process, the overall computing task is divided into multiple phases of equal execution time. Using the system's routine job scheduling facilities, the current resource utilization rate of each node is read before each parallel computing phase begins and, combined with each node's computing performance and the computational complexity of the tasks, the computing tasks are dynamically allocated to the nodes, ensuring that the computing time of all nodes in each phase is essentially equal and reducing system synchronization waiting delay. Through this dynamic adjustment strategy, the overall computing task is completed at a higher resource utilization rate; the method breaks through the efficiency bottleneck caused by low-configuration computing nodes, further saves computing time on top of the parallel computation, and improves computational efficiency.
Owner:SHANDONG UNIV
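The per-phase allocation step above can be sketched as splitting tasks in proportion to each node's currently available capacity. The capacity model (performance times free fraction) and the numbers are illustrative assumptions, not the patent's formula.

```python
# Per-phase dynamic allocation sketch: tasks are split in proportion to
# each node's available capacity, modeled here (as an assumption) as
# performance x (1 - current utilization).

def allocate_tasks(n_tasks, perf, utilization):
    """perf[i]: node i's computing performance; utilization[i]: its
    current resource utilization in [0, 1). Returns per-node task counts."""
    avail = [p * (1.0 - u) for p, u in zip(perf, utilization)]
    total = sum(avail)
    counts = [int(n_tasks * a / total) for a in avail]
    # give any rounding remainder to the least-loaded node
    counts[avail.index(max(avail))] += n_tasks - sum(counts)
    return counts

counts = allocate_tasks(100, perf=[1.0, 1.0, 2.0], utilization=[0.5, 0.0, 0.5])
```

Here a fast-but-busy node and a slow-but-idle node end up with equal shares, which is the equal-phase-time behavior the method aims for.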

Data waveform processing method in seismic exploration

Inactive · CN103105623A · Programmable · Powerful floating-point computing capability · Seismic signal processing · Graphics · Prediction algorithms
The invention discloses a data waveform processing method in seismic exploration, based on a three-dimensional free-surface multiple prediction method using the wave equation, for three-dimensional surface multiple prediction in the processing of seismic exploration data. A graphics processing unit (GPU) is used to accelerate the fully three-dimensional surface multiple prediction algorithm: the GPU and the CPU compute cooperatively, with calculation-intensive operations transferred to the GPU. This yields higher calculation efficiency, and the algorithm can process seismic data from complicated underground media. The method takes into account the spatial effect that the reflection point, the shot point, and the receiver point do not lie on the same line, which makes it superior to conventional two-dimensional algorithms. No simple approximation of the underground medium is necessary; the fully three-dimensional free-surface multiple prediction algorithm based on the wave equation therefore accords with the true conditions of the underground medium, so that the amplitude and phase of multiples in the seismic data can be accurately predicted.
Owner:NORTHEAST PETROLEUM UNIV

GPU thread design method of power flow Jacobian matrix calculation

Inactive · CN105391057A · Improve branch execution efficiency · Solve load imbalance · AC network circuit arrangements · Nodal · Power flow
The invention discloses a GPU thread design method for power flow Jacobian matrix calculation. The method comprises the following steps: power grid data is input, and the node admittance matrix Y is pre-processed; the CPU calculates position mapping relation tables between sub-matrix non-zero elements and node admittance matrix Y non-zero elements; the CPU calculates position mapping relation tables between sub-matrix non-zero elements and Jacobian matrix non-zero elements; in the GPU, the injection power of each node is calculated by an injection power kernel function S; and in the GPU, the non-zero elements of the sub-matrixes are calculated by Jacobian sub-matrix calculation kernel functions and stored in the Jacobian matrix. Because the Jacobian matrix non-zero elements are calculated sub-matrix by sub-matrix, the branch judgment of which sub-matrix an element belongs to, required when a single kernel function computes everything directly, is avoided, so branch execution efficiency is improved. In addition, the non-zero elements within a sub-matrix share identical calculation formulas and are calculated in a centralized manner, which solves the problem of unbalanced thread loads and improves the efficiency of parallel calculation.
Owner:STATE GRID CORP OF CHINA +3
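The sub-matrix grouping idea above — batch elements that share a formula so no per-element branch is needed — can be sketched in plain Python. The formulas below are toy stand-ins, not the real power-flow Jacobian terms (H, N, J, L are the conventional sub-matrix names, but their expressions here are invented for illustration).

```python
# Branch-avoidance sketch: instead of one kernel whose threads branch on
# which sub-matrix an element belongs to, elements are grouped by
# sub-matrix so each batch applies a single formula uniformly.
# The formulas are illustrative placeholders, not power-flow terms.

FORMULAS = {
    "H": lambda v: 2.0 * v,
    "N": lambda v: v + 1.0,
    "J": lambda v: -v,
    "L": lambda v: v * v,
}

def compute_grouped(elements):
    """elements: list of (submatrix_name, value). Batching by sub-matrix
    mirrors launching one kernel per sub-matrix, with no divergent branch
    inside a batch."""
    out = {}
    for name, formula in FORMULAS.items():
        batch = [(i, v) for i, (n, v) in enumerate(elements) if n == name]
        for i, v in batch:            # uniform work within the batch
            out[i] = formula(v)
    return [out[i] for i in range(len(elements))]

vals = compute_grouped([("H", 3.0), ("L", 2.0), ("N", 0.5)])
```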

Automatic start-stop and dynamic computation task allocation method for massively parallel coarse-particle computation

The invention discloses an automatic start-stop and dynamic computation task allocation method for massively parallel coarse-particle computation. The method comprises the following steps: defining parallel coarse particles according to the computational characteristics of the problem; the master process dynamically allocating the computation tasks in the parallel coarse particles, together with their corresponding input parameters, to all processes (including the master process itself) according to a file-marking technique and a dynamic task-allocation strategy; dynamically allocating memory to the processes holding computation tasks based on an automatic start-stop technique; and, after the parallel computation of all the parallel coarse particles is complete, the master process collecting the output parameters of all processes and combining them to obtain the final result of the complete run. The method reduces inter-process communication to the greatest extent; it avoids the hard-disk read-write bottleneck that occurs when the memory peak exceeds the available physical memory during multi-process parallel computation; it handles computation examples of unequal complexity; and it greatly increases parallel computation efficiency.
Owner:北京智芯仿真科技有限公司
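The dynamic allocation above is a self-scheduling pattern: workers pull the next coarse particle as soon as they finish, so tasks of unequal cost balance out automatically. A minimal sketch with a shared queue standing in for the master process (the squaring "work" is a placeholder):

```python
import queue
import threading

# Self-scheduling sketch: a shared queue plays the role of the master
# process; each worker pulls a new coarse grain the moment it finishes,
# so unequal task costs balance across workers.

def run_dynamic(tasks, n_workers=3):
    q = queue.Queue()
    for t in tasks:
        q.put(t)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                t = q.get_nowait()
            except queue.Empty:
                return                  # no tasks left: worker stops itself
            r = t * t                   # stand-in for one coarse grain
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results)

out = run_dynamic(list(range(10)))
```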

Load flow calculation method and device for large-scale power distribution network

The embodiment of the invention provides a load flow calculation method and device for a large-scale power distribution network. The method comprises the steps of: decomposing the target power distribution network into a plurality of sub-networks and obtaining an equation for solving the tie-branch current values between the sub-networks; calculating the constant-impedance, constant-current, and constant-power parameters of each subnet's ZIP load model; calculating the node admittance matrix of each subnet from the constant-impedance parameters and improving the node admittance matrix; calculating the node injection current of the current iteration from the node voltage of the current iteration and the constant-current and constant-power parameters; obtaining the tie-branch current of the current iteration from the improved node admittance matrix, the node injection current of the current iteration, and the equation for the tie-branch current between the subnets; and, if the difference between the tie-branch currents of consecutive iterations exceeds the convergence criterion, obtaining the node voltage phasor of the next iteration from the tie-branch current of the current iteration. The embodiment balances the efficiency and flexibility of load flow parallel computing.
Owner:CHINA AGRI UNIV

Coarse-particle parallel method and system for electromagnetic functional material optimization design

The invention relates to the fields of functional material design and high-performance computing, in particular to a coarse-particle parallel method and system for electromagnetic functional material optimization design. The method obtains a serial version by taking a coarse particle as an independent execution module, and a parallel version by taking the coarse particle as the basic execution unit. A random-sorting allocation policy then processes the calculation tasks: it thoroughly disrupts the allocation order of all the calculation tasks to obtain a new task sequence. Each calculation process then applies for the to-be-allocated tasks in the new sequence under a first-application, first-allocation policy, and computes until all tasks are finished. Finally, statistics are gathered on the calculation results and processed against the target; the processing result is compared with the expected optimization goal to judge whether the goal has been achieved. The coarse-particle parallel method effectively solves the problem of low parallel calculation efficiency in present-day electromagnetic functional material optimization design.
Owner:北京智芯仿真科技有限公司
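The random-sorting policy above exists to break up unlucky runs of expensive tasks (for example, all heavy tasks first) before first-come allocation. A minimal, deterministic sketch; the costs and seed are illustrative:

```python
import random

# Random-sorting allocation policy sketch: shuffling the task order
# before first-application-first-allocation prevents any single process
# from being handed a run of expensive tasks. Costs are illustrative.

def randomized_order(costs, seed=0):
    order = list(range(len(costs)))
    random.Random(seed).shuffle(order)   # thoroughly disrupt the sequence
    return order

costs = [9, 9, 9, 1, 1, 1]               # a sorted order is the worst case
order = randomized_order(costs)
```

Workers would then claim indices from `order` one at a time, which is exactly the first-application, first-allocation step described above.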

Arithmetic expression parallel computing device and method

The invention provides an arithmetic expression parallel computing device and method. A main processor in the device is connected to a coprocessor through a plurality of AXI buses. The main processor determines the operation data and the additional information of an arithmetic expression to be processed; the operation data includes a multiplication sub-expression, third data to be added to the product result, and mantissa data to be multiplied by the addition result, where the multiplication sub-expression comprises first data and second data. The operation data and the additional information are sent to the coprocessor in parallel over the AXI buses. The coprocessor simultaneously multiplies the first data and second data received from the separate AXI buses and, based on the additional information, combines the product result of the same expression with the third data and the mantissa data to obtain the calculation result. This realizes parallel processing of multiple pieces of operation data and multi-task time-sharing parallel processing, improving parallel computing efficiency.
Owner:TIANJIN CHIP SEA INNOVATION TECH CO LTD +1
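The expression shape described above is (first × second + third) × mantissa. A software sketch of the decomposition, with invented field names (the patent does not specify a data layout):

```python
# Sketch of the expression decomposition: each expression is split into
# operation data (first, second, third, mantissa); the product, addition,
# and mantissa stages can then overlap across a batch of expressions.
# The dict field names are illustrative assumptions.

def evaluate(expr):
    a, b = expr["first"], expr["second"]
    product = a * b                  # multiplication sub-expression
    added = product + expr["third"]  # add the third data to the product
    return added * expr["mantissa"]  # scale the sum by the mantissa data

batch = [
    {"first": 2.0, "second": 3.0, "third": 4.0, "mantissa": 0.5},
    {"first": 1.0, "second": 1.0, "third": 1.0, "mantissa": 2.0},
]
results = [evaluate(e) for e in batch]
```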

Vector space computing intensity prediction method and system based on a fully random forest

The invention discloses a vector space computing intensity prediction method and system based on a fully random forest. By inputting all features relevant to vector space computing intensity, multiple full regression trees are trained, achieving computing-intensity modeling for a vector space computing domain with diverse features. The prediction result of the fully random forest is then optimized: prediction values that differ distinctly from the consensus result are removed, improving the prediction precision of the fully random forest, so the vector space computing intensity is precisely predicted in a parallel computing environment. During training of the fully random forest, the training sample of each regression decision tree is randomly selected from the original sample, while the selected features comprise all features of the original sample, so the model can adapt to predicting vector space computing intensity with few important features and many redundant features. The method provides a basis for balanced scheduling and allocation of parallel computing resources and improves parallel computing efficiency.
Owner:地大(武汉)资产经营有限公司
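The prediction-optimization step above — removing per-tree predictions that differ distinctly from the ensemble consensus before averaging — can be sketched as follows. The 2-sigma cutoff is an illustrative assumption; the patent does not state its threshold.

```python
import statistics

# Sketch of outlier removal over per-tree predictions: trees whose
# outputs deviate more than n_sigma standard deviations from the mean
# are dropped, then the survivors are averaged. The cutoff is assumed.

def robust_forest_prediction(tree_preds, n_sigma=2.0):
    mu = statistics.mean(tree_preds)
    sigma = statistics.pstdev(tree_preds)
    kept = [p for p in tree_preds if abs(p - mu) <= n_sigma * sigma]
    return statistics.mean(kept)

# nine trees agree near 10; one tree is a distinct outlier
pred = robust_forest_prediction(
    [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 50.0])
```

Without the filter the plain mean would be 14.0; with it, the outlier tree is discarded and the prediction stays near the consensus value of 10.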

Data processing method and device and electronic device

The invention provides a data processing method and device and an electronic device. The method comprises the following steps: according to a preset spatial resolution, dividing the earth into geographic regions along the longitudinal (meridional), latitudinal (zonal), and vertical directions to obtain a three-dimensional grid structure corresponding to the earth's space; dividing the grids on the grid planes corresponding to the meridional and zonal directions to obtain a sub-grid structure corresponding to each calculation component; dividing the grids in the sub-grid structures along a first zonal direction or the vertical direction to obtain secondary sub-grid structures; distributing the region data corresponding to the secondary sub-grid structures to a main computing unit; and controlling the main computing unit to distribute the region data corresponding to the secondary sub-grid structures to at least one sub-computing unit, so that the main computing unit and the sub-computing units compute the service data in the region data. The method improves the calculation efficiency of an ocean circulation model in the meridional, zonal, and vertical directions.
Owner:THE FIRST INST OF OCEANOGRAPHY SOA

Efficient weld joint morphology numerical simulation prediction method based on fast Fourier transform

The invention discloses an efficient weld joint morphology numerical simulation prediction method based on the fast Fourier transform. To address the low prediction efficiency of existing weld morphology numerical simulation methods, it solves the heat transfer and flow equations of the weld forming process via the fast Fourier transform, achieving efficient prediction of the weld morphology evolution. The method comprises the following steps: first, defining an initial weld morphology function and inputting the welding parameters; second, solving the welding heat transfer and flow equations based on the fast Fourier transform; then, updating the motion equation of the weld morphology function; and finally, outputting the new weld morphology function and extracting the new weld morphology and its characteristic parameters. Compared with existing weld morphology numerical simulation methods, the prediction efficiency can be improved by factors of tens, hundreds, or even thousands; an unbounded parallel speed-up ratio is supported; and both the calculation efficiency and calculation precision far exceed those of existing methods.
Owner:CHANGSHU INSTITUTE OF TECHNOLOGY
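The spectral idea behind the method above can be shown on the 1-D heat equation: in Fourier space the diffusion operator is diagonal, so one time step is an independent per-mode decay exp(-α k² dt), which is what makes the approach embarrassingly parallel. This is a generic spectral-method sketch, not the patent's actual equations; a naive DFT stands in for the FFT to keep it self-contained.

```python
import cmath
import math

# Spectral heat-equation sketch: each Fourier mode evolves independently
# by the exact factor exp(-alpha * k^2 * dt). A naive O(N^2) DFT stands
# in for the FFT purely for self-containment.

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N)
                for k in range(N)).real / N for n in range(N)]

def heat_step(u, alpha, dt, length):
    N = len(u)
    U = dft(u)
    for k in range(N):
        kk = k if k <= N // 2 else k - N        # signed mode index
        w = 2 * math.pi * kk / length           # wavenumber of mode k
        U[k] *= math.exp(-alpha * w * w * dt)   # exact per-mode decay
    return idft(U)

N, L_dom, alpha, dt = 16, 1.0, 0.01, 0.1
u0 = [math.sin(2 * math.pi * n / N) for n in range(N)]
u1 = heat_step(u0, alpha, dt, L_dom)
```

Because every mode update is independent, the loop over `k` parallelizes trivially, which is the source of the unbounded speed-up ratio claimed above.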

Modal parallel computing method and system for heterogeneous many-core parallel computer

The invention provides a modal parallel computing method and system for a heterogeneous many-core parallel computer. The method comprises the following steps. Step S1: generating finite element model stiffness matrix and mass matrix data through a finite element program, dividing the generated data into N subregions, and independently storing the stiffness matrix and mass matrix data of each subregion in a file, wherein N is an integral multiple of the number of single core groups. Step S2: the master cores of the core groups engaged in parallel computing synchronously read the stiffness matrix and mass matrix data corresponding to their subregions; there is no data communication exchange between core groups, and none between the slave cores within a core group. Through this layering strategy, the calculation process and the data communication are layered, a large amount of data communication is confined within each core group, and the high intra-core-group communication rate of the domestic heterogeneous many-core parallel computer is fully exploited.
Owner:SHANGHAI JIAO TONG UNIV

Frequency domain astronomical image target detection method and system

The invention discloses a frequency domain astronomical image target detection method and system implemented on a CPU-GPU heterogeneous processor. The method comprises the steps of: preprocessing a collected original astronomical image on the CPU using a reference image obtained in advance; partitioning both the reference image and the preprocessed astronomical image using the overlap-save method; multiplying a Gaussian basis function by a polynomial to obtain n groups of basis vectors of a convolution kernel, and inputting the n groups of basis vectors into the GPU; on the GPU, fitting the reference sub-images and the astronomical sub-images against the n groups of basis vectors to obtain a convolution kernel for each reference sub-image, using the convolution kernels to apply frequency-domain filtering blur to each reference sub-image to obtain a template image, and returning the template image to the CPU; and, on the CPU, discarding the edges of the template image, stitching the remaining parts together, and differencing them with the original astronomical image to obtain a difference image, thereby achieving target detection in the astronomical image.
Owner:NAT SPACE SCI CENT CAS

Layered data center resource optimization method and system based on SDN and NFV

The invention discloses a layered data center resource optimization method and system based on SDN and NFV. The method comprises the steps of: having the MANO controller orchestrate requests initiated by clients into corresponding service chains, and grouping chains with the same source and destination nodes into a feature group; filtering the grouped service chains, marking the part whose bandwidth requirement exceeds a set threshold with the highest priority, and ordering the remaining service chains by descending delay requirement to assign their priorities; calculating the first K shortest paths between the source node and the destination node for each group of service chains; having the MANO controller formulate a mapping strategy for each service chain using a co-evolutionary multi-population competitive genetic algorithm based on the first K shortest paths; and having the MANO controller issue the formulated mapping strategy to the SDN controller of the basic control layer, which converts it into a flow table suitable for processing by a switch and issues the flow table to the switch for execution. The end-to-end delay of the service chains is small, and the utilization rate of network resources is high.
Owner:NANJING UNIV OF POSTS & TELECOMM
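The "first K shortest paths" step above can be sketched by exhaustive enumeration of simple paths on a small graph. This is fine as a sketch; production code would use Yen's algorithm. The toy network and its hop latencies are invented for illustration.

```python
from itertools import islice

# K-shortest-paths sketch via exhaustive simple-path enumeration
# (acceptable for tiny graphs; Yen's algorithm scales properly).
# graph: node -> {neighbour: link latency}.

def k_shortest_paths(graph, src, dst, k):
    paths = []

    def walk(node, path, cost):
        if node == dst:
            paths.append((cost, path))
            return
        for nxt, w in graph.get(node, {}).items():
            if nxt not in path:          # simple paths only: no revisits
                walk(nxt, path + [nxt], cost + w)

    walk(src, [src], 0)
    return list(islice(sorted(paths), k))

net = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 1, "t": 5},
    "b": {"t": 1},
}
best3 = k_shortest_paths(net, "s", "t", 3)
```

Each service-chain group would then feed its `best3`-style candidate list into the genetic algorithm for mapping.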

System and method for extracting capacitance parameters of integrated circuits based on GPU

The invention discloses a GPU (graphics processing unit)-based system and method for extracting the capacitance parameters of an integrated circuit. The system comprises a random walk start module, a random walk jump module, and a random walk statistics module; all modules exchange data via the global memory of the GPU, and each module runs a plurality of GPU threads in parallel. In the random walk start module, each GPU thread generates a specified number of walk starting points and acquires their corresponding weights. In the random walk jump module, each GPU thread performs a specified number of random walks and records an identifier of the conductor hit during each walk. In the random walk statistics module, each GPU thread reads the specified number of hit-conductor identifiers and the weights of the corresponding walk starting points to accumulate the capacitance value and the capacitance quadratic sum. If the relative error of the self-capacitance of the main conductor has not reached the target accuracy, the number of additional walk paths required is estimated. The system and method achieve rapid extraction of the capacitance parameters of an integrated circuit.
Owner:北京超逸达科技有限公司
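The hitting statistics that the GPU threads accumulate above can be illustrated on a deliberately simple 1-D toy geometry: a walker on a lattice between two "conductors" at sites 0 and n, started at site m, hits the far conductor with probability m/n (the classic gambler's-ruin result). Real floating-random-walk capacitance extraction works in 3-D with transition cubes; this sketch only shows the Monte Carlo accumulation pattern.

```python
import random

# Monte Carlo hitting-probability sketch on a 1-D lattice: conductors sit
# at sites 0 and n; a symmetric walker from site `start` hits site n with
# probability start/n. The geometry is a deliberate toy stand-in for the
# 3-D floating random walk used in capacitance extraction.

def hit_fraction(start, n, n_walks, seed=1):
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_walks):
        pos = start
        while 0 < pos < n:
            pos += rng.choice((-1, 1))   # one random jump per step
        hits += pos == n                  # record which conductor was hit
    return hits / n_walks

frac = hit_fraction(start=3, n=10, n_walks=20000)   # theory: 0.3
```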

Integrated circuit simulation multi-thread management parallel method and device with secrecy function

The invention provides an integrated circuit simulation multi-thread management parallel method and device with a secrecy function. The method divides the electromagnetic simulation of the integrated circuit cloud platform into two parts: a cloud computing platform and a client. The client extracts the integrated circuit model information and calculation conditions that need to be computed, forms parallel coarse particles, and sends them to the cloud platform. The cloud platform creates a management process; the management process creates a mutual exclusion object (mutex) and a thread, and the thread creates calculation processes according to the state of the working mutex and distributes and manages the calculation coarse particles. The threads created by the calculation processes and the management process communicate by query and response to complete the distributed calculation coarse particles. Because only the parallel coarse particles extracted by the client are sent to the cloud platform, rather than all the information of the integrated circuit model, the model is prevented from leaking over the Internet; and because the calculation coarse particles are distributed and managed using a mutex and threads, parallel calculation efficiency is improved.
Owner:北京智芯仿真科技有限公司