230 results about "Parallel optimization" patented technology

Software performance optimization method based on central processing unit (CPU) multi-core platform

The invention provides a software performance optimization method for CPU multi-core platforms. The method comprises software characteristic analysis, formulation of a parallel optimization scheme, and implementation and iterative tuning of that scheme. Specifically, it covers application software characteristic analysis, serial algorithm analysis, CPU multi-process/multi-thread parallel algorithm design, multi-buffer design, design of inter-thread communication modes, memory access optimization, cache optimization, processor vectorization optimization, mathematical function library optimization, and the like. The method applies broadly to applications that require multi-threaded parallel processing. It guides software developers in parallelizing existing software quickly and efficiently, with short development cycles and low development cost. It optimizes the software's use of system resources, overlaps data reading, computation, and data write-back so that each hides the latency of the others, minimizes software running time, markedly improves hardware resource utilization, and enhances computing efficiency and overall software performance.
Owner: LANGCHAO ELECTRONIC INFORMATION IND CO LTD
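
The multi-buffer design and the overlap of reading, computing, and write-back described in this abstract can be illustrated with a small three-stage pipeline. This is only a minimal sketch and not the patented method: the function name `run_pipeline`, the `n_buffers` parameter, and the end-of-stream marker are assumptions made for illustration. In CPython, threads overlap I/O waits but not CPU-bound work, so a real implementation would likely use native threads or processes.

```python
import queue
import threading

_DONE = object()  # hypothetical end-of-stream marker


def run_pipeline(blocks, compute, n_buffers=2):
    """Overlap reading, computing, and write-back with bounded queues.

    Each queue holds at most `n_buffers` blocks, so the reader can fill
    one buffer while the worker processes another (double buffering
    when n_buffers == 2).
    """
    to_compute = queue.Queue(maxsize=n_buffers)
    to_write = queue.Queue(maxsize=n_buffers)
    results = []

    def reader():
        # Stage 1: "read" blocks and hand them to the compute stage.
        for block in blocks:
            to_compute.put(block)
        to_compute.put(_DONE)

    def worker():
        # Stage 2: compute on each block as soon as it is available.
        while True:
            block = to_compute.get()
            if block is _DONE:
                to_write.put(_DONE)
                return
            to_write.put(compute(block))

    def writer():
        # Stage 3: "write back" results in arrival order.
        while True:
            out = to_write.get()
            if out is _DONE:
                return
            results.append(out)

    threads = [threading.Thread(target=f) for f in (reader, worker, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline([[1, 2], [3, 4]], sum)` processes two blocks through the three stages and returns their sums in order.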

Design optimization method and optimization device of power assembly mounting system

The invention provides a design optimization method and an optimization device for a power assembly mounting system. The method comprises: establishing the differential equations of a spatial six-degree-of-freedom vibration model of the mounting system; analyzing the inherent characteristics of those equations to obtain the natural frequencies, the mode shapes, and the vibration energy coupling among the six degrees of freedom; establishing a multi-objective optimization function from the natural frequency of each order, the mode shapes, and the vibration energy coupling; and carrying out the optimization design with a particle swarm optimization algorithm. Once the dynamic model and the optimization function are established, the multi-objective function takes a reasonable distribution of mounting modal frequencies and the degree of energy decoupling as its targets, and a parallel multi-objective optimization algorithm then yields a set of multi-objective optimization schemes, so that the designed mounting system best meets the performance requirements for energy decoupling and modal distribution.
Owner: BAIC MOTOR CORP LTD
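
To make the particle swarm step concrete, here is a minimal, generic particle swarm optimizer. It is a sketch under stated assumptions, not the patent's algorithm: the objective is a toy weighted sum of two terms standing in for the frequency-distribution and decoupling targets, and the parameter values (`w`, `c1`, `c2`, swarm size) are common textbook choices. Particle evaluations within one iteration are independent of each other, which is what makes the approach amenable to parallel execution; the sketch evaluates them sequentially for simplicity.

```python
import random


def pso(objective, bounds, n_particles=30, n_iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over the box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(x):
    # Hypothetical weighted sum of two targets; not the patent's function.
    return 0.5 * (x[0] - 1.0) ** 2 + 0.5 * (x[1] + 2.0) ** 2
```

Calling `pso(toy_objective, [(-5, 5), (-5, 5)])` should return a point close to `(1, -2)`, the minimizer of the toy objective.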

FPGA parallel acceleration system based on CNN image quality enhancement algorithm

The invention discloses an FPGA parallel acceleration system based on a CNN image quality enhancement algorithm. The system comprises a central processing unit, a DMA controller, a bus module, an accelerator IP core module, an on-chip memory (BRAM), and an off-chip memory (SDRAM). The central processing unit performs fixed-point quantization on the weight data of the trained convolutional neural network model and stores the quantized weight data in the off-chip SDRAM. The DMA controller transfers the weight data pre-stored in the off-chip SDRAM, together with the video image data to be processed, to the on-chip BRAM for block storage. The accelerator IP core module optimizes its operation through multiplier parallel optimization, dimension conversion, pipelined line caching, and a shared ping design. The central processing unit starts the accelerator IP core module, which obtains data from the BRAM to carry out the forward calculation of the network, and the resulting picture is transferred to the off-chip SDRAM. The invention greatly reduces power consumption, balances FPGA resource utilization against operating efficiency, and can meet video image application requirements in real embedded scenarios.
Owner: SOUTHEAST UNIV
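
The fixed-point quantization of weights mentioned in this abstract can be sketched as follows. The bit widths (`frac_bits=8`, `total_bits=16`) and the function names are assumptions for illustration; the abstract does not state which format the system uses.

```python
def quantize(weights, frac_bits=8, total_bits=16):
    """Convert float weights to signed fixed-point integers.

    Values are scaled by 2**frac_bits, rounded (Python's round() uses
    round-half-to-even), and saturated to the signed range of
    `total_bits` bits.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [min(max(round(w * scale), lo), hi) for w in weights]


def dequantize(values, frac_bits=8):
    """Map fixed-point integers back to floats."""
    scale = 1 << frac_bits
    return [v / scale for v in values]
```

For weights that do not saturate, the round-trip error of `dequantize(quantize(w))` is at most half of one fixed-point step, that is `2 ** -(frac_bits + 1)`.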

Ultra-dimension fluvial dynamics self-adapting parallel monitoring method

The invention discloses a self-adaptive parallel monitoring method for super-dimensional river dynamics, comprising the following steps: inputting super-dimensional data into a system and classifying the data by the dimension to which they belong; creating a super-dimensional unstructured-grid river dynamics model based on a characteristic-type high-resolution numerical algorithm; and performing intra-dimensional and inter-dimensional calculations with an efficient parallel algorithm under a super-dimensional fluid splitting scheme. The calculation region is divided into a plurality of sub-regions, each mapped onto a compute node of the parallel system; communication between nodes uses a standard message passing interface; calculation and communication on the self-adaptive grid are overlapped as a parallel optimization technique; and the variables associated with each spatial location are calculated independently. The method places super-dimensional river dynamics on an adaptive grid to execute the splitting scheme efficiently in parallel while handling changes of dimension, and it enables convenient, timely, and highly accurate river monitoring.
Owner: SHENZHEN INST OF ADVANCED TECH

Bimodal fusion tomography method based on iterative shrinkage

The invention belongs to the field of medical molecular imaging and relates to a bimodal fusion method combining autofluorescence tomography and computed tomography, in particular to a bimodal fusion tomography method based on iterative shrinkage. The technology is used to quantify and locate the light source inside a reconstructed target body, and to solve the inverse problem of recovering the full internal light intensity distribution from the limited light intensity distribution measured on the surface of the target body. The main points of the technical scheme are as follows: the surface light intensity information obtained from autofluorescence tomography is fused with the internal geometric structure information obtained from computed tomography; iterative shrinkage converts the complicated multi-dimensional optimization process in reconstruction into an efficient cyclic process of one-dimensional parallel optimizations; and the reconstruction results are accurate and robust with respect to the regularization parameter, the lp norm, noise, and the initial value. The technology can be applied effectively to research on the systemic physiological metabolism of a target body, has high reconstruction efficiency, and is suitable for imaging systems with lower performance.
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI
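
The idea of turning a multi-dimensional optimization into independent one-dimensional steps can be illustrated with the classic iterative shrinkage-thresholding algorithm (ISTA). This is a generic sketch, not the patent's reconstruction method: it assumes an l1 penalty (the abstract refers to an lp norm in general), a small dense matrix `A`, and a step size no larger than the reciprocal of the largest eigenvalue of `A^T A`. The shrinkage step acts on each coordinate separately, so those one-dimensional updates could run in parallel.

```python
def soft_threshold(v, t):
    """Closed-form solution of the 1-D problem min_x 0.5*(x - v)**2 + t*|x|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, b, lam, step, n_iters=500):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 by iterative shrinkage."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iters):
        # Residual r = A x - b and gradient g = A^T r of the smooth term.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Each coordinate is shrunk independently of the others.
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x
```

With `A` equal to the identity the minimizer is simply the soft-thresholded data, which gives an easy sanity check: `ista([[1, 0], [0, 1]], [3.0, 0.1], lam=0.5, step=1.0)` yields a vector close to `[2.5, 0.0]`.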

Fast image interpolation method for mobile terminal

Inactive | CN105023241A | Improve interpolation quality | Smooth interpolation | Geometric image transformation | Video monitoring | Image resolution
The present invention discloses a fast image interpolation method for a mobile terminal. The method comprises the following steps. First, edge detection is performed on a low-resolution original image to obtain edge information; the strength of each edge is calculated from that edge information and a model of the human visual system; the edges are expanded according to their strength; the image is divided into an edge region and a non-edge region according to the expanded edges; the edge region is processed with a relatively high-fidelity interpolation algorithm; and the edge information is stored. Second, the non-edge region is processed with a faster bicubic interpolation algorithm; the edges of the interpolated image are then sharpened using the stored edge information to reduce edge blur and improve the visual quality of the image; and finally the interpolation algorithm is combined with NEON parallel technology to obtain a high-resolution image with parallel optimization. Applying the technology to a mobile video monitoring system shows that it enables multimedia applications on a mobile phone to interpolate high-resolution images smoothly for playback.
Owner: SOUTH CHINA UNIV OF TECH

Method for realizing automatic pipeline parallelism

Inactive | CN101944014A | Balance workload | Added optimization capabilities for automatic parallel optimization | Concurrent instruction execution | Array data structure | Thread scheduling
The invention belongs to the technical field of program compilation and in particular relates to a method for realizing automatic pipeline parallelism. The method mainly comprises the following steps: (1) identification of pipeline parallelism, namely identifying loop structures that carry cross-iteration dependences whose dependence distance vectors are constant; (2) synchronization among threads, namely inserting synchronization according to the dependence distance vectors and deleting redundant synchronization with the same distance vector; and (3) thread scheduling with a static step length, namely a custom thread scheduling strategy that balances the workload of each thread and reduces communication overhead. Identification of the loop structure type relies on conventional array data-flow analysis and dependence tests, and pipeline parallelism handles only regular loop structures with backward cross-iteration dependences. Because the synchronization overhead of pipeline parallelism is high, it is applied only to the outermost loop of a loop nest. The benefit depends on the program: the larger the number of loop iterations and the longer the dependence distance, the greater the performance gain. The method improves the capability of automatic parallel optimization and helps further improve the performance of scientific computing programs.
Owner: FUDAN UNIV
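
The three steps in this abstract (a loop with constant-distance cross-iteration dependences, synchronization between threads, and static scheduling) can be sketched by hand for one concrete loop nest. This is an illustrative example only, not the compiler transformation itself: the recurrence `a[i][j] = a[i-1][j] + a[i][j-1]`, the chunk size, and the cyclic row assignment are all assumptions. Rows of the outermost loop are distributed over threads, and each thread waits on a per-chunk event before reading the row above.

```python
import threading


def pipelined_wavefront(n_rows, n_cols, n_threads=3, chunk=4):
    """Compute a[i][j] = a[i-1][j] + a[i][j-1] with pipeline parallelism.

    Row 0 and column 0 are boundary values fixed at 1. The dependence
    on a[i-1][j] crosses outer-loop iterations with constant distance 1,
    so row i may start chunk c as soon as row i-1 has finished chunk c.
    """
    a = [[1] * n_cols for _ in range(n_rows)]
    n_chunks = (n_cols - 1 + chunk - 1) // chunk
    done = [[threading.Event() for _ in range(n_chunks)]
            for _ in range(n_rows)]
    for ev in done[0]:
        ev.set()  # the boundary row is available from the start

    def worker(tid):
        # Static cyclic schedule: thread tid owns rows tid+1, tid+1+T, ...
        for i in range(1 + tid, n_rows, n_threads):
            for c in range(n_chunks):
                done[i - 1][c].wait()          # synchronize with row above
                lo = 1 + c * chunk
                hi = min(lo + chunk, n_cols)
                for j in range(lo, hi):
                    a[i][j] = a[i - 1][j] + a[i][j - 1]
                done[i][c].set()               # release the row below

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a
```

With these boundary values the recurrence produces binomial coefficients, so `pipelined_wavefront(3, 3)[2][2]` is 6. Synchronizing per chunk rather than per element is one way to keep the synchronization overhead, which the abstract describes as high, in check.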

Layering modeling and optimizing method targeting complicated manufacture system

The invention provides a hierarchical modeling and optimization method for complicated manufacturing systems, used for optimizing the overall target of a hierarchical manufacturing system comprising a plurality of elements. The method comprises the following steps: (1) dividing the manufacturing system into a hierarchical system comprising a plurality of elements, wherein each element comprises a computer and exchanges data with other elements through a computer interface; (2) determining the responses, linking variables, and local variables of all the elements and linking the elements to one another; (3) establishing models of all the elements, each comprising an optimization design model and an analysis model; and (4) solving the optimization problem over the models of the hierarchical manufacturing system. The invention unifies the modeling and optimization of a hierarchical manufacturing system composed of elements. Its advantages include consistency with the topological structure of traditional manufacturing systems, the capability for parallel optimization, and an unlimited number of hierarchy levels. Each element can select its own optimization algorithm, so different optimization algorithms can be integrated into one system.
Owner: SOUTH CHINA UNIV OF TECH

Predicate-based automatic parallel optimizing method

Inactive | CN101944040A | Eliminate data dependencies | Automatic Parallel Optimization Implementation | Program control | Memory systems | Data stream | Array data structure
The invention belongs to the technical field of program compilation and in particular relates to a predicate-based automatic parallel optimization method. The method mainly comprises: (1) a predicate establishment step, which builds a parallel predicate for a program from the various kinds of known information about the user program and removes simple dependences of the program; and (2) a parallel loop structure establishment step, which performs the subsequent parallel analysis and decides, under the restriction of the predicate condition, whether to use the parallel predicate. Parallel predicate establishment is based on traditional array data-flow analysis and loop dependence tests. Establishing predicates eliminates the simple loop dependences caused by imprecise loop information, which widens the analysis range of traditional automatic parallel optimization and improves its effect. During actual execution, if the predicate is not satisfied, the program executes the original serial version, and the added test and jump operations hardly affect overall performance; if the predicate is satisfied, the parallel version of the loop structure is executed, and program performance improves markedly.
Owner: FUDAN UNIV
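
The runtime behavior described in this abstract, a cheap predicate test that selects either the parallel or the original serial version of a loop, can be sketched for one hand-picked loop. The loop `a[i + k] = a[i] + 1`, the predicate, and the function name are assumptions for illustration, and `k` is assumed to be non-negative. A compiler that cannot prove the value of `k` statically could emit a guard of this kind.

```python
from concurrent.futures import ThreadPoolExecutor


def shift_add(a, n, k, workers=4):
    """Run `for i in range(n): a[i + k] = a[i] + 1`, guarded by a predicate.

    Assumes k >= 0 and len(a) >= n + k. Returns which version ran.
    """
    def body(i):
        a[i + k] = a[i] + 1

    # Parallel predicate: iterations are independent when reads a[0:n]
    # and writes a[k:k+n] do not overlap (k >= n), or when every
    # iteration touches only its own element (k == 0).
    independent = (k == 0) or (k >= n)

    if independent:
        # Predicate satisfied: run the parallel version of the loop.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(body, range(n)))
        return "parallel"

    # Predicate not satisfied: fall back to the original serial loop.
    for i in range(n):
        body(i)
    return "serial"
```

The only cost added on the serial path is one comparison and one branch, which matches the abstract's point that the extra test barely affects performance when the predicate fails.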