71 results for "Reduce clock cycles" patented technology

Data encryption and decryption processing method and device based on Montgomery modular multiplication operation

Inactive | CN102207847A | Reduce clock cycles | Improve efficiency | Computations using residue arithmetic | Computer hardware | Montgomery reduction

The invention discloses a data encryption and decryption processing method and device based on Montgomery modular multiplication. During encryption or decryption, the modular exponentiation is converted into Montgomery modular multiplications, each of which is divided into an outer loop and an inner loop. The inner loop mainly performs multiplication, while the outer loop mainly performs reduction. After both loops complete, the result is reduced to below the modulus P if it is found to be larger than P; otherwise it is output directly. In the hardware implementation, the inner and outer loops are designed as a parallel pipelined architecture, which reduces the clock cycles consumed by one modular multiplication and improves overall modular multiplication efficiency, thereby improving the efficiency and speed of data encryption and decryption.

Owner:GCI SCI & TECH
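
The outer/inner loop split and the final conditional subtraction described in the abstract above can be illustrated with a minimal software sketch. This is not the patented hardware design: the function name `montgomery_multiply`, the word size `word_bits`, and the word-serial formulation are assumptions chosen for illustration, and the code requires Python 3.8+ for the modular inverse via `pow`.

```python
def montgomery_multiply(a, b, p, word_bits=8):
    """Return a * b * R^-1 mod p, where R = 2^(word_bits * n) and p is odd.

    Hypothetical word-serial sketch; a and b must already be smaller than p.
    """
    base = 1 << word_bits
    mask = base - 1
    n = (p.bit_length() + word_bits - 1) // word_bits  # number of words in p
    p_inv_neg = (-pow(p, -1, base)) % base             # -p^-1 mod 2^word_bits

    t = 0
    for i in range(n):                                 # outer loop: one reduction step per word of a
        a_i = (a >> (i * word_bits)) & mask
        t += a_i * b                                   # in hardware this is the word-by-word inner loop
        m = ((t & mask) * p_inv_neg) & mask            # makes the low word of t + m*p zero
        t = (t + m * p) >> word_bits                   # exact division by 2^word_bits

    if t >= p:                                         # final conditional reduction below the modulus
        t -= p
    return t
```

With both operands below `p`, the running value stays below `2 * p`, which is why a single conditional subtraction is enough at the end.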

LDPC (low density parity check) decoder and decoding method based on layer decoding processing

Inactive | CN102394661A | Improve throughput | Reduce clock cycles | Error correction/detection using multiple parity bits | Resource consumption | Control signal

The invention relates to the technical field of channel coding and discloses an LDPC (low-density parity-check) decoder based on layered decoding, comprising a node update processing unit, a variable node storage unit, a check node storage unit, an addressing and offset control ROM (read-only memory) configuration group, a cyclic shift network, an output buffer, an address signal controller, and a decoding timing controller. The node update processing unit updates check node and variable node information; the variable node storage unit stores the initialization information and the updated variable node information; the check node storage unit stores the updated check node information; the addressing and offset control ROM configuration group configures the various control signals; the cyclic shift network cyclically shifts the data read from the variable nodes; the output buffer stores the decision bits produced during decoding; the address signal controller addresses the storage units used in the LDPC decoder; and the decoding timing controller fixes the order of operation of the node update processing unit and controls data exchange between the node update processing unit and the storage units. The invention addresses the problems that common decoders consume large hardware resources, have high interconnect complexity, and lack generality.

Owner:BEIJING UNIV OF POSTS & TELECOMM

High frame frequency sun sensor and implementing method thereof

Inactive | CN101363732A | Increase frame rate | Reduce background pixel area | Navigation by astronomical means | High frame rate | Facula

The invention discloses a high-frame-rate sun sensor. The sensor achieves a high data update rate through ROI-based dynamic window tracking, using an image acquisition module, a calculation module, a prediction module and a first control module. The invention further discloses an implementation method for the sensor: the centroid coordinates of the light spot (facula) corresponding to the same aperture in the current and the previous solar spot images are used to predict the position of that spot in the next image, thereby realizing dynamic ROI window tracking. Compared with a sun sensor based on full-frame image acquisition, the high-frame-rate sun sensor greatly reduces the amount of background-pixel data. The sensor and method reduce the amount of data that must be processed to obtain attitude information and improve the data update rate of the sun sensor.

Owner:BEIHANG UNIV
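
The prediction step in the abstract above (using the previous and current spot centroids to place the next ROI window) can be sketched as follows. The abstract does not state the prediction formula, so constant-velocity linear extrapolation is an assumption here, as are the function names `predict_next_centroid` and `roi_window` and the window half-size parameter.

```python
def predict_next_centroid(prev, curr):
    """Extrapolate the next spot centroid, assuming constant velocity between frames.

    prev and curr are (x, y) centroids of the same spot in two consecutive images.
    """
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])


def roi_window(center, half_size, width, height):
    """Return inclusive pixel bounds (x0, y0, x1, y1) of an ROI clipped to the sensor."""
    cx, cy = int(round(center[0])), int(round(center[1]))
    x0 = max(0, cx - half_size)
    y0 = max(0, cy - half_size)
    x1 = min(width - 1, cx + half_size)
    y1 = min(height - 1, cy + half_size)
    return (x0, y0, x1, y1)
```

Reading out only the predicted window instead of the full frame is what cuts the background-pixel data volume; the window size would need to cover the worst-case prediction error.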

Write-once read-many disc internal memory

Active | CN102610269A | Keep reading and writing characteristics | Reduce area | Digital storage | Internal memory | Address decoder

The invention discloses a write-once read-many disc internal memory, comprising a write address decoder, d (d>1) gated clock logic circuits, d first-level main latches, n d-input inter-stage transmission circuits, n second-level sub latches, a read data output path, and a read address decoding and same-address judging module. If the d first-level main latches are negative latches, the n second-level sub latches are positive latches; if the d first-level main latches are positive latches, the n second-level sub latches are negative latches. The memory cell area approaches that of a latch while the read and write characteristics of a register file are retained; that is, the same memory cell can be accessed by any read port and any write port in the same clock cycle.

Owner:INST OF SEMICONDUCTORS - CHINESE ACAD OF SCI

Method for decreasing data access delay in stream processor

Inactive | CN1885283A | Lower latency | Avoid diversion | General purpose stored program computer | Multiple digital computer combinations | Memory hierarchy | Scalar processor

The invention relates to a method for reducing data access delay in a stream processor. It modifies the first level of the stream processor's memory hierarchy by merging the scalar DRAM and the stream DRAM into an off-chip shared memory used by both the scalar processor and the stream processor, uses a new method to transfer data streams between the off-chip shared DRAM and the stream register file (SRF), and uses a synchronization mechanism to handle read-after-write (RAW) dependences. When accessing off-chip data, the scalar processor and the stream processor send requests directly to the bus, obtain bus priority, and send the access address to the DRAM controller; the DRAM controller accesses the off-chip DRAM, obtains the data, and returns it to the scalar processor or the stream processor. The invention avoids SRF overflow caused by overly long streams and avoids moving data several times within the memory space, thereby reducing data access delay.

Owner:NAT UNIV OF DEFENSE TECH

Memory and storage method for video stream pixel-level data random real-time access

Active | CN106201363A | Reduce clock cycles | Quality improvement | Memory architecture accessing/allocation | Input/output to record carriers | Streaming data | Data access

The invention discloses a memory and a storage method for random real-time access to pixel-level video stream data. The memory is built by combining and expanding two groups of QDR2 storage modules, each group comprising two QDR2 storage modules. Each group provides an odd-line storage unit and an even-line storage unit for image data; image data are stored in the odd-line and even-line storage units respectively, and the two groups of storage units respectively store one-half odd frame and one-half even frame and work in parallel in a ping-pong buffer mode, thereby realizing real-time video stream processing. Each image data access uses a burst-of-two read/write mode, with the data spliced into three pixel data structures, each 24 bits wide. Except for the first write of each line, every write repeats the last pixel of the previous write. This redundant pixel storage allows four pixel values to be read or written simultaneously in one pixel clock cycle, which greatly reduces the clock cycles needed for random real-time access to video stream data and lowers memory space usage and cost.

Owner:SICHUAN UNIV

AVS-based motion estimation apparatus and searching method

Inactive | CN101227611A | Improve reusability | Meet real-time requirements | Television systems | Digital video signal modification | Processor register | Computer science

The invention provides an AVS-based motion estimation device and search method. The device contains a calculation and data storage array formed by a calculating unit array and a register array. The search method comprises the following steps: reading the current macroblock pixels and the reference pixels in parallel, line by line, into the calculating unit array and the register array; performing a SAD calculation for each pixel search point of the first line; after the first line is finished, rearranging the pixels in the array and starting the SAD calculation for the second line from the last pixel search point, continuing until that line is finished; and completing the calculation for all reference pixels in the same way. The method reduces repeated reads of pixels and improves the pixel reuse rate, thereby improving coding efficiency.

Owner:CENT ACADEME OF SVA GROUP

Circuit and method for measuring length of single-particle turning transient pulse

Pending | CN107422193A | Save hardware resources | Improve detection accuracy | Pulse characteristics measurements | Computational physics | Engineering

The invention discloses a circuit and a method for measuring the length of a single-particle turning transient pulse using a counter, with the aim of reducing the number of electronic components and improving measurement precision. The circuit comprises a counting circuit, a controller and a clock circuit. The clock circuit provides a counting clock to the counting circuit. The counting circuit receives the signal under test, starts counting upon detecting a first-type transition edge of the signal, and stops counting upon detecting a second-type transition edge, the two edge types being opposite to each other. The controller receives the counting result from the counting circuit. The product of the counting result and the period of the counting clock gives the length of the single-particle turning transient pulse.

Owner:CHENGDU UNIV OF INFORMATION TECH +1
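
The counting scheme in the abstract above reduces to "pulse length = count x counting-clock period". The sketch below models it in software under stated assumptions: the signal is given as one 0/1 sample per counting-clock cycle, the first-type edge is taken to be a rising edge and the second-type a falling edge, and the name `measure_pulse_length` is hypothetical.

```python
def measure_pulse_length(samples, clock_period_ns):
    """Estimate a pulse length from a signal sampled once per counting-clock cycle.

    Counting starts on the first rising edge (0 -> 1) and stops on the
    following falling edge (1 -> 0); the result is count * clock period.
    """
    count = 0
    counting = False
    prev = samples[0]
    for s in samples[1:]:
        if not counting and prev == 0 and s == 1:
            counting = True          # first-type edge: start the counter
        elif counting and prev == 1 and s == 0:
            break                    # second-type edge: stop the counter
        if counting:
            count += 1
        prev = s
    return count * clock_period_ns
```

The resolution of such a measurement is one counting-clock period, so a faster counting clock gives a finer estimate.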

Execution method for microcontroller instruction set

Inactive | CN104090740A | Overcome the problem of complicated coding methods | Simple coding | Memory systems | Machine execution arrangements | Microcontroller | Computer architecture

The invention discloses an execution method for a microcontroller instruction set, comprising the following steps: the microcontroller computes the instruction address, obtaining the address of the instruction to be executed, which is stored in the program counter and has a fixed length of 16 bits; the microcontroller fetches the instruction, obtaining the 16-bit instruction at that address; the microcontroller decodes the instruction, translating the binary opcode field of the 16-bit instruction into an output signal that indicates the function of the instruction to be executed; the microcontroller computes the operand address of the instruction to be executed; one or more operands are obtained through the operand address, and the microcontroller performs the operation specified by the instruction on them; execution ends when no operand is returned, and the operand address computation is carried out again if operands are returned. Because the instruction set uses a fixed 16-bit encoding, instruction fetch speed is improved.

Owner:ANHUI NORMAL UNIV

High-speed and low-delay Berlekamp-Massey iteration decoding circuit for BCH (Bose-Chaudhuri-Hocquenghem) decoder

Active | CN101777922A | Reduce solution delay | Improve data throughput | Cyclic codes | Broadcast channels | Low delay

The invention discloses a high-speed, low-delay Berlekamp-Massey iterative decoding circuit for a BCH decoder. The circuit calculates the odd-numbered syndromes from the received BCH-coded input data; calculates the even-numbered syndromes using a stepwise even-syndrome calculation method and outputs the syndromes in parallel in a defined order through a syndrome sequencing circuit; and finally calculates the error locator polynomial through a parallel iterative decoding circuit. Compared with the prior art, the decoding circuit has very low delay and improves the error-correcting capability and bandwidth of the BCH decoder.

Owner:MEMBLAZE TECH BEIJING

48*30 bit multiplier based on Booth algorithm

Inactive | CN102999311A | Reduce clock cycles | Reduce consumption | Computation using non-contact making devices | Systems design | Binary multiplier

The invention discloses a 48*30-bit multiplier based on the Booth algorithm. The multiplier comprises a bit shift control module, a partial product generator, a linear accumulation controller and a partial product linear accumulator. The multiplier first generates partial products and then accumulates them to realize the 48*30-bit multiplication. The partial products are generated using the radix-4 Booth algorithm and summed linearly, so the number of partial products to be summed is greatly reduced. This reduces the clock cycles required for the multiplication and unifies the algorithms for signed and unsigned multiplication. The design is a compromise between the traditional serial multiplier and the traditional parallel multiplier in terms of bandwidth and hardware resource consumption. Consequently, hardware resource consumption is greatly lowered and cost is saved, while the wide multiplier still meets the bandwidth and speed requirements of the system design.

Owner:张友能
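
To show why radix-4 Booth recoding halves the number of partial products mentioned in the abstract above, here is a minimal software sketch. It is not the patented circuit: the function names `booth_radix4_digits` and `booth_multiply` are hypothetical, the multiplier is treated as a signed two's-complement value of even width, and accumulation is done with Python integers rather than a hardware accumulator.

```python
def booth_radix4_digits(y, bits):
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits, LSB first.

    Each digit is in {-2, -1, 0, 1, 2}; `bits` must be even.
    """
    assert bits % 2 == 0
    y &= (1 << bits) - 1              # two's-complement bit pattern of y
    digits = []
    prev = 0                          # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, bits=30):
    """Multiply x by a signed `bits`-wide y using one partial product per Booth digit."""
    acc = 0
    for i, d in enumerate(booth_radix4_digits(y, bits)):
        acc += (d * x) << (2 * i)     # partial product weighted by 4^i
    return acc
```

A 30-bit multiplier yields 15 digits, so 15 partial products are accumulated instead of 30. An unsigned operand can be handled by the same routine after zero-extending it to a wider even width.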

Encryption, decryption and expansion method and device, encryption and decryption system, and terminal

Pending | CN109617671A | Improve efficiency | Reduce the number of judgments | Encryption apparatus with shift registers/memories | High level techniques | Disk encryption | Extension method

The embodiment of the invention discloses an encryption, decryption and expansion method and device, an encryption and decryption system, and a terminal. The encryption method comprises the following steps: determining the round keys needed by the multi-round iterative operation of an encryption algorithm; and using the round keys to carry out the multi-round iterative operation of the encryption algorithm on the plaintext data to be encrypted, wherein at least two rounds of the iterative operation are completed in one clock cycle. The technical scheme improves encryption and decryption efficiency and improves the attack resistance of the encryption algorithm.

Owner:CHENGDU HAIGUANG INTEGRATED CIRCUIT DESIGN CO LTD

Finite field square calculation circuit

Active | CN103138770A | Reduce solution delay | Improve data throughput | Cyclic codes | Exclusive or | Computer science

The invention discloses a squaring circuit for the Galois field GF(2^13). The finite field squaring circuit comprises a plurality of exclusive-OR gates, thirteen signal input ends and thirteen signal output ends.

Owner:MEMBLAZE TECH BEIJING
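
Squaring in a binary field is a linear map over GF(2), which is why a squaring circuit with thirteen inputs and thirteen outputs can be built from XOR gates alone. The sketch below illustrates this in software. The abstract does not name the field polynomial, so the reduction polynomial x^13 + x^4 + x^3 + x + 1 (`0x201B`) is an assumption, as are the names `gf_reduce`, `gf_square` and `gf_multiply`.

```python
M = 13
POLY = 0x201B  # assumed field polynomial x^13 + x^4 + x^3 + x + 1


def gf_reduce(v):
    """Reduce a polynomial over GF(2) modulo POLY to at most M bits."""
    for bit in range(v.bit_length() - 1, M - 1, -1):
        if (v >> bit) & 1:
            v ^= POLY << (bit - M)    # cancel the term x^bit
    return v


def gf_square(a):
    """Square a in GF(2^13): move bit i to bit 2i, then reduce."""
    spread = 0
    for i in range(M):
        if (a >> i) & 1:
            spread |= 1 << (2 * i)
    return gf_reduce(spread)


def gf_multiply(a, b):
    """General carry-less multiply followed by reduction, used as a reference."""
    prod = 0
    for i in range(M):
        if (b >> i) & 1:
            prod ^= a << i
    return gf_reduce(prod)
```

Because `gf_square(a ^ b)` equals `gf_square(a) ^ gf_square(b)`, each output bit is a fixed XOR of input bits, and those XOR sets depend only on the chosen polynomial.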

Serial transmission device

Inactive | CN102163418A | Reduce clock cycles | Cathode-ray tube indicators | Serial transfer | Data signal

The invention relates to a serial transmission device comprising a plurality of data transmission devices connected in series. Each data transmission device comprises a multi-bit latch, a shift buffer and a switching module. The multi-bit latch provides parallel data. The shift buffer receives data and outputs the parallel data to the multi-bit latch according to a latch signal. The switching module provides the received data signal and the received clock signal to the shift buffer; after a synchronization signal is received, it provides the latch signal to the multi-bit latch before a first clock period (containing a plurality of clocks) ends; and before a second clock period starts, it stops providing the latch signal to the multi-bit latch and provides a synchronization signal to the data transmission device of the next stage.

Owner:STARCHIPS TECH

Port allocation method, device and equipment and readable storage medium

Active | CN111708634A | Work faster | Improve work efficiency | Resource allocation | Data port | Distribution method

The invention discloses a port allocation method. The method comprises: obtaining multiple pieces of data and the port matching information corresponding to them; constructing a matching matrix from the port matching information; determining a target vector in the matching matrix, the target vector being a data vector or a port vector; and allocating a target data port to the target data according to the target vector, updating the matching matrix, and re-determining the target vector until data ports have been allocated to all the data. With this method, data and ports with poor matching capability are matched first, and the remaining data and ports are matched by updating the matching matrix and determining a new target vector, until all the data are assigned to data ports that meet the requirements. Since no backtracking is needed, the final allocation scheme is determined faster. The invention further provides a device, equipment and a computer-readable storage medium with the same beneficial effects.

Owner:SHANDONG YUNHAI GUOCHUANG CLOUD COMPUTING EQUIP IND INNOVATION CENT CO LTD
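
The "match the least flexible items first, then update the matrix" idea in the abstract above can be sketched as a greedy loop. This is a simplified illustration, not the patented procedure: it only selects data rows as the target vector (the abstract also allows port vectors), and the function name `allocate_ports`, the 0/1 matrix layout, and the tie-breaking rules are assumptions.

```python
def allocate_ports(match):
    """Greedy, backtracking-free port allocation (simplified sketch).

    match[d][p] is 1 if data item d may use port p. Returns {data: port}.
    Raises ValueError if some data item is left without a usable port.
    """
    free_data = set(range(len(match)))
    free_ports = set(range(len(match[0])))
    assignment = {}
    while free_data:
        # Target vector: the data row with the fewest remaining usable ports.
        best = None
        for d in sorted(free_data):
            options = [p for p in sorted(free_ports) if match[d][p]]
            if best is None or len(options) < len(best[1]):
                best = (d, options)
        d, options = best
        if not options:
            raise ValueError(f"no free port left for data item {d}")
        # Prefer the port that the fewest remaining data items could use.
        p = min(options, key=lambda q: sum(match[e][q] for e in free_data))
        assignment[d] = p
        free_data.remove(d)       # "updating the matrix": drop the matched row
        free_ports.remove(p)      # and the matched column
    return assignment
```

A greedy pass like this never revisits an earlier choice, which is where the speed comes from; this simplified version can fail on inputs where a full matching exists, so it is only an illustration of the selection order.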

Circuit time sequence optimization method based on register flexible time sequence library

Active | CN113673193A | Fully reflect timing characteristics | Improve performance | Computer aided design | Special data processing applications | Capacitance | Static timing analysis

The invention discloses a circuit timing optimization method based on a flexible timing library for registers. First, a register is simulated under multiple combinations of input signal transition time, clock signal transition time and register load capacitance, and the corresponding actual propagation delay is obtained while varying the register's setup slack and hold slack. Linear interpolation then yields the actual propagation delay of the register for a specific input signal transition time, clock signal transition time, register load capacitance, setup slack and hold slack, thereby establishing the flexible timing library of the register. Static timing analysis is then performed on all register paths in the circuit using this library, and by varying the setup slack and hold slack of the registers, the minimum clock period for which both the setup time margin and the hold time margin are greater than zero is found. Circuit performance is thereby improved without changing the circuit design or increasing circuit area overhead.

Owner:SOUTHEAST UNIV +1

Multilevel folding-interpolation type analog-digital converter and decoding method thereof

Active | CN106656189A | Reduce clock cycles | Simplify complexity | Analogue-digital converters | Decoding methods | Production line

The invention discloses a multilevel folding-interpolation ADC (analog-digital converter) and a decoding method thereof. In the decoding method, the current-level decoding structure performs folding-interpolation on the folding curve of the previous-level decoding structure, and a weight adder multiplies the output value of the previous-level decoding structure by an odd-multiple weight; an inter-level adder sums the decoding result of the current-level decoding structure and the output value of the current-level weight adder; the sum, taken as the output value of the current-level decoding structure, is passed to the next-level decoding structure, and the output value of the last-level decoding structure is the analog-digital conversion result. The decoding method uses a multilevel pipeline architecture and decodes in pipelined form. Final quantization is completed by multiplying each level's decoding result by its corresponding weight, adding the results together, and passing the sum through decimal-to-binary conversion logic. Because a pipeline is used, decoding cycles are saved, the complexity of the decoding circuit is greatly simplified, and the problem that decoding is too difficult because an odd number cannot be reduced to the form 2^n is solved efficiently.

Owner:INST OF MICROELECTRONICS CHINESE ACAD OF SCI

Two-step incremental analog-digital converter and two-step conversion method

Active | CN106130556A | Small size | Low bandwidth | Analogue-digital converters | Integrator | Digital down converter

The invention provides a two-step incremental analog-digital converter comprising a sigma-delta modulator and a decimation filter. The sigma-delta modulator is composed of an adding circuit, N-1 cascaded integrators, step-selection integrators, a comparator, multiplexers and a digital-analog converter; the decimation filter is composed of an N-order digital filter and a decimator. One input of the first multiplexer is connected to the input signal Vin, and its output is connected to the adding circuit. The output of the adding circuit is connected to the N-1 cascaded integrators, whose output is connected to the input of the second multiplexer. The output of the second multiplexer is connected to the first and second step-selection integrators, whose outputs are connected to the input of the third multiplexer; the output of the first step-selection integrator is also connected to the other input of the first multiplexer. The output of the third multiplexer is connected to the comparator, and the output of the comparator is connected to the adding circuit and the decimation filter. The converter offers high precision, low power consumption, fast conversion and a high signal-to-noise ratio.

Owner:KUNMING INST OF PHYSICS

Data access method, device and system and AI accelerator

Active | CN113448624A | Improve performance | Read Control Difficulty Reduced | Digital data processing details | Energy efficient computing | Concurrent computation | Access method

The invention discloses a data access method, device and system and an AI accelerator. The method comprises the following steps: reading to-be-calculated data stored in a memory; writing the read data into a preset cache block group based on the number of multipliers in the processing array, so that the data of the same write-in address of each cache block group corresponds to different read addresses in a memory, and the data stored in each cache block group is completely different; and enabling the processing array to read the to-be-calculated data from the preset cache block group for parallel calculation. By adopting the scheme, the overall performance of the AI accelerator can be improved.

Fishbone-shaped clock tree and implementation method

Active | CN113723046A | Reduce buffer unit | Reduce power consumption | Computer aided design | Energy efficient computing | Computer architecture | Parallel computing

The invention belongs to the technical field of clock trees and discloses a fishbone-shaped clock tree and an implementation method. The fishbone-shaped clock tree comprises a main clock tree and a plurality of sub-clock trees, wherein the main clock tree is led out from a PLL, the sub-clock trees are led out from the main clock tree, and each sub-clock tree is provided with a plurality of clock branch points; each clock branch point serves as the source from which a traditional clock tree is built. The beneficial effects are as follows: by establishing the main clock tree and the sub-clock trees, a fishbone-shaped clock tree structure is formed, so that the buffer units between the PLL and the chip sub-modules are reduced and chip power consumption is lowered; in addition, the main clock tree and the sub-clock trees act as a common path, so on-chip variation (OCV) takes up only a small part of the clock cycle and final timing closure is easier.

Owner:广芯微电子(广州)股份有限公司

H.264 protocol based optimized decoding method for intra-frame coding compression technology

Inactive | CN102186082A | Simple structure | Reduce clock cycles | Television systems | Digital video signal modification | Intra-frame | Computer engineering

The invention discloses an optimized decoding method, based on the H.264 protocol, for intra-frame coding compression. Following the principle of intra-frame predictive coding, the optimized decoding method is applied at the decoding end. The design aims to save clock cycles and simplify the computing unit, and decodes the input code stream to restore the pixels of the original image. After the predictive coding characteristics of two macroblocks and the algorithm are studied and analyzed, the process is simplified and reasonably arranged, and parallel decoding is adopted according to the positions of the macroblocks and the way reference pixels are used during decoding, so that accurate and efficient image decoding is achieved. The method has the following advantages: 1. clock cycles are saved; 2. real-time decoding requirements can be met; 3. the computing unit and the circuit structure are simplified, which facilitates updating; and 4. the decoding process is accelerated by a parallel decoding mechanism.

Owner:HARBIN INST OF TECH

Fast inter-core data synchronization method for multi-core parallel computing

Pending | CN111459872A | Reduce the number of operations | Reduce clock cycles | Digital computer details | Electric digital data processing | Data buffer | Concurrent computation

The invention provides a fast inter-core data synchronization method for multi-core parallel computing, which comprises the following steps: configuring a buffer area and a buffer area data structure between every two directly connected cores, with a buffer write indicator and a buffer read indicator in each buffer area data structure; initializing the buffer write indicator and the buffer read indicator; having all the cores execute the steps of reading the inter-core data and writing the inter-core data at the same time; and repeating the read and write steps. Because the buffer write indicator and the buffer read indicator are configured together, the prefetcher and the data cache are operated less often, which improves the efficiency of the multi-core parallel computing scheduling method while still guaranteeing correct data transmission.

Owner:SHANGHAI INST OF MICROSYSTEM & INFORMATION TECH CHINESE ACAD OF SCI
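
The buffer with a write indicator and a read indicator described in the abstract above resembles a single-producer, single-consumer ring buffer. The following is a minimal sketch under that assumption; the class name `CoreChannel`, its methods, and the one-slot-empty convention are hypothetical and are not taken from the patent. It models only the index logic, not the memory ordering or cache behavior that real inter-core code would need.

```python
class CoreChannel:
    """Hypothetical one-directional buffer between two directly connected cores."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.write_idx = 0  # advanced only by the producing core
        self.read_idx = 0   # advanced only by the consuming core

    def try_write(self, value):
        """Store value if a slot is free; return True on success."""
        next_write = (self.write_idx + 1) % len(self.slots)
        if next_write == self.read_idx:  # full: one slot is always left empty
            return False
        self.slots[self.write_idx] = value
        self.write_idx = next_write
        return True

    def try_read(self):
        """Return (True, value) if data is available, else (False, None)."""
        if self.read_idx == self.write_idx:  # empty
            return (False, None)
        value = self.slots[self.read_idx]
        self.read_idx = (self.read_idx + 1) % len(self.slots)
        return (True, value)
```

Because each indicator is updated by only one side, neither core needs a lock to decide whether the buffer is full or empty; it only compares the two indicators.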

Data encryption and decryption processing method and device based on Montgomery modular multiplication operation

Inactive | CN102207847B | Reduce clock cycles | Improve efficiency | Computations using residue arithmetic | Computer hardware | Montgomery reduction

The invention discloses a data encryption and decryption processing method and device based on Montgomery modular multiplication. During encryption or decryption, the modular exponentiation is transformed into Montgomery modular multiplications, each of which is divided into an outer loop and an inner loop. The inner loop mainly performs the multiplication, while the outer loop mainly performs the reduction. After both loops complete, the result is reduced to a value smaller than the modulus P if it is found to be larger than P; otherwise it is output directly. In the hardware implementation, the inner loop and the outer loop are designed as a parallel pipeline architecture, which reduces the clock cycles consumed by one modular multiplication and improves overall modular multiplication efficiency, thereby improving the efficiency and speed of data encryption and decryption.

Owner:GCI SCI & TECH
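
To make the loop structure in the abstract above concrete, here is a minimal software sketch of word-serial Montgomery multiplication. It is the textbook algorithm, not the patented pipeline: the function name, the 16-bit word size, and the word count are assumptions chosen for illustration, and the inner loop over the words of `b` is collapsed into a single big-integer multiply.

```python
def montgomery_multiply(a, b, p, word_bits=16, num_words=4):
    """Return a * b * R^-1 mod p, where R = 2**(word_bits * num_words).

    Assumes p is odd and 0 <= a, b < p < R.
    """
    mask = (1 << word_bits) - 1
    # -p^-1 mod 2**word_bits; pow(x, -1, m) needs Python 3.8 or later.
    p_neg_inv = (-pow(p, -1, 1 << word_bits)) & mask
    t = 0
    for i in range(num_words):                # outer loop: one word of a per pass
        a_word = (a >> (word_bits * i)) & mask
        t += a_word * b                       # multiplication step (inner loop in hardware)
        m = ((t & mask) * p_neg_inv) & mask   # makes the low word of t + m*p zero
        t = (t + m * p) >> word_bits          # reduction step: exact division by 2**word_bits
    if t >= p:                                # final conditional subtraction
        t -= p
    return t
```

The result stays below 2p after every pass, so a single comparison against the modulus at the end is enough, which matches the final correction step the abstract describes.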

Eight-point Winograd Fourier convertor avoiding rearrangement

Inactive | CN107168928A | Reduce consumption | Low cost | Complex mathematical operations | Diagonal matrix | Euclidean vector

The invention relates to an eight-point Winograd Fourier transformer that requires no reordering. The transformer consists mainly of an input matrix I, a variable diagonal matrix A, an output matrix O, and complex multipliers M1 to M3. Complex multiplier M1 multiplies the input matrix I by the input vector v to obtain the vector p; M2 multiplies the variable diagonal matrix A by p to obtain the vector q; and M3 multiplies the output matrix O by q to obtain the output vector V. The invention eliminates the reordering operation that the eight-point Winograd Fourier transform otherwise requires within the N-point in-order-index prime factor algorithm, which simplifies the control logic, increases operation speed, reduces memory consumption, and lowers hardware cost.

Owner:RONGCHENG DINGTONG ELECTRONICS INFORMATION SCI & TECH CO LTD

Lookup table type convolution operation hardware structure based on FPGA

Inactive | CN110059814A | Easy to operate | Lower latency | Neural architectures | Architecture with single central processing unit | Hardware structure | FPGA implementations

The invention relates to a lookup-table-based convolution hardware structure for a field programmable gate array (FPGA). The N multiplications in a convolution are decomposed: the N pieces of M-bit data with the same offset are added, and the sums are stored in lookup tables, forming M lookup tables. Once the lookup table results are obtained, they are sent to an adder tree with shift operations, which computes the convolution result. Compared with a traditional method, the structure saves more than 50% of LUT resources in an FPGA implementation, and it is easy to deploy and convenient to reuse.

Owner:SUN YAT SEN UNIV
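
One common way to replace multipliers with lookup tables and a shift-and-add tree is distributed arithmetic. The sketch below assumes that reading of the abstract above; the patent's exact table layout may differ. The function names are hypothetical, the weights are fixed integers, and the inputs are unsigned M-bit values.

```python
def build_lut(weights):
    """Table entry `addr` holds the sum of the weights whose bit is set in addr."""
    n = len(weights)
    return [
        sum(w for i, w in enumerate(weights) if (addr >> i) & 1)
        for addr in range(1 << n)
    ]


def lut_dot(lut, inputs, m_bits):
    """Compute sum(weights[i] * inputs[i]) without any multiplication."""
    acc = 0
    for bit in range(m_bits):
        # Gather bit number `bit` of every input into one table address.
        addr = 0
        for i, x in enumerate(inputs):
            addr |= ((x >> bit) & 1) << i
        acc += lut[addr] << bit  # shift-and-add stage (the adder tree)
    return acc
```

For example, with weights `[3, -1, 4, 2]` and 8-bit inputs `[5, 7, 0, 255]`, `lut_dot(build_lut([3, -1, 4, 2]), [5, 7, 0, 255], 8)` performs eight table lookups, one per bit position, in place of four multiplications.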

Information processing device

Inactive | CN101213514A | Avoid the inconvenience of repeated access | Reduce memory access time | Next instruction address formation | Concurrent instruction execution | Information processing | Parallel computing

The information processing device of the present invention has an access unit that controls access to the memory corresponding to the address space to which an address belongs, where the address is generated from at least two pieces of address generation source information. The device further comprises: a prediction unit that predicts, from the address generation source information, one or more address spaces to which the address of the access target may belong; a start unit that starts access to the memories corresponding to all the address spaces predicted by the prediction unit; a judging unit that determines the address space to which the access-target address, generated from the at least two pieces of address generation source information, actually belongs; and an access stop unit that stops every access started by the start unit other than the access to the address space determined by the judging unit.

Owner:PANASONIC CORP

Random access memory initialization

Inactive | CN100452232C | Avoid reset | Reduce clock cycles | Data switching by path configuration | Digital storage | Memory address | Random access memory

The invention discloses a random access memory initialization method and circuit. The circuit includes a memory with a set of address spaces to which tag data is written and from which it is read, each address space having a tag status bit, and a tag generator that allocates tag data to the memory address spaces. When a reset occurs, a logic circuit supplies a logic "0" to the tag generator to disable checking of the status bits, so that all tags can be issued sequentially. New tag data is allocated to the address spaces in sequence, and the corresponding status bits are updated to, or kept at, logic "1". Once every address space has been allocated, the logic circuit supplies the actual state of the status bits to the tag generator to control subsequent allocations. The method and circuit avoid resetting all the status bits in the memory and thereby save the clock cycles that would be needed to initialize them.

Owner:INT BUSINESS MASCH CORP
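
The allocation behavior in the abstract above can be modeled in a few lines. This is a behavioral sketch only, assuming one status bit per address space; the class `TagAllocator`, its method names, and the use of the address index as the tag are hypothetical and do not describe the actual circuit.

```python
class TagAllocator:
    """Hypothetical model: tags are issued without first clearing the status bits."""

    def __init__(self, power_on_status_bits):
        # After reset the status bits hold arbitrary values; they are not cleared.
        self.in_use = list(power_on_status_bits)
        self.first_pass_next = 0  # next address to hand out during the first pass

    def allocate(self):
        """Return a free tag, or None if every address space is in use."""
        size = len(self.in_use)
        if self.first_pass_next < size:
            # First pass after reset: ignore the status bits, issue sequentially.
            tag = self.first_pass_next
            self.first_pass_next += 1
            self.in_use[tag] = True
            return tag
        # Every address has been issued once, so the status bits are now valid.
        for tag in range(size):
            if not self.in_use[tag]:
                self.in_use[tag] = True
                return tag
        return None

    def release(self, tag):
        self.in_use[tag] = False
```

The first pass writes every status bit as a side effect of normal allocation, so no separate loop is needed to clear the bits after reset.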

VLSI system structure of bit plane encoder

Inactive | CN101488225B | Reduce access | Reduce clock cycles | Image coding | Television systems | On column | Computer architecture

The invention discloses a VLSI architecture for a bit plane coder. It comprises a master controller, a pass membership decision unit, a register controller, a register set, a coding module for each coding pass, a coding primitive module, and other modules. The architecture implements the coding primitive operations as combinational circuits to speed up generation of context-decision pairs, implements the run-length coding logic directly inside the cleanup pass module to simplify the circuit structure, and uses a column-based coefficient skipping method when scanning each pass, which greatly reduces the clock cycles wasted during scanning and so increases coding speed.

Owner:SHANDONG UNIV

Modular operation method with variable bit width, and modular operation circuit

Pending | CN113723035A | Flexible and configurable | Improve efficiency | Digital data processing details | CAD circuit design | Algorithm | Hemt circuits

The invention relates to a modular operation method with variable bit width and a corresponding modular operation circuit. First, the modulus is shifted left; the shifted results and the modulus itself, together with the bit width of the operand, define N+1 intervals ordered from largest to smallest. When the shift step is one bit, the minimum value of the first interval is subtracted from the operand and the result is compared with that value. While the result is still greater than the value, it replaces the operand and the subtraction continues until the result is smaller than the value; the next interval is then processed. If the operand is already smaller than the value at the start, the method moves directly to the next interval. This repeats until the last interval, where a final subtraction and comparison complete the modular operation. When the shift step is two or more bits, the interval containing the operand is determined first, the operand jumps to that interval, and subtraction and comparison are then performed on the subsequent intervals in the same way. The corresponding circuit is configured with a plurality of shifters, registers, subtracters, and selectors.

Owner:XI AN JIAOTONG UNIV
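
The interval-by-interval subtraction in the abstract above can be illustrated with a shift-and-subtract reduction. This is a sketch of the general technique rather than the patented circuit: the function name and the `operand_bits` and `step` parameters are assumptions, and the interval bounds are taken to be the modulus shifted left by multiples of `step`.

```python
def mod_shift_subtract(x, p, operand_bits, step=1):
    """Return x mod p using only shifts, comparisons, and subtractions.

    Assumes p > 0 and 0 <= x < 2**operand_bits.
    """
    if p <= 0 or x < 0:
        raise ValueError("expected p > 0 and x >= 0")
    # Largest left shift needed for the shifted modulus to cover the operand width,
    # rounded up to a multiple of the step size.
    max_shift = max(operand_bits - p.bit_length(), 0)
    max_shift = -(-max_shift // step) * step
    for shift in range(max_shift, -1, -step):  # intervals from largest to smallest
        bound = p << shift                     # minimum value of this interval
        while x >= bound:                      # subtract until below the bound
            x -= bound
    return x
```

With `step=1` each interval needs at most one subtraction; a larger step means fewer intervals but up to `2**step - 1` subtractions in each, which is the trade-off behind a configurable shift width.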

Method and device for processing video data stream, equipment and storage medium

Active | CN112995237A | Improve computing efficiency | Reduce clock cycles | Digital video signal modification | Transmission | Data pack | Concurrent computation

The invention provides a method and device for processing a video data stream, together with equipment and a storage medium. The method can include, but is not limited to, at least one of the following steps: reading a video elementary stream that comprises a plurality of network abstraction layer units; segmenting each network abstraction layer unit into first data packets, each of which comprises a plurality of source symbols; taking a first preset number of first data packets as a packet group and obtaining a second preset number of source symbols from each first data packet in the group; and encoding the second preset number of source symbols in parallel to obtain a second data packet. The method optimizes the encoding and decoding processes through both parallel computing and an improved data access pattern, which increases encoding and decoding speed and addresses the problem of slow encoding and decoding.

Owner:杭州博雅鸿图视频技术有限公司
