232 results for patented technology related to "Speedup"

In computer architecture, speedup is a number that measures the relative performance of two systems processing the same problem. More technically, it is the improvement in speed of execution of a task executed on two similar architectures with different resources. The notion of speedup was established by Amdahl's law, which was particularly focused on parallel processing. However, speedup can be used more generally to show the effect on performance after any resource enhancement.
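For reference, the two quantities invoked throughout the abstracts below can be written compactly. This is a recap of the standard textbook definitions, not material from any of the patents:

```latex
% Speedup of a system after a resource enhancement, relative to the baseline:
S = \frac{T_{\text{old}}}{T_{\text{new}}}

% Amdahl's law: if a fraction p of the execution time benefits from the
% enhancement (e.g. runs on n parallel processors), the overall speedup is
S(n) = \frac{1}{(1 - p) + \frac{p}{n}},
\qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - p}.
```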

RRGS-round-robin greedy scheduling for input/output terabit switches

A novel protocol for scheduling of packets in high-speed cell-based switches is provided. The switch is assumed to use a logical crossbar fabric with input buffers. The scheduler may be used in optical as well as electronic switches with terabit capacity. The proposed round-robin greedy scheduling (RRGS) achieves optimal scheduling at terabit throughput using a pipeline technique. The pipeline approach avoids the need for internal speedup of the switching fabric to achieve high utilization. A method for determining a time slot in an N×N crossbar switch for a round-robin greedy scheduling protocol, comprising N logical queues corresponding to N output ports, the input for the protocol being the state of all the input-output queues and the output of the protocol being a schedule, the method comprising: choosing the input corresponding to i=(constant-k-1) mod N, stopping if there are no more inputs, otherwise choosing the next input in a round-robin fashion determined by i=(i+1) mod N; choosing an output j such that the pair (i,j) belongs to the set C={(i,j) | there is at least one packet from i to j}, if such a pair exists; removing i from the set of inputs and repeating the steps if no such pair exists; otherwise removing i from the set of inputs and j from the set of outputs, adding the pair (i,j) to the schedule, and repeating the steps.
Owner:NEC CORP
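The step sequence in this abstract is essentially one greedy round-robin matching pass per time slot. Below is a minimal Python sketch of such a pass; the queue-state representation, the order in which free outputs are scanned, and the omission of the pipelining across time slots are simplifying assumptions, so this illustrates the idea rather than the patented implementation.

```python
def rrgs_round(n, has_packet, constant, k):
    """One greedy round-robin matching pass for an n x n crossbar.

    has_packet[i][j] is True when input i holds at least one packet
    destined to output j (the set C in the abstract). Returns a list
    of (input, output) pairs forming the schedule for this time slot.
    """
    schedule = []
    free_outputs = list(range(n))
    start = (constant - k - 1) % n        # starting input for this pass
    for step in range(n):                 # visit each input once, round-robin
        i = (start + step) % n
        # pick the first still-free output with a packet queued from input i
        j = next((o for o in free_outputs if has_packet[i][o]), None)
        if j is None:
            continue                      # input i stays unmatched this slot
        free_outputs.remove(j)
        schedule.append((i, j))
    return schedule

# Example: 3x3 switch where input 0 has traffic for outputs 1 and 2, etc.
demand = [[False, True, True],
          [True, False, False],
          [False, True, False]]
print(rrgs_round(3, demand, constant=0, k=0))
```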

Method and apparatus to schedule packets through a crossbar switch with delay guarantees

A method for scheduling cell transmissions through a switch with rate and delay guarantees and with low jitter is proposed. The method applies to a classic input-buffered N×N crossbar switch without speedup. The time axis is divided into frames each containing F time-slots. An N×N traffic rate matrix specifies a quantized guaranteed traffic rate from each input port to each output port. The traffic rate matrix is transformed into a permutation with NF elements which is decomposed into F permutations of N elements using a recursive and fair decomposition method. Each permutation is used to configure the crossbar switch for one time-slot within a frame of size F time-slots, and all F permutations result in a Frame Schedule. In the frame schedule, the expected Inter-Departure Time (IDT) between cells in a flow equals the Ideal IDT and the delay jitter is bounded and small. For fixed frame size F, an individual flow can often be scheduled in O(log N) steps, while a complete reconfiguration requires O(N log N) steps when implemented in a serial processor. An RSVP or Differentiated Services-like algorithm can be used to reserve bandwidth and buffer space in an IP-router, an ATM switch or MPLS switch during a connection setup phase, and the proposed method can be used to schedule traffic in each router or switch. Best-effort traffic can be scheduled using any existing dynamic scheduling algorithm to fill the remaining unused switch capacity within each Frame. The scheduling algorithm also supports multicast traffic.
Owner:SZYMANSKI TED HENRYK
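The core of the method is the decomposition of the quantized traffic rate matrix into F crossbar configurations, one per time slot. The sketch below uses a plain greedy, Birkhoff-von-Neumann-style pass instead of the patent's recursive and fair decomposition, so it only illustrates the frame-schedule structure; the function and parameter names are invented for the example, and the greedy pass does not carry the patent's jitter or completeness guarantees.

```python
import numpy as np

def greedy_frame_schedule(rates, F):
    """Greedily decompose a quantized N x N rate matrix into F partial
    permutations, one crossbar configuration (time slot) per entry.

    Simplified illustration only: the patented recursive, fair
    decomposition additionally guarantees that every granted slot fits
    in the frame and that jitter stays bounded.
    """
    remaining = np.array(rates, dtype=int)
    n = remaining.shape[0]
    frame = []
    for _ in range(F):
        used_in, used_out, slot = set(), set(), {}
        # serve the largest remaining demands first in this time slot
        for i, j in sorted(np.ndindex(n, n), key=lambda ij: -remaining[ij]):
            if remaining[i, j] > 0 and i not in used_in and j not in used_out:
                slot[i] = j               # connect input i to output j
                used_in.add(i)
                used_out.add(j)
                remaining[i, j] -= 1
        frame.append(slot)
    return frame

# A flow granted r slots per frame of F slots has an ideal
# inter-departure time of F / r time slots between consecutive cells.
```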

Nonblocking and deterministic multirate multicast packet scheduling

A system for scheduling multirate multicast packets through an interconnection network having a plurality of input ports, a plurality of output ports, and a plurality of input queues comprising multirate multicast packets with rate weights at each input port, is operated in a nonblocking manner in accordance with the invention by scheduling, corresponding to the packet rate weight, at most as many packets as the number of input queues from each input port to each output port. The scheduling is performed so that each multicast packet is fan-out split through not more than two interconnection networks and not more than two switching times. The system operates at 100% throughput, is work conserving and fair, and yet is deterministic, thereby never congesting the output ports. The system performs arbitration in only one iteration, with the mathematical minimum speedup in the interconnection network. The system operates with absolutely no packet reordering issues and no internal buffering of packets in the interconnection network, and hence in a truly cut-through and distributed manner. In another embodiment each output port also comprises a plurality of output queues, and each packet is transferred, corresponding to the packet rate weight, to an output queue in the destined output port in a deterministic manner and without the requirement of segmentation and reassembly of packets even when the packets are of variable size. In one embodiment the scheduling is performed in a strictly nonblocking manner with a speedup of at least three in the interconnection network. In another embodiment the scheduling is performed in a rearrangeably nonblocking manner with a speedup of at least two in the interconnection network. The system also offers end-to-end guaranteed bandwidth and latency for multirate multicast packets from input ports to output ports. In all the embodiments, the interconnection network may be a crossbar network, shared memory network, Clos network, hypercube network, or any internally nonblocking interconnection network or network of networks.
Owner:TEAK TECH

A GEMM (general matrix-matrix multiplication) high-performance realization method based on a domestic SW 26010 many-core CPU

Active · CN107168683A · Solve the problem that the computing power of slave cores cannot be fully utilized · Improve performance · Register arrangements · Concurrent instruction execution · Function optimization · Assembly line
The invention provides a GEMM (general matrix-matrix multiplication) high-performance realization method based on the domestic SW 26010 many-core CPU. For the domestic SW 26010 many-core processor, based on the platform characteristics of its storage structure, memory access, hardware pipelines and register-level communication mechanisms, a matrix partitioning and inter-core data mapping method is optimized and a top-down three-level partitioning parallel block matrix multiplication algorithm is designed; a slave-core computing-resource data sharing method is designed based on the register-level communication mechanisms, and a computation and memory-access overlapping double-buffering strategy is designed using the master-slave-core asynchronous DMA data transmission mechanism; for a single slave core, a loop unrolling strategy and a software pipelining arrangement method are designed; function optimization is achieved using a highly efficient register blocking scheme together with SIMD vectorization and multiply-add instructions. Compared with the single-core open-source BLAS math library GotoBLAS, the high-performance GEMM achieves an average speedup ratio of 227.94 and a highest speedup ratio of 296.93.
Owner:INST OF SOFTWARE - CHINESE ACAD OF SCI +1
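The three-level blocking at the heart of the method can be pictured with a simple host-side tiled loop nest. The sketch below is only that picture: the block sizes are arbitrary placeholders, and none of the SW 26010 specifics described above (slave-core tiles, register-level communication, asynchronous DMA double buffering, assembly kernels) are modelled.

```python
import numpy as np

def blocked_gemm(A, B, C, mb=64, nb=64, kb=64):
    """C += A @ B using a simple blocked (tiled) loop nest.

    Host-side illustration of the blocking idea only; the patented
    implementation operates on SW 26010 slave cores with register-level
    communication, DMA double buffering and hand-tuned assembly kernels,
    which are not reproduced here. Block sizes mb, nb, kb are arbitrary.
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2 and C.shape == (M, N)
    for i0 in range(0, M, mb):            # level 1: row panels of C
        for j0 in range(0, N, nb):        # level 2: column panels of C
            for k0 in range(0, K, kb):    # level 3: inner (reduction) dimension
                C[i0:i0 + mb, j0:j0 + nb] += (
                    A[i0:i0 + mb, k0:k0 + kb] @ B[k0:k0 + kb, j0:j0 + nb]
                )
    return C

# Quick check against NumPy's reference result:
A = np.random.rand(200, 150)
B = np.random.rand(150, 180)
C = np.zeros((200, 180))
assert np.allclose(blocked_gemm(A, B, C), A @ B)
```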

Composite type device utilizing ocean wave energy for generating electricity

Active · CN101614180A · Increase profit · Good power generation stability · Machines/engines · Engine components · Electricity · Impeller
The invention provides a composite type device utilizing ocean wave energy for generating electricity, comprising a first energy acquisition device, a second energy acquisition device and an electricity generation device. The first energy acquisition device comprises a floating body, a vane wheel, a bracket, a speedup transmission device and a supporting platform; the second energy acquisition device comprises an air cylinder, an air intake one-way valve, an air exhaust one-way valve, an air storage tank, a steam turbine, a fixed platform, etc. The invention is characterized in that: firstly, the wave-facing surface of the ocean is provided with the floating body on which the vane wheel is fixed; the movement of the waves propels the blades to rotate, driving the intermediate shaft of the vane wheel to rotate with them, and the generator is driven to generate electricity through the speedup transmission device. Secondly, when the floating body floats upwards, air is drawn into the air cylinder by driving a connecting rod and a piston; when the floating body falls, the air in the air cylinder is compressed into the air storage tank. When the air pressure in the air storage tank reaches a certain value, the pressure valve opens automatically to push the steam turbine to rotate, which in turn drives the generator to generate electrical energy. The device applies two energy acquisition methods, making doubly efficient use of the ocean for generating electricity, with good economic benefit.
Owner:ADVANCED MFG TECH CENT CHINA ACAD OF MASCH SCI & TECH

Nonblocking and deterministic unicast packet scheduling

A system for scheduling unicast packets through an interconnection network having a plurality of input ports, a plurality of output ports, and a plurality of input queues comprising unicast packets at each input port, is operated in a nonblocking manner in accordance with the invention by scheduling at most as many packets as the number of input queues from each input port to each output port. The system operates at 100% throughput, is work conserving and fair, and yet is deterministic, thereby never congesting the output ports. The system performs arbitration in only one iteration, with the mathematical minimum speedup in the interconnection network. The system operates with absolutely no packet reordering issues and no internal buffering of packets in the interconnection network, and hence in a truly cut-through and distributed manner. In another embodiment each output port also comprises a plurality of output queues, and each packet is transferred to an output queue in the destined output port in a nonblocking and deterministic manner and without the requirement of segmentation and reassembly of packets even when the packets are of variable size. In one embodiment the scheduling is performed in a strictly nonblocking manner with a speedup of at least two in the interconnection network. In another embodiment the scheduling is performed in a rearrangeably nonblocking manner with a speedup of at least one in the interconnection network. The system also offers end-to-end guaranteed bandwidth and latency for packets from input ports to output ports. In all the embodiments, the interconnection network may be a crossbar network, shared memory network, Clos network, hypercube network, or any internally nonblocking interconnection network or network of networks.
Owner:TEAK TECH

VLSI layouts of fully connected generalized and pyramid networks with locality exploitation

VLSI layouts of generalized multi-stage and pyramid networks for broadcast, unicast and multicast connections are presented using only horizontal and vertical links with spatial locality exploitation. The VLSI layouts employ shuffle exchange links where outlet links of cross links from switches in a stage in one sub-integrated circuit block are connected to inlet links of switches in the succeeding stage in another sub-integrated circuit block, so that said cross links are either vertical or horizontal links. Furthermore, the shuffle exchange links are employed between different sub-integrated circuit blocks so that spatially nearer sub-integrated circuit blocks are connected with shorter links than the shuffle exchange links between spatially farther sub-integrated circuit blocks. In one embodiment the sub-integrated circuit blocks are arranged in a hypercube arrangement in a two-dimensional plane. The VLSI layouts exploit the benefits of significantly fewer cross points, lower signal latency, lower power and full connectivity with significantly faster compilation. The VLSI layouts with spatial locality exploitation presented are applicable to generalized multi-stage and pyramid networks, generalized folded multi-stage and pyramid networks, generalized butterfly fat tree and pyramid networks, generalized multi-link multi-stage and pyramid networks, generalized folded multi-link multi-stage and pyramid networks, generalized multi-link butterfly fat tree and pyramid networks, generalized hypercube networks, and generalized cube connected cycles networks for speedup of s ≥ 1. The embodiments of VLSI layouts are useful in wide target applications such as FPGAs, CPLDs, pSoCs, ASIC placement and route tools, networking applications, parallel & distributed computing, and reconfigurable computing.
Owner:KONDA TECH
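One way to see why a hypercube arrangement in a two-dimensional plane yields only horizontal and vertical links, with spatially nearer blocks joined by shorter wires, is to split each block's binary label into a row half and a column half: hypercube neighbours then differ in exactly one grid coordinate. The sketch below illustrates that observation only; the label-splitting convention is an assumption and this is not the patented layout.

```python
def hypercube_2d_placement(dim):
    """Place the 2**dim nodes of a binary hypercube on a 2D grid.

    Each node label is split into a row half and a column half, so any
    two hypercube neighbours (labels differing in one bit) share either
    a row or a column: their link is a single horizontal or vertical
    wire, and links for low-order bits are shorter than those for
    high-order bits. Illustrative only.
    """
    col_bits = dim // 2
    placement = {}
    for node in range(1 << dim):
        row = node >> col_bits
        col = node & ((1 << col_bits) - 1)
        placement[node] = (row, col)
    return placement

def wire_length(dim, node, bit):
    """Manhattan length of the link that flips `bit` of `node` in the grid above."""
    place = hypercube_2d_placement(dim)
    a, b = place[node], place[node ^ (1 << bit)]
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# For a 4-dimensional hypercube, the links for bits 0..3 have lengths 1, 2, 1, 2.
print([wire_length(4, 0, b) for b in range(4)])
```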

Method for fast realizing signal processing of passive radar based on GPU (Graphics Processing Unit)

The invention discloses a method for fast realization of passive radar signal processing based on a GPU (Graphics Processing Unit). The method comprises the following steps: in the direct-wave and clutter suppression stage, dividing the whole data set into N data blocks and dividing each data block into L data segments, with M data points in each data segment; splicing together the segments of the data blocks that share the same segment number, such that the current segment of the N-th data block is spliced with the next segment of the first data block; having the GPU launch M*N threads for parallel processing; in the coherent accumulation and walk correction stage, dividing all the data into n segments; grouping the consecutive segments so that every DIM segments form one group; splicing all groups of data together in sequence and storing them in a contiguous address space of the GPU video memory; and finally carrying out GPU parallel processing on each group of data. The method is well suited to GPU parallel processing, and the resulting high speedup ratio allows real-time processing requirements to be met.
Owner:INST OF ELECTRONICS CHINESE ACAD OF SCI
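The first rearrangement step, splicing together the segments that share a segment number across all data blocks, can be pictured with a short host-side NumPy sketch. The wrap-around splice between the N-th block and the first block, and the actual GPU kernels, are deliberately left out, and the function and parameter names are invented for the example.

```python
import numpy as np

def splice_segments(data, n_blocks, n_segments, m):
    """Rearrange a 1-D sample stream for segment-parallel processing.

    `data` holds n_blocks * n_segments * m samples. It is split into
    n_blocks blocks of n_segments segments of m points each, and the
    segments with the same segment index are spliced together across
    blocks, so each spliced row can be processed by m * n_blocks GPU
    threads in parallel. Host-side illustration only; the wrap-around
    splice and the CUDA kernels of the method are not modelled.
    """
    x = np.asarray(data).reshape(n_blocks, n_segments, m)        # (N, L, M)
    spliced = np.transpose(x, (1, 0, 2)).reshape(n_segments, n_blocks * m)
    return spliced   # row s holds segment s of every block, concatenated

# Example: 4 blocks, 3 segments per block, 8 points per segment.
stream = np.arange(4 * 3 * 8)
print(splice_segments(stream, n_blocks=4, n_segments=3, m=8).shape)  # (3, 32)
```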

Isolating type self-oscillating flyback converter with a soft start loop

An isolating type self-oscillating flyback converter is disclosed, which includes a coupled transformer, a FET, a transistor and an electro-optically coupled isolating feedback unit. The input terminal of the circuit is connected to the source of the FET through a primary winding of the coupled transformer, and to the collector of the transistor through a resistor R1 and another resistor R2; the source of the FET is connected to the collector of the transistor; one branch of the drain of the FET is connected to ground through a resistor, while the other branch is connected to the base of the transistor through the parallel combination of a resistor and a capacitor; the base of the transistor is connected to the output terminal of a secondary output winding of the coupled transformer through the electro-optically coupled isolating feedback unit; the series connection joint between the said resistor R1 and the said resistor R2 is connected to ground through a speedup capacitor and a secondary winding of the coupled transformer; and a loop implementing the soft start is connected between the said input terminal of the circuit and the series connection joint. Thus the start-up current of the converter is small, and the converter keeps working normally when the input voltage is high.
Owner:MORNSUN GUANGZHOU SCI & TECH