239 results about how to "Improve locality" patented technology

Parallel Array Architecture for a Graphics Processor

A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. The pixel distribution logic selects the processing cluster to which the coverage data for a first pixel is delivered based at least in part on the location of the first pixel within an image area. The processing clusters can be mapped directly to the frame buffer partitions without a crossbar, so that pixel data is delivered directly from each processing cluster to the appropriate frame buffer partition. Alternatively, a crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions. The crossbar is configured such that pixel data generated by any one of the processing clusters is deliverable to any one of the frame buffer partitions.
Owner:NVIDIA CORP
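
The abstract says only that the cluster is chosen "based at least in part on" the pixel's location; it does not give the mapping. The sketch below is one possible illustration, not the patented logic: the function name `select_cluster`, the 16-pixel tile size, and the diagonal interleave formula are all assumptions made for this example.

```python
def select_cluster(x, y, num_clusters, tile_size=16):
    """Pick a processing cluster from a pixel's screen position.

    Hypothetical mapping: the tile size and the interleave formula are
    assumptions for illustration, not taken from the patent abstract.
    """
    tile_x = x // tile_size  # column of the screen tile containing the pixel
    tile_y = y // tile_size  # row of the screen tile containing the pixel
    # Pixels in the same tile share a cluster, which keeps their frame
    # buffer accesses together; adjacent tiles go to different clusters
    # so the work is spread across the core array.
    return (tile_x + tile_y) % num_clusters


if __name__ == "__main__":
    for pixel in [(0, 0), (15, 15), (16, 0), (16, 16)]:
        print(pixel, "-> cluster", select_cluster(*pixel, num_clusters=4))
```

Because every pixel of a tile maps to the same cluster, a fixed tile-to-cluster mapping of this kind is also what would allow a cluster to be paired with a frame buffer partition without a crossbar.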

Method and apparatus for rasterizing in a hierarchical tile order

A method and apparatus for efficiently rasterizing graphics is provided. The method is intended to be used in combination with a frame buffer that provides fast tile-based addressing. Within this environment, frame buffer memory locations are organized into a tile hierarchy. In this hierarchy, smaller low-level tiles combine to form larger mid-level tiles, and mid-level tiles combine to form high-level tiles. The tile hierarchy may be expanded to include more levels, or collapsed to include fewer levels. A graphics primitive is rasterized by selecting a starting vertex. The low-level tile that includes the starting vertex is rasterized first. The remaining low-level tiles that are included in the same mid-level tile as the starting vertex are then rasterized. Rasterization continues with the mid-level tiles that are included in the same high-level tile as the starting vertex. These mid-level tiles are rasterized by rasterizing their component low-level tiles. The rasterization process proceeds bottom-up, completing each lower level before moving on at higher levels. In this way, the present invention provides a method for rasterizing graphics primitives that accesses memory tiles in an orderly fashion. This reduces page misses within the frame buffer and enhances graphics performance.
Owner:MICROSOFT TECH LICENSING LLC
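
The traversal order described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions: it uses only two levels (low-level tiles grouped into mid-level tiles inside a single high-level region), it visits every tile rather than testing primitive coverage, and the name `hierarchical_tile_order` and the `mid` parameter are hypothetical.

```python
def hierarchical_tile_order(start_tile, grid_w, grid_h, mid=2):
    """Return low-level tile coordinates in hierarchical visiting order.

    start_tile: (tx, ty) of the low-level tile containing the starting vertex.
    grid_w, grid_h: region size measured in low-level tiles.
    mid: low-level tiles per mid-level tile edge (assumed value).
    """
    def low_tiles_in(mx, my):
        # All low-level tiles inside mid-level tile (mx, my), clipped to the grid.
        return [(tx, ty)
                for ty in range(my * mid, min((my + 1) * mid, grid_h))
                for tx in range(mx * mid, min((mx + 1) * mid, grid_w))]

    sx, sy = start_tile
    start_mid = (sx // mid, sy // mid)
    mids_w = -(-grid_w // mid)  # ceiling division
    mids_h = -(-grid_h // mid)
    mid_tiles = [(mx, my) for my in range(mids_h) for mx in range(mids_w)]
    # Stable sort: the mid-level tile holding the starting vertex comes first.
    mid_tiles.sort(key=lambda m: m != start_mid)

    order = []
    for m in mid_tiles:
        lows = low_tiles_in(*m)
        if m == start_mid:
            # Within the starting mid-level tile, the starting tile comes first.
            lows.sort(key=lambda t: t != start_tile)
        order.extend(lows)  # finish one mid-level tile before the next
    return order


if __name__ == "__main__":
    print(hierarchical_tile_order((1, 1), grid_w=4, grid_h=2))
```

Each mid-level tile is completed before the next one begins, so consecutive accesses stay within one block of tiles, which is the property the abstract credits with reducing page misses.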

Method optimizing sparse matrix vector multiplication to improve incompressible pipe flow simulation efficiency

Active | CN103984527A | Improve data locality and cache hit ratio | Few influencing factors | Concurrent instruction execution | Data transmission | Decomposition
The invention discloses a method for optimizing sparse matrix-vector multiplication to improve the efficiency of incompressible pipe flow simulation. The method uses a QCSR storage structure, which combines the advantages of a quadtree structure and the CSR storage structure, to recursively decompose and rearrange a sparse matrix for storage. As a result, the sparse matrix-vector multiplication applies to general matrix forms and is particularly suited to matrices that are sparse overall but contain multiple locally dense sub-matrices. The method implements sparse matrix-vector multiplication on the QCSR storage structure in a CPU/GPU (central processing unit/graphics processing unit) heterogeneous parallel system through four strategies: thread mapping optimization, data storage optimization, data transmission optimization, and data reuse optimization. The method improves data locality and the cache hit rate during sparse matrix-vector multiplication, yielding better computational speedup and overall speedup, and thereby improves the efficiency of incompressible pipe flow simulation.
Owner:HANGZHOU DIANZI UNIV
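
For context, the sketch below shows sparse matrix-vector multiplication over the plain CSR format that the abstract names as one of the two ingredients of QCSR. It is not the QCSR structure itself, whose layout the abstract does not describe, and it runs on the CPU only; the function name `csr_spmv` and the example matrix are assumptions for illustration.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in plain CSR form.

    values:  non-zero entries, row by row
    col_idx: column index of each entry in `values`
    row_ptr: row_ptr[i]..row_ptr[i+1] delimits row i inside `values`
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # `values` and `col_idx` are read sequentially, but x is read
            # at scattered positions; this irregular access is the
            # locality problem that reordered layouts try to reduce.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


if __name__ == "__main__":
    # A = [[1, 0, 2],
    #      [0, 3, 0],
    #      [4, 0, 5]]
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    col_idx = [0, 2, 1, 0, 2]
    row_ptr = [0, 2, 3, 5]
    print(csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
```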

Data block balancing method in operation process of HDFS (Hadoop Distributed File System)

The invention discloses a data block balancing method for use while an HDFS (Hadoop Distributed File System) is running. The method comprises the following steps: first, pre-processing the local task list of each node, dividing it into entirely local tasks and non-entirely local tasks, to provide the basis for deciding when to start HDFS data block balancing; second, estimating the operation rate of each node and predicting its task requests; third, designing and implementing an assignment process for each node based on the preceding steps; fourth, selecting suitable nodes and moving data blocks between them so that the data block distribution matches the predicted sequence of node task requests; and finally, balancing the data blocks. By predicting node task requests in advance, the method identifies map tasks that would likely execute non-locally and moves the relevant data blocks between the corresponding nodes, so that a node can be assigned a local map task when it sends an actual task request. The completion efficiency of the Map step can therefore be improved.
Owner:XI AN JIAOTONG UNIV
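
The fourth step, moving blocks so that their distribution matches the predicted request sequence, can be illustrated with a small greedy planner. This is a hypothetical sketch, not the patented procedure and not a Hadoop API: it assumes one replica per block and an already known request order, and the name `plan_block_moves` and the donor-selection rule are invented for this example.

```python
from collections import Counter


def plan_block_moves(request_order, blocks_on_node):
    """Plan block moves so predicted task requests find local data.

    request_order:  node names in the predicted order of task requests
    blocks_on_node: dict mapping node name -> list of unprocessed block ids
    Returns a list of (block, source_node, destination_node) moves.
    All rules here are illustrative assumptions.
    """
    local = {node: list(blocks) for node, blocks in blocks_on_node.items()}
    remaining = Counter(request_order)  # future requests still expected per node
    moves = []
    for node in request_order:
        remaining[node] -= 1
        if local.get(node):
            local[node].pop(0)  # the request is served by a local block
            continue
        # No local block: borrow from the node with the largest surplus of
        # blocks over its own remaining predicted requests.
        donor = max(local, key=lambda n: len(local[n]) - remaining[n], default=None)
        if donor is None or len(local[donor]) - remaining[donor] <= 0:
            continue  # nobody has a spare block; the task stays non-local
        moves.append((local[donor].pop(), donor, node))
    return moves


if __name__ == "__main__":
    order = ["A", "B", "A", "B"]
    blocks = {"A": ["b1", "b2", "b3"], "B": ["b4"]}
    print(plan_block_moves(order, blocks))
```

In this example node B is predicted to ask for two tasks but holds only one block, so the planner schedules one block to move from A to B ahead of B's second request.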