Techniques for texture filtering using a refracted ray cone

By generating and rendering refracted ray cones using ray cone tracing technology, the problem of simulating light refraction in virtual scenes is solved, achieving more realistic image rendering and lower computational costs, making it suitable for video games, film production, and architectural design.

CN115485733B · Active · Publication Date: 2026-06-19 · NVIDIA CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NVIDIA CORP
Filing Date
2022-01-21
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing ray cone tracing technology cannot reasonably simulate light refraction within virtual scenes, resulting in unrealistic rendered images and high computational costs.

Method used

By tracing the ray cone and calculating the direction and width of the refracted light in a two-dimensional coordinate system, a refracted ray cone is generated. Isotropic and anisotropic texture filtering is then performed during the rendering process, and the image is rendered using the refracted ray cone.

Benefits of technology

It achieves more realistic image rendering effects while reducing computational costs, demonstrating higher efficiency and accuracy, especially in video games, film production, and architectural design.

✦ Generated by Eureka AI based on patent content.


Abstract

An embodiment of a method for rendering one or more graphic images includes: tracing a ray cone through a three-dimensional (3D) graphic scene; generating a refracted ray cone based on the ray cone and a two-dimensional (2D) coordinate system; and rendering the graphic image based on the refracted ray cone.

Description

[0001] Cross-references to related applications

[0002] This application claims priority to U.S. Provisional Patent Application No. 63/141,355, filed January 25, 2021, entitled “Texture Filtering Techniques for Refracted Ray Cones,” and to U.S. Patent Application No. 17/329,737, filed May 25, 2021, entitled “Techniques for Texture Filtering Using Refracted Ray Cones.” The subject matter of these related applications is incorporated herein by reference.

Technical Field

[0003] The embodiments of this disclosure generally relate to computer science and computer graphics, and more specifically, to techniques for texture filtering using refracted ray cones.

Background

[0004] In 3D computer graphics, ray tracing is a popular technique for rendering images, such as frames in movies or video games. Ray tracing traces the paths of light rays and simulates the effects of these rays interacting with virtual objects in a virtual scene. Ray cone tracing is similar to ray tracing, but it traces cones through the scene instead of rays. Ray cone tracing can solve various sampling and aliasing problems that negatively impact traditional ray tracing techniques. Furthermore, ray cone tracing is computationally less expensive than some well-known ray tracing techniques such as differential ray tracing and covariance tracing.

[0005] Refraction is the change in the direction of light as it travels from one medium into another medium that has a different refractive index. The refractive index of a particular medium is related to the speed at which light travels through that medium, which in turn depends on the density of the medium. For example, light bends toward the surface normal when it travels from air into a denser medium (such as glass), and away from the surface normal when it travels from a denser medium into air.
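The bending described in this paragraph follows Snell's law, n1 · sin(θ1) = n2 · sin(θ2), with angles measured from the surface normal. The following is a minimal Python sketch of that relationship only; the function name `refraction_angle` and the refractive indices 1.0 (air) and 1.5 (glass) are illustrative assumptions and do not come from the patent text.

```python
import math

def refraction_angle(theta_i, n1, n2):
    """Return the refraction angle (radians) from Snell's law,
    n1 * sin(theta_i) = n2 * sin(theta_t), or None on total internal reflection."""
    s = n1 / n2 * math.sin(theta_i)
    if abs(s) > 1.0:
        return None  # no refracted ray exists
    return math.asin(s)

AIR, GLASS = 1.0, 1.5          # assumed, illustrative refractive indices
theta = math.radians(30.0)     # incidence angle measured from the normal

into_glass = refraction_angle(theta, AIR, GLASS)  # smaller angle: bends toward the normal
into_air = refraction_angle(theta, GLASS, AIR)    # larger angle: bends away from the normal
```

Going from the denser medium into air at a steep enough angle yields no refracted ray at all (total internal reflection), which is why the helper can return `None`.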

[0006] Currently, no ray cone tracing technique can reasonably simulate the refraction of light within a virtual scene. Therefore, when the virtual scene being rendered includes objects made of media that refract light, images rendered using current ray cone tracing techniques may appear unrealistic.

[0007] As the foregoing illustrates, what is needed in the art are more effective techniques for rendering graphics scenes using ray cone tracing.

Summary of the Invention

[0008] One embodiment of this disclosure sets forth a computer-implemented method for rendering one or more graphics images. The method includes tracing a ray cone through a three-dimensional (3D) graphics scene. The method also includes generating a refracted ray cone based on the ray cone and a two-dimensional (2D) coordinate system. Furthermore, the method includes rendering a graphics image based on the refracted ray cone.

[0009] Other embodiments of this disclosure include, but are not limited to, one or more computer-readable media, including instructions for performing one or more aspects of the disclosed technology, and one or more computing systems for performing one or more aspects of the disclosed technology.

[0010] At least one technical advantage of the disclosed technology over existing technologies is that it implements refracted ray cones, which can be used to render more realistic images of virtual scenes that include objects made of media that refract light. Furthermore, the disclosed technology uses ray cone tracing, which has a lower computational cost than many ray tracing techniques that can be used to trace refracted rays, such as differential ray tracing. These technical advantages represent one or more technical improvements over existing methods.

Brief Description of the Drawings

[0011] To gain a detailed understanding of the features described above in the various embodiments, the inventive concepts briefly summarized above can be described in more detail with reference to various embodiments (some of which are shown in the accompanying drawings). However, it should be noted that the drawings illustrate only typical embodiments of the inventive concepts and should not be construed as limiting the scope in any way, and that other equally effective embodiments exist.

[0012] Figure 1 is a block diagram illustrating a computer system configured to implement one or more aspects of the various embodiments;

[0013] Figure 2 is a block diagram of a parallel processing unit included in the parallel processing subsystem of Figure 1, according to various embodiments;

[0014] Figure 3 is a block diagram of a general-purpose processing cluster included in the parallel processing unit of Figure 2, according to various embodiments;

[0015] Figure 4 is a block diagram illustrating an exemplary cloud computing system, according to various embodiments;

[0016] Figure 5 illustrates exemplary ray cones being traced through a virtual 3D scene, according to various embodiments;

[0017] Figure 6 illustrates methods for calculating a refracted ray cone, according to various embodiments;

[0018] Figure 7 illustrates an approximation of a refracted ray cone, according to various embodiments;

[0019] Figure 8A illustrates exemplary images rendered using refracted ray cones and isotropic texture filtering, according to various embodiments;

[0020] Figure 8B illustrates exemplary images rendered using refracted ray cones and anisotropic texture filtering, according to various other embodiments;

[0021] Figure 8C illustrates exemplary ground truth images, according to the prior art;

[0022] Figure 9A illustrates exemplary images rendered using refracted ray cones and isotropic texture filtering, according to various embodiments;

[0023] Figure 9B illustrates an exemplary image rendered using full-resolution textures, according to the prior art;

[0024] Figure 10 is a flowchart of method steps for calculating a pixel color using a ray cone tracing technique that implements refracted ray cones, according to various embodiments; and

[0025] Figure 11 is a flowchart of method steps for generating a refracted ray cone, according to various embodiments.

Detailed Description

[0026] In the following description, numerous specific details are set forth to provide a more thorough understanding of the various embodiments. However, it will be apparent to those skilled in the art that the inventive concept can be practiced without one or more of these specific details.

[0027] General Overview

[0028] Embodiments of this disclosure provide an improved ray cone tracing technique that implements refracted ray cones. The improved ray cone tracing technique has numerous practical applications, including video games, production-quality film rendering, architectural and design applications, and any other application that can use ray cone tracing to render images. In the improved ray cone tracing technique, a refracted ray cone is generated when a ray cone traced through a virtual 3D scene hits a geometric surface within the scene and refracts. This is achieved by: (1) calculating, in a 2D coordinate system, the direction of the center ray and the directions of the top and bottom edges of the refracted ray cone; and (2) calculating, from those directions, the width and spread angle of the refracted ray cone. The refracted ray cone is then traced through the scene. Furthermore, isotropic texture filtering can be performed before generating the refracted ray cone, and anisotropic texture filtering can be performed using the refracted ray cone and any subsequent ray cones to determine the color of pixels in the rendered image.
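The two steps above can be illustrated with a simplified Python sketch. It is not the patented formulation: it assumes a locally flat surface (the line y = 0 with normal (0, 1)), the same normal at both edge hit points, and a 2D frame whose origin is the hit point of the center ray. The names `rotate`, `refract`, and `refracted_cone` are hypothetical helpers introduced only for this illustration.

```python
import math

def rotate(v, angle):
    """Rotate the 2D vector v counterclockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (v[0] * c - v[1] * s, v[0] * s + v[1] * c)

def refract(d, n, eta):
    """Refract unit direction d at unit normal n (facing the incoming ray).
    eta = n_incident / n_transmitted. Returns None on total internal reflection."""
    cos_i = -(d[0] * n[0] + d[1] * n[1])
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if k < 0.0:
        return None
    s = eta * cos_i - math.sqrt(k)
    return (eta * d[0] + s * n[0], eta * d[1] + s * n[1])

def refracted_cone(theta, width, spread, eta):
    """Return (center direction, width, spread angle) of the refracted cone, or None.
    theta is the incidence angle of the center ray, measured from the normal."""
    n = (0.0, 1.0)                                # assumed flat surface y = 0
    d = (math.sin(theta), -math.cos(theta))       # incoming center ray direction
    p = (-d[1], d[0])                             # unit vector perpendicular to d
    center = refract(d, n, eta)
    if center is None:
        return None
    hits, dirs = [], []
    for side in (+1.0, -1.0):                     # top edge, then bottom edge
        origin = (side * 0.5 * width * p[0], side * 0.5 * width * p[1])
        edge = rotate(d, side * 0.5 * spread)     # edge direction of incoming cone
        if edge[1] >= 0.0:
            return None                           # edge never reaches the surface
        t = -origin[1] / edge[1]                  # intersect the edge with y = 0
        hits.append((origin[0] + t * edge[0], 0.0))
        out = refract(edge, n, eta)
        if out is None:
            return None
        dirs.append(out)
    (hu, hl), (du, dl) = hits, dirs
    q = (-center[1], center[0])                   # perpendicular to refracted center ray
    # Step (2): width is the footprint measured across the refracted center ray,
    # and the spread angle is the signed angle between the refracted edges.
    new_width = (hu[0] - hl[0]) * q[0] + (hu[1] - hl[1]) * q[1]
    new_spread = math.atan2(dl[0] * du[1] - dl[1] * du[0],
                            dl[0] * du[0] + dl[1] * du[1])
    return center, new_width, new_spread
```

For example, `refracted_cone(math.radians(60.0), 0.2, 0.0, 1.0 / 1.5)` models a parallel-sided cone entering a medium of index 1.5 from air at 60 degrees; under these assumptions the refracted cone is wider than the incoming one because the refracted center ray is closer to the normal than the incoming ray.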

[0029] The ray cone tracing technique disclosed herein has numerous practical applications. For example, ray cone tracing can be used to efficiently render images and/or frames in video games. As a specific example, ray cone tracing can be performed by a cloud-based graphics processing platform, such as a cloud-based gaming platform that executes video games and streams video of game sessions to client devices. The disclosed ray cone tracing technique is computationally more efficient and/or can render more realistic images than some other techniques such as differential ray tracing, conventional ray cone tracing, and rasterization-based techniques.

[0030] As another example, ray cone tracing can be used for production-quality rendering of films. The creation of computer-generated imagery (CGI) and special effects in animated films, as well as live-action films, often requires high-quality rendering of the film's frames. The disclosed ray cone tracing technique can be used to render film frames more efficiently and/or more accurately than some other techniques such as differential ray tracing and conventional ray cone tracing.

[0031] As yet another example, the disclosed ray cone tracing technique can be used to render designs of architectural structures and other objects. Architectural and design applications often provide renderings to show how a particular design would look in real life. The disclosed ray cone tracing technique can be used to render design images more efficiently and/or more accurately than some other techniques, such as differential ray tracing and conventional ray cone tracing.

[0032] The examples above are not intended to be limiting in any way. As those skilled in the art will understand, in general, the ray cone tracing techniques described herein can be implemented in any application currently employing conventional ray tracing and/or ray cone tracing techniques.

[0033] System Overview

[0034] Figure 1 is a block diagram illustrating a computer system 100 configured to implement one or more aspects of the various embodiments. As those skilled in the art will understand, the computer system 100 can be any type of technically feasible computer system, including but not limited to server machines, server platforms, desktop computers, laptop computers, handheld/mobile devices, or wearable devices. In some embodiments, the computer system 100 is a server machine operating in a data center or cloud computing environment that provides scalable computing resources as a service over a network.

[0035] In various embodiments, computer system 100 includes, but is not limited to, a central processing unit (CPU) 102 and system memory 104 coupled to parallel processing subsystem 112 via memory bridge 105 and communication path 113. Memory bridge 105 is further coupled to I/O (input/output) bridge 107 via communication path 106, and I/O bridge 107 is in turn coupled to switch 116.

[0036] In one embodiment, I/O bridge 107 is configured to receive user input from an optional input device 108 (such as a keyboard or mouse) and forward the input to CPU 102 for processing via communication path 106 and memory bridge 105. In some embodiments, computer system 100 may be a server machine in a cloud computing environment. In these embodiments, computer system 100 may not have input device 108. Instead, computer system 100 may receive equivalent input by receiving commands in the form of messages transmitted over a network and received via network adapter 118. In one embodiment, switch 116 is configured to provide connectivity between I/O bridge 107 and other components of computer system 100, such as network adapter 118 and various add-in cards 120 and 121.

[0037] In one embodiment, I/O bridge 107 is coupled to system disk 114, which can be configured to store content, applications, and data for use by CPU 102 and parallel processing subsystem 112. In one embodiment, system disk 114 provides non-volatile storage for applications and data and may include fixed or removable hard disk drives, flash memory devices, and CD-ROM (compact disc read-only memory), DVD-ROM (digital versatile disc ROM), Blu-ray, HD-DVD (high-definition DVD), or other magnetic, optical, or solid-state storage devices. In various embodiments, other components such as universal serial bus or other port connections, compact disc drives, digital versatile disc drives, film recording devices, etc., may also be connected to I/O bridge 107.

[0038] In various embodiments, memory bridge 105 may be a Northbridge chip, and I/O bridge 107 may be a Southbridge chip. Furthermore, communication paths 106 and 113, as well as other communication paths within computer system 100, may be implemented using any technically suitable protocol, including but not limited to AGP (Accelerated Graphics Port), HyperTransport, or any other bus or point-to-point communication protocol known in the art.

[0039] In some embodiments, the parallel processing subsystem 112 includes a graphics subsystem that delivers pixels to an optional display device 110, which can be any conventional cathode ray tube, liquid crystal display, light-emitting diode display, or the like. In these embodiments, the parallel processing subsystem 112 incorporates circuitry optimized for graphics and video processing, including, for example, video output circuitry. As described in more detail below in conjunction with Figures 2 and 3, such circuitry can be incorporated across one or more parallel processing units (PPUs) (also referred to herein as parallel processors) included within the parallel processing subsystem 112. In other embodiments, the parallel processing subsystem 112 incorporates circuitry optimized for general-purpose and/or compute processing. Again, such circuitry can be incorporated across one or more PPUs included within the parallel processing subsystem 112 that are configured to perform such general-purpose and/or compute operations. In yet other embodiments, one or more PPUs included within the parallel processing subsystem 112 may be configured to perform graphics processing, general-purpose processing, and compute processing operations. System memory 104 includes at least one device driver configured to manage the processing operations of the one or more PPUs within the parallel processing subsystem 112. Furthermore, system memory 104 includes a rendering application 130. Rendering application 130 can be any technically feasible application that renders a virtual 3D scene via the ray cone tracing techniques disclosed herein. For example, rendering application 130 can be a game application or a rendering application used in film production. Although described herein primarily with respect to rendering application 130, the techniques disclosed herein can also be implemented, in whole or in part, in other software and/or hardware, such as in parallel processing subsystem 112.

[0040] In various embodiments, the parallel processing subsystem 112 can be integrated with one or more of the other components of Figure 1 to form a single system. For example, the parallel processing subsystem 112 can be integrated with the CPU 102 and other interconnect circuitry on a single chip to form a system-on-a-chip (SoC).

[0041] In one embodiment, CPU 102 is the main processor of computer system 100, used to control and coordinate the operation of other system components. In one embodiment, CPU 102 issues commands to control the operation of PPUs. In some embodiments, communication path 113 is a PCI Express link in which dedicated lanes are allocated to each PPU, as is known in the art. Other communication paths may also be used. The PPU advantageously implements a highly parallel processing architecture. The PPU can be provided with any amount of local parallel processing memory (PP memory).

[0042] It should be understood that the system shown herein is illustrative and that variations and modifications are possible. The connection topology can be modified as needed, including the number and arrangement of bridges, the number of CPUs 102, and the number of parallel processing subsystems 112. For example, in some embodiments, system memory 104 may be connected to CPU 102 directly rather than via memory bridge 105, and other devices would communicate with system memory 104 via memory bridge 105 and CPU 102. In other embodiments, parallel processing subsystem 112 may be connected to I/O bridge 107 or directly to CPU 102 rather than to memory bridge 105. In still other embodiments, I/O bridge 107 and memory bridge 105 may be integrated into a single chip rather than existing as one or more discrete devices. In some embodiments, one or more components shown in Figure 1 may not be present. For example, switch 116 can be removed, and network adapter 118 and add-in cards 120, 121 can be connected directly to I/O bridge 107. Finally, in some embodiments, one or more components shown in Figure 1 can be implemented as virtualized resources in a virtual computing environment, such as a cloud computing environment. Specifically, in some embodiments, the parallel processing subsystem 112 can be implemented as a virtualized parallel processing subsystem. For example, the parallel processing subsystem 112 can be implemented as a virtual graphics processing unit (GPU) that renders graphics on a virtual machine (VM) executing on a server machine whose GPU and other physical resources are shared across multiple VMs.

[0043] Figure 2 is a block diagram of a parallel processing unit (PPU) 202 included in the parallel processing subsystem 112 of Figure 1, according to various embodiments. Although Figure 2 depicts one PPU 202, as described above, the parallel processing subsystem 112 may include any number of PPUs 202. As shown, the PPU 202 is coupled to a local parallel processing (PP) memory 204. The PPU 202 and PP memory 204 may be implemented using one or more integrated circuit devices (such as programmable processors, application-specific integrated circuits (ASICs), or memory devices), or in any other technically feasible manner.

[0044] In some embodiments, PPU 202 includes a GPU configured to implement a graphics rendering pipeline to perform various operations related to generating pixel data based on graphics data provided by CPU 102 and/or system memory 104. When processing graphics data, PP memory 204 can be used as graphics memory that stores one or more conventional frame buffers and, if needed, one or more other render targets. Among other things, PP memory 204 can be used to store and update pixel data and to deliver the final pixel data or display frames to the optional display device 110 for display. In some embodiments, PPU 202 can also be configured for general-purpose processing and compute operations. In some embodiments, computer system 100 can be a server machine in a cloud computing environment. In these embodiments, computer system 100 may not have display device 110. Instead, computer system 100 can generate equivalent output information by sending commands in the form of messages over a network via network adapter 118.

[0045] In some embodiments, CPU 102 is the main processor of computer system 100, controlling and coordinating the operation of other system components. In one embodiment, CPU 102 issues commands to control the operation of PPU 202. In some embodiments, CPU 102 writes the command stream for PPU 202 into a data structure (not explicitly shown in Figure 1 or Figure 2) that may reside in system memory 104, PP memory 204, or another storage location accessible to both CPU 102 and PPU 202. A pointer to the data structure is written to a command queue, also referred to herein as a push buffer, to initiate processing of the command stream in the data structure. In one embodiment, PPU 202 reads the command stream from the command queue and then executes the commands asynchronously relative to the operation of CPU 102. In embodiments where multiple push buffers are generated, the application may specify an execution priority for each push buffer via a device driver to control the scheduling of the different push buffers.

[0046] In one embodiment, PPU 202 includes an I/O (input/output) unit 205 that communicates with the remainder of computer system 100 via communication path 113 and memory bridge 105. In one embodiment, I/O unit 205 generates packets (or other signals) for transmission on communication path 113 and also receives all incoming packets (or other signals) from communication path 113, directing the incoming packets to the appropriate components of PPU 202. For example, commands related to processing tasks may be directed to host interface 206, while commands related to memory operations (e.g., reading from or writing to PP memory 204) may be directed to crossbar switch unit 210. In one embodiment, host interface 206 reads each command queue and sends the command stream stored in the command queue to front end 212.

[0047] As mentioned above in conjunction with Figure 1, the connection between the PPU 202 and the rest of the computer system 100 can be varied. In some embodiments, the parallel processing subsystem 112, including at least one PPU 202, is implemented as an add-in card that can be inserted into an expansion slot of the computer system 100. In other embodiments, the PPU 202 can be integrated on a single chip with a bus bridge, such as memory bridge 105 or I/O bridge 107. Similarly, in other embodiments, some or all of the components of the PPU 202 can be included together with the CPU 102 in a single integrated circuit or system-on-a-chip (SoC).

[0048] In one embodiment, front end 212 sends processing tasks received from host interface 206 to a work distribution unit (not shown) within task/work unit 207. In one embodiment, the work distribution unit receives pointers to processing tasks that are encoded as task metadata (TMD) and stored in memory. The pointers to TMDs are included in a command stream that is stored as a command queue and received by front end 212 from host interface 206. Processing tasks that can be encoded as TMDs include indices associated with the data to be processed, as well as state parameters and commands that define how the data is to be processed. For example, the state parameters and commands can define the program to be executed on the data. Also, for example, a TMD can specify the number and configuration of a set of cooperative thread arrays (CTAs). Typically, each TMD corresponds to one task. Task/work unit 207 receives tasks from front end 212 and ensures that GPCs 208 are configured to a valid state before the processing task specified by each TMD is initiated. A priority can be specified for each TMD and used to schedule the execution of the processing task. Processing tasks can also be received from processing cluster array 230. Optionally, the TMD may include a parameter that controls whether the TMD is added to the head or the tail of a list of processing tasks (or a list of pointers to processing tasks), thereby providing another level of control over execution priority.
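The interaction between the per-TMD priority and the head-or-tail parameter can be pictured with a small Python model. This is only a sketch under stated assumptions: `TaskMetadata`, `TaskScheduler`, the field names, the rule that a lower priority value runs first, and the use of one list per priority level are all hypothetical and are not taken from the patent or from any real driver interface.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class TaskMetadata:
    """Hypothetical stand-in for a TMD."""
    name: str
    priority: int = 0          # assumed convention: lower value is scheduled first
    add_to_head: bool = False  # models the head-or-tail parameter

class TaskScheduler:
    """Toy model that keeps one list of pending tasks per priority level."""

    def __init__(self):
        self._lists = {}

    def submit(self, tmd):
        pending = self._lists.setdefault(tmd.priority, deque())
        if tmd.add_to_head:
            pending.appendleft(tmd)   # jumps ahead of tasks with the same priority
        else:
            pending.append(tmd)       # ordinary first-in, first-out behavior

    def next_task(self):
        for priority in sorted(self._lists):
            if self._lists[priority]:
                return self._lists[priority].popleft()
        return None

sched = TaskScheduler()
sched.submit(TaskMetadata("a"))
sched.submit(TaskMetadata("b"))
sched.submit(TaskMetadata("urgent", add_to_head=True))
sched.submit(TaskMetadata("high", priority=-1))
order = [sched.next_task().name for _ in range(4)]
# order == ["high", "urgent", "a", "b"]
```

In this model the priority chooses which list is served first, while the head-or-tail flag reorders tasks within one list, which is the "another level of control" the paragraph refers to.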

[0049] In one embodiment, PPU 202 implements a highly parallel processing architecture based on a processing cluster array 230 comprising a set of C general-purpose processing clusters (GPCs) 208, where C ≥ 1. Each GPC 208 is capable of executing a large number (e.g., hundreds or thousands) of threads concurrently, where each thread is an instance of a program. In various applications, different GPCs 208 can be allocated to process different types of programs or to perform different types of computations. The allocation of GPCs 208 can vary depending on the workload arising from each type of program or computation.

[0050] In one embodiment, the memory interface 214 includes a set of D partition units 215, where D ≥ 1. Each partition unit 215 is coupled to one or more dynamic random access memories (DRAMs) 220 residing within the PP memory 204. In some embodiments, the number of partition units 215 is equal to the number of DRAMs 220, and each partition unit 215 is coupled to a different DRAM 220. In other embodiments, the number of partition units 215 may differ from the number of DRAMs 220. Those skilled in the art will understand that the DRAMs 220 can be replaced by any other technically suitable storage device. In operation, various render targets, such as texture maps and frame buffers, can be stored across the DRAMs 220, thereby allowing the partition units 215 to write portions of each render target in parallel to efficiently use the available bandwidth of the PP memory 204.
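One way to picture storing a render target across several partition units is simple round-robin interleaving of fixed-size blocks. The Python sketch below is an illustration only: the 256-byte stride, the round-robin rule, and the function name `partition_for_address` are assumptions, and the patent does not specify how addresses are actually mapped.

```python
def partition_for_address(address, num_partitions, stride=256):
    """Map a byte address to (partition index, byte offset within that partition).

    Consecutive blocks of `stride` bytes go to consecutive partitions, so a
    long contiguous write is spread across all partitions and can proceed
    in parallel.
    """
    block = address // stride
    partition = block % num_partitions
    local_block = block // num_partitions
    return partition, local_block * stride + address % stride

# With 4 partition units, four consecutive 256-byte blocks land on
# four different partitions.
touched = {partition_for_address(a, 4)[0] for a in range(0, 1024, 256)}
```

Under this assumed mapping, `touched` contains all four partition indices, so a 1 KiB span of a frame buffer would use every partition unit at once.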

[0051] In one embodiment, a given GPC 208 can process data to be written to any DRAM 220 within the PP memory 204. In one embodiment, the crossbar switch unit 210 is configured to route the output of each GPC 208 to the input of any partition unit 215, or to any other GPC 208 for further processing. The GPCs 208 communicate with the memory interface 214 via the crossbar switch unit 210 to read from or write to the various DRAMs 220. In some embodiments, the crossbar switch unit 210 has a connection to I/O unit 205, in addition to a connection to the PP memory 204 via the memory interface 214, thereby enabling processing cores within different GPCs 208 to communicate with system memory 104 or other memory not local to the PPU 202. In the embodiment of Figure 2, the crossbar switch unit 210 is directly connected to the I/O unit 205. In various embodiments, the crossbar switch unit 210 may use virtual channels to separate traffic streams between the GPCs 208 and the partition units 215.

[0052] In one embodiment, GPC 208 can be programmed to perform processing tasks relevant to a variety of applications, including but not limited to linear and nonlinear data transformations, filtering of video and/or audio data, modeling operations (e.g., applying physical laws to determine the position, velocity, and other properties of an object), image rendering operations (e.g., tessellation shader, vertex shader, geometry shader, and/or pixel/fragment shader programs), general compute operations, etc. In operation, PPU 202 is configured to transfer data from system memory 104 and/or PP memory 204 to one or more on-chip memory units, process the data, and write the resulting data back to system memory 104 and/or PP memory 204. The resulting data can then be accessed by other system components, including CPU 102, another PPU 202 within parallel processing subsystem 112, or another parallel processing subsystem 112 within computer system 100.

[0053] In one embodiment, the parallel processing subsystem 112 may include any number of PPUs 202. For example, multiple PPUs 202 may be provided on a single add-in card, or multiple add-in cards may be connected to the communication path 113, or one or more PPUs 202 may be integrated into a bridge chip. The PPUs 202 in a multi-PPU system may be identical to or different from one another. For example, different PPUs 202 may have different numbers of processing cores and/or different amounts of PP memory 204. In embodiments with multiple PPUs 202, these PPUs can operate in parallel to process data at a higher throughput than is possible with a single PPU 202. Systems containing one or more PPUs 202 can be implemented in various configurations and form factors, including but not limited to desktops, laptops, handheld personal computers or other handheld devices, wearable devices, servers, workstations, game consoles, embedded systems, etc.

[0054] Figure 3 is a block diagram of a general-purpose processing cluster (GPC) 208 included in the parallel processing unit (PPU) 202 of Figure 2, according to various embodiments. As shown, the GPC 208 includes, but is not limited to, a pipeline manager 305, one or more texture units 315, a preROP unit 325, a work assignment crossbar switch 330, and an L1.5 cache 335.

[0055] In one embodiment, the GPC 208 can be configured to execute a large number of threads in parallel to perform graphics, general-purpose processing, and/or compute operations. As used herein, a "thread" refers to an instance of a particular program executing on a particular set of input data. In some embodiments, single-instruction, multiple-data (SIMD) instruction issue techniques are used to support the parallel execution of a large number of threads without providing multiple independent instruction units. In other embodiments, single-instruction, multiple-thread (SIMT) techniques are used to support the parallel execution of a large number of generally synchronized threads, using a common instruction unit configured to issue instructions to a set of processing engines within the GPC 208. Unlike a SIMD execution regime, where all processing engines typically execute identical instructions, SIMT execution allows different threads to more readily follow divergent execution paths through a given program. Those skilled in the art will understand that a SIMD processing regime represents a functional subset of a SIMT processing regime.

[0056] In one embodiment, the operation of GPC 208 is controlled via pipeline manager 305, which distributes processing tasks received from the work distribution unit (not shown) within task/work unit 207 to one or more streaming multiprocessors (SMs) 310. Pipeline manager 305 can also be configured to control work assignment crossbar switch 330 by specifying destinations for the processed data output by the SMs 310.

[0057] In various embodiments, GPC 208 includes a set of M SMs 310, where M ≥ 1. Furthermore, each SM 310 includes a set of functional execution units (not shown), such as execution units and load-store units. Processing operations specific to any functional execution unit can be pipelined, allowing a new instruction to be issued for execution before a previous instruction has completed. Any combination of functional execution units within a given SM 310 can be provided. In various embodiments, functional execution units can be configured to support a variety of different operations, including integer and floating-point arithmetic (e.g., addition and multiplication), comparison operations, Boolean operations (AND, OR, XOR), bit shifting, and computation of various algebraic functions (e.g., planar interpolation and trigonometric, exponential, and logarithmic functions, etc.). Advantageously, the same functional execution unit can be configured to perform different operations.

[0058] In one embodiment, each SM 310 is configured to process one or more thread groups. As used herein, a "thread group" or "warp" refers to a group of threads that concurrently execute the same program on different input data, with each thread in the group assigned to a different execution unit within the SM 310. A thread group may include fewer threads than the number of execution units within the SM 310, in which case some execution units may be idle during cycles in which that thread group is being processed. A thread group may also include more threads than the number of execution units within the SM 310, in which case processing may occur over consecutive clock cycles. Since each SM 310 can support up to G thread groups concurrently, up to G*M thread groups can be executing in the GPC 208 at any given time.

[0059] Furthermore, in one embodiment, multiple related thread groups can be active simultaneously (in different phases of execution) within the SM 310. This collection of thread groups is referred to herein as a "cooperative thread array" ("CTA") or "thread array". The size of a particular CTA is equal to m*k, where k is the number of concurrently executing threads in a thread group, which is typically an integer multiple of the number of execution units within the SM 310, and m is the number of thread groups simultaneously active within the SM 310. In some embodiments, a single SM 310 can support multiple CTAs simultaneously, where a CTA is the granularity at which work is distributed to the SMs 310.
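The thread-group and CTA arithmetic described above can be illustrated with a short sketch. The function names and the sample numbers are hypothetical and chosen only for illustration; in particular, the cycle count is a simplified model that assumes one thread per execution unit per clock cycle, which is an assumption rather than a statement about any specific hardware.

```python
import math


def max_concurrent_thread_groups(groups_per_sm: int, num_sms: int) -> int:
    """Upper bound G*M on thread groups executing in one GPC."""
    return groups_per_sm * num_sms


def cycles_for_thread_group(threads_in_group: int, execution_units: int) -> int:
    """Simplified model: a group wider than the SM is processed over
    consecutive clock cycles; a narrower group leaves some units idle."""
    return math.ceil(threads_in_group / execution_units)


def cta_size(m_active_groups: int, k_threads_per_group: int) -> int:
    """CTA size m*k as defined in the text."""
    return m_active_groups * k_threads_per_group


if __name__ == "__main__":
    # Hypothetical values: G = 64 groups per SM, M = 4 SMs, 32-thread groups.
    print(max_concurrent_thread_groups(64, 4))
    print(cycles_for_thread_group(32, 16))
    print(cta_size(4, 32))
```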

[0060] In one embodiment, each SM 310 includes a Level 1 (L1) cache or uses space in a corresponding L1 cache external to the SM 310 to support, among other things, load and store operations performed by the execution units. Each SM 310 may also access a Level 2 (L2) cache (not shown), which is shared among all GPCs 208 in the PPU 202. The L2 cache can be used to transfer data between threads. Finally, the SM 310 may also access off-chip "global" memory, which may include PP memory 204 and/or system memory 104. It should be understood that any memory external to the PPU 202 can be used as global memory. Furthermore, as shown in Figure 3, a Level 1.5 (L1.5) cache 335 may be included within the GPC 208 and configured to receive and store data requested from memory by the SM 310 via the memory interface 214. Such data may include, but is not limited to, instructions, uniform data, and constant data. In embodiments where the GPC 208 has multiple SMs 310, the SMs 310 may advantageously share common instructions and data cached in the L1.5 cache 335.

[0061] In one embodiment, each GPC 208 may have an associated memory management unit (MMU) 320 configured to map virtual addresses to physical addresses. In various embodiments, the MMU 320 may reside within the GPC 208 or within the memory interface 214. The MMU 320 includes a set of page table entries (PTEs) for mapping a virtual address to a physical address of a tile or memory page, and optionally a cache line index. The MMU 320 may include an address translation lookaside buffer (TLB) or cache, which may reside within the SM 310, within one or more L1 caches, or within the GPC 208.

[0062] In one embodiment, in graphics and computing applications, the GPC 208 can be configured such that each SM 310 is coupled to a texture unit 315 for performing texture mapping operations, such as determining texture sample locations, reading texture data, and filtering texture data.

[0063] In one embodiment, each SM 310 sends processed tasks to the work assignment crossbar switch 330 in order to provide the processed tasks to another GPC 208 for further processing, or to store the processed tasks in an L2 cache (not shown), parallel processing memory 204, or system memory 104 via the crossbar switch unit 210. Furthermore, the pre-raster operation (preROP) unit 325 is configured to receive data from the SM 310, direct the data to one or more raster operation (ROP) units within the partition unit 215, perform color blending optimizations, organize pixel color data, and perform address translation.

[0064] It should be understood that the architecture described herein is illustrative and is subject to change and modification. Among other things, any number of processing units (such as SMs 310, texture units 315, or preROP units 325) may be included within the GPC 208. Furthermore, as described above in conjunction with Figure 2, the PPU 202 may include any number of GPCs 208, which are configured to be functionally similar to each other so that execution behavior does not depend on which GPC 208 receives a particular processing task. Furthermore, each GPC 208 operates independently of the other GPCs 208 in the PPU 202 to perform tasks for one or more applications.

[0065] Figure 4 is a block diagram illustrating an exemplary cloud computing system, according to various embodiments. As shown, computing system 400 includes one or more servers 402 that communicate with one or more client devices 404 via one or more networks 406. Each server 402 may include components, features, and/or functionality similar to those of the exemplary computer system 100 described above in conjunction with Figures 1-3. Each server 402 may be any technically feasible type of computer system, including but not limited to a server machine or a server platform. Each client device 404 may also include components, features, and/or functionality similar to those of the computer system 100, except that each client device 404 executes a client application 422 rather than the rendering application 130. Each client device 404 may be any technically feasible type of computer system, including but not limited to a desktop computer, a laptop computer, a handheld/mobile device, and/or a wearable device. In some embodiments, one or more of the servers 402 and/or client devices 404 may be replaced by one or more virtualized processing environments, such as environments provided by one or more VMs and/or containers running on one or more underlying hardware systems. The one or more networks 406 may include networks of any type, such as one or more local area networks (LANs) and/or wide area networks (WANs) (e.g., the Internet).

[0066] In some embodiments, one or more servers 402 may be included in a cloud computing system, such as a public cloud, a private cloud, or a hybrid cloud, and/or in a distributed system. For example, one or more servers 402 may implement a cloud-based gaming platform that provides a game streaming service, sometimes referred to as "cloud gaming," "on-demand gaming," or "games as a service." In this case, a game that is stored and executed on one or more servers 402 is streamed as video to a client device 404 via the client application 422 running on that client device 404. During a game session, the client application 422 processes user input and sends that input to the server 402 for in-game execution. Although a cloud-based gaming platform is described herein as a reference example, those skilled in the art will understand that, in general, one or more servers 402 can execute any technically feasible type of application, such as the design applications described above.

[0067] As shown, each of the one or more client devices 404 includes one or more input devices 426, a client application 422, a communication interface 420, and a display 424. The one or more input devices 426 may include any type of device for receiving user input, such as a keyboard, mouse, joystick, and/or game controller. The client application 422 receives input data in response to user input at the one or more input devices 426, transmits the input data to one or more servers 402 via the communication interface 420 (e.g., a network interface controller) and through the one or more networks 406 (e.g., the Internet), receives encoded display data from the server 402, and decodes and displays the data on the display 424 (e.g., a cathode ray tube, liquid crystal display, light-emitting diode display, etc.). Therefore, computationally intensive processing can be offloaded to the one or more servers 402. For example, a game session can be streamed from one or more servers 402 to one or more client devices 404, thereby reducing the graphics processing and rendering requirements of the one or more client devices 404.

[0068] As shown, each of the one or more servers 402 includes a communication interface 418, one or more CPUs 408, a parallel processing subsystem 410, a rendering component 412, a rendering capture component 414, and an encoder 416. Input data sent from a client device 404 to one of the servers 402 is received via the communication interface 418 (e.g., a network interface controller) and processed via the one or more CPUs 408 and/or the parallel processing subsystem 410 included in that server 402, which correspond, respectively, to the CPU 102 and the parallel processing subsystem 112 of the computer system 100 described above in conjunction with Figures 1-3. In some embodiments, the one or more CPUs 408 may receive the input data, process the input data, and send the data to the parallel processing subsystem 410. In turn, the parallel processing subsystem 410 renders one or more independent images and/or image frames, such as frames of a video game, based on the transmitted data.

[0069] For example, rendering component 412 employs parallel processing subsystem 112 to render the results of processing the input data, and rendering capture component 414 captures the rendering as display data (e.g., as image data capturing individual images and/or image frames). The rendering performed by rendering component 412 may include ray-traced or path-traced lighting and/or shadow effects computed using one or more parallel processing units of server 402 (such as GPUs, which may further utilize one or more dedicated hardware accelerators or processing cores to perform ray or path tracing techniques). In some embodiments, rendering component 412 performs rendering using the ray cone tracing techniques disclosed herein. Subsequently, encoder 416 encodes the display data capturing the rendering to generate encoded display data, which is transmitted via communication interface 418 through the one or more networks 406 to one or more client devices 404 for display to one or more users. In some embodiments, rendering component 412, rendering capture component 414, and encoder 416 may be included in the rendering application 130 described above in conjunction with Figure 1.

[0070] Returning to the cloud gaming example, during a game session, input data received by one of one or more servers 402 can represent a user character's movement, weapon firing, reloading, passing, vehicle turning, etc., within the game. In this case, rendering component 412 can generate a render of the game session representing the results of the input data, and rendering capture component 414 can capture the render of the game session as display data (e.g., as image data of the captured render frames of the game session). Parallel processing (e.g., GPU) resources can be dedicated to each game session, or resource scheduling techniques can be employed to share parallel processing resources across multiple game sessions. Furthermore, the ray cone tracing techniques disclosed herein can be used to render the game session. The rendered game session can then be encoded by encoder 416 to generate encoded display data, which is transmitted via one or more networks 406 to one or more client devices 404 for decoding and output via the display 424 of that client device 404.

[0071] It should be understood that the architecture described herein is illustrative and is subject to change and modification. Among other things, any number of processing units, such as the SMs 310, texture units 315, or preROP units 325 described above in conjunction with Figure 3, may be included within the GPC 208.

[0072] Texture Filtering Using Refracted Ray Cones

[0073] Figure 5 illustrates exemplary ray cones that are traced through a virtual 3D scene, according to various embodiments. As shown, a ray cone 500, which extends a ray 502, is traced through a pixel (not shown) in screen space into a scene that includes two objects 510 and 540. When the ray cone 500 hits an object at a hit point, the ray cone 500 can be reflected or refracted, depending on the material properties of the object and the curvature of the surface at the hit point.

[0074] As shown, object 510 is made of a medium that refracts the incident ray cone 500, while object 540 is made of a medium that neither reflects nor refracts the incident ray cone 530. For example, object 510 can be made of glass, while object 540 can be made of concrete.

[0075] Illustratively, the rendering application 130 traces the ray cone 500 to a hit point 506 on the object 510 and performs a texture filtering lookup based on the texture footprint associated with the ray cone 500 and the texture of the surface of the object 510. In some embodiments, performing a texture filtering lookup includes instructing a texture unit of the GPU (e.g., the texture unit 315 described above in conjunction with Figure 3) to perform texture filtering at the hit point 506 based on the texture of the object 510 at the hit point 506 and the texture footprint corresponding to the size of the ray cone 500 at the hit point 506. For example, the intersection of the ray cone 500 with the plane of the triangle at the hit point 506 (i.e., the plane passing through the vertices of that triangle) forms an ellipse, which can be used as the texture footprint during texture filtering. In particular, the rendering application 130 can input the major and minor axes of such an ellipse into a hardware-accelerated texture lookup unit of the GPU, which performs texture filtering based on these axes. Although texture filtering is primarily described herein, in some cases a texture can be sampled without performing texture filtering. For example, texture filtering may not be performed before refraction when the first hit of a ray cone along a path is refractive.

[0076] After performing the texture filtering lookup (or texture sampling), the rendering application 130 generates a refracted ray cone 520 and traces it to a hit point 524 on the other side of the object 510. It should be noted that generating a refracted ray cone differs from generating a reflected ray cone because, in the case of refraction, the refractive indices of the two media on opposite sides of the surface at the hit point are typically different, while in the case of reflection, the refractive indices are the same. In the case of refraction, the relative refractive index η of the two media affects the direction of the refracted ray, which can change not only the width of the refracted ray cone but also its geometry. Depending on the relative refractive index (i.e., whether the light refracts from an optically denser medium into an optically thinner medium, or vice versa), the refracted ray cone can shrink or expand. Furthermore, the centerline of the refracted ray cone can often differ from the direction of the refracted ray generated by refracting the central ray of the incident ray cone. However, as described in more detail below, some embodiments do not modify the refracted ray to account for this difference, because doing so would be computationally more expensive and might miss geometry that would otherwise be hit at certain angles.

[0077] In some embodiments, the rendering application 130 generates the refracted ray cone 520 by computing, in a 2D coordinate system, the direction of the central ray and the directions of the upper and lower sides of the refracted ray cone 520 and, given these directions, further computing the width and spread angle of the refracted ray cone 520, as described below in conjunction with Figures 6-11. After tracing the refracted ray cone 520 to the hit point 524, the rendering application 130 performs another texture filtering lookup based on the texture footprint associated with the refracted ray cone 520 and the texture of the surface of object 510. In some embodiments, isotropic texture filtering is performed before refraction, and anisotropic texture filtering is performed after refraction, for example, when the refracted ray cone 520 hits the other side of object 510 at the hit point 524. Anisotropic texture filtering (which is computationally more expensive than isotropic texture filtering) can be used to compensate for inaccuracies in the refracted ray cone 520, as described below in conjunction with Figure 7. In other embodiments, only isotropic texture filtering may be performed.

[0078] As shown, the rendering application 130 generates another refracted ray cone 530 and traces it to a hit point 534 on the object 540. The refracted ray cone 530 can be generated in a manner similar to the refracted ray cone 520. The rendering application 130 then performs another texture filtering lookup based on the texture footprint associated with the refracted ray cone 530 and the texture of the surface of the object 540. As described above, anisotropic texture filtering can be performed in some embodiments.

[0079] The results of the texture filtering lookups described above can be used to render, on the surface of object 510, a combination of the textures associated with the surface of object 510 at hit points 506 and 524 and the texture associated with the surface of object 540 at hit point 534.

[0080] Figure 6 illustrates an approach for computing a refracted ray cone, according to various embodiments. As shown, a ray cone 600 can be defined by: an origin 602, denoted O; an initial width, denoted w; a spread angle, denoted α1, which indicates how the width changes as the central ray 603 of the ray cone 600 travels through the scene; and a direction vector 604, denoted d, which indicates the direction of the central ray 603. Illustratively, the ray cone 600 hits an object 610 having a curved surface at a hit point 616 (denoted P). The ray cone 600 is then refracted into a refracted ray cone 634.

[0081] In some embodiments, the rendering application 130 first computes, in a 2D coordinate system: (1) a direction vector 630 (denoted t), which indicates the direction of the central ray of the refracted ray cone 634; (2) an upper direction vector 632 (denoted t_u), which is associated with one side of the refracted ray cone 634 in 2D; and (3) a lower direction vector 628 (denoted t_l), which is associated with the other side of the refracted ray cone 634 in 2D. The rendering application 130 then computes the width of the refracted ray cone 634 based on the distances, measured from the hit point 616 (P) along directions orthogonal to the direction vector 630 (t), to the lines 631 and 629 defined by the upper direction vector 632 (t_u) and the lower direction vector 628 (t_l), respectively. Furthermore, the rendering application 130 can compute the half cone angle of the refracted ray cone 634, denoted α2, based on the upper direction vector 632 (t_u) and the lower direction vector 628 (t_l).

[0082] More formally, the curvature of the surface of object 610 at the hit point 616 (P) can be modeled as a signed angle, denoted β. If the surface is convex at the hit point 616 (P), β is positive; if the surface is concave at the hit point 616 (P), β is negative. The rotated normal vector 622 (denoted n_l) at a hit point 618 (denoted P_l) and the rotated normal vector 626 (denoted n_u) at another hit point 620 (denoted P_u) are generated by rotating the vector 624 (denoted n), which is perpendicular to the surface of the object at the hit point 616 (P), by the angle β in opposite directions. As described, the computations for generating the refracted ray cone 634, including determining the rotated normal vectors 622 (n_l) and 626 (n_u), can be performed in two dimensions. In some embodiments, the 2D coordinate system is defined by: the hit point 616 (P), which serves as the origin; the direction of the normal vector 624 (n), which is perpendicular to the surface of object 610 and serves as the y-axis; and the direction of a tangent vector 614 (denoted m), which is orthogonal to the normal vector 624 (n), points along the component of the direction vector 604 (d) that is tangent to the surface, and serves as the x-axis. In some embodiments, the tangent vector 614 (m) can be computed using projection and normalization, as follows:

[0083] m = (d − (d · n) n) / ‖d − (d · n) n‖ (1)

[0084] The normal vector 624(n) and the tangent vector 614(m) together form the basis vectors of the plane. The following discussion assumes that all vectors and points exist in the above 2D coordinate system.
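The construction of the 2D coordinate system can be sketched in a few lines. This is a minimal illustration, assuming 3D vectors stored as tuples, a unit-length normal n, and hypothetical helper names (`tangent_from_direction`, `to_2d`) that do not come from the disclosure; it simply projects d onto the plane perpendicular to n and normalizes the result.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def tangent_from_direction(d, n):
    """Tangent m: the part of d perpendicular to the unit normal n, normalized."""
    dn = dot(d, n)
    projected = tuple(di - dn * ni for di, ni in zip(d, n))
    length = math.sqrt(dot(projected, projected))
    if length < 1e-12:
        # d is (anti)parallel to n, so any tangent would do; left to the caller.
        raise ValueError("d is parallel to n; tangent is undefined")
    return tuple(p / length for p in projected)


def to_2d(v, m, n):
    """Express a 3D vector in the (m, n) basis: x along m, y along n."""
    return (dot(v, m), dot(v, n))


if __name__ == "__main__":
    d = normalize((1.0, -1.0, 0.0))   # incoming direction, heading into the surface
    n = (0.0, 1.0, 0.0)               # surface normal at the hit point
    m = tangent_from_direction(d, n)
    print(m, to_2d(d, m, n))
```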

[0085] In some embodiments, the upper direction vector 608 (denoted d_u) is obtained by rotating the direction vector 604 (d), which is associated with the central ray 603 of the ray cone 600, by the spread angle +α1 of the ray cone 600. Similarly, the lower direction vector 606 (denoted d_l) is obtained by rotating the direction vector 604 (d) by the negative spread angle −α1. As shown, the upper ray 607 associated with one side of the ray cone 600 has the direction indicated by the upper direction vector 608 (d_u) and starts at a point that is offset from the origin 602 (O) by half the initial width, w/2, in a direction orthogonal to the direction vector 604 (d). Similarly, the lower ray 605 associated with the other side of the ray cone 600 has the direction indicated by the lower direction vector 606 (d_l) and starts at a point that is offset from the origin 602 (O) by half the initial width, w/2, in the opposite direction orthogonal to the direction vector 604 (d). The upper ray 607 and the lower ray 605 are traced through the scene until they intersect the x-axis of the 2D coordinate system at the points P_u and P_l, respectively.
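The edge-ray construction above can be sketched as follows. This is an illustration only: the helper names are hypothetical, rotations are taken counter-clockwise for positive angles, and the upper ray is assumed to be offset toward d rotated by +90 degrees. The disclosure does not fix these orientation conventions, so they are assumptions.

```python
import math


def rotate(v, angle):
    """Rotate a 2D vector counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def edge_rays(origin, d, width, alpha1):
    """Return ((o_u, d_u), (o_l, d_l)) for the upper and lower rays of a cone."""
    perp = (-d[1], d[0])  # assumption: "upper" side is d rotated by +90 degrees
    half = 0.5 * width
    o_u = (origin[0] + half * perp[0], origin[1] + half * perp[1])
    o_l = (origin[0] - half * perp[0], origin[1] - half * perp[1])
    return (o_u, rotate(d, +alpha1)), (o_l, rotate(d, -alpha1))


def hit_x_axis(origin, direction):
    """Intersect a ray with the x-axis (y = 0); None if it never reaches it."""
    if abs(direction[1]) < 1e-12:
        return None
    s = -origin[1] / direction[1]
    if s < 0.0:
        return None
    return (origin[0] + s * direction[0], 0.0)


if __name__ == "__main__":
    # Cone starting one unit above the surface, pointing straight down.
    (o_u, d_u), (o_l, d_l) = edge_rays((0.0, 1.0), (0.0, -1.0), 0.2, 0.1)
    print(hit_x_axis(o_u, d_u), hit_x_axis(o_l, d_l))  # P_u and P_l
```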

[0086] Given the direction vector 604 (d) associated with an incident ray, the rendering application 130 determines whether refraction should occur. In some embodiments, whether refraction occurs is determined based on whether the angle of incidence formed by the ray and the surface of the object 610 is greater than a critical angle associated with the medium of the object 610 and the medium surrounding the object 610. When the critical angle is exceeded and the ray is attempting to travel from an optically denser medium into an optically thinner medium, total internal reflection occurs instead of refraction.
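A small sketch of the critical-angle test follows. It assumes the angle of incidence θ1 is measured from the surface normal (the convention used by Snell's law) and that n1 is the refractive index on the incident side; the function names are hypothetical.

```python
import math


def critical_angle(n1, n2):
    """Critical angle in radians, or None when no total internal reflection
    is possible (light entering an optically denser or equal medium)."""
    if n1 <= n2:
        return None
    return math.asin(n2 / n1)


def is_total_internal_reflection(theta1, n1, n2):
    """True if a ray with incidence angle theta1 (radians, from the normal)
    is totally internally reflected rather than refracted."""
    theta_c = critical_angle(n1, n2)
    return theta_c is not None and theta1 > theta_c


if __name__ == "__main__":
    # Glass (1.5) to air (1.0): the critical angle is roughly 41.8 degrees.
    print(math.degrees(critical_angle(1.5, 1.0)))
    print(is_total_internal_reflection(math.radians(45.0), 1.5, 1.0))
```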

[0087] When refraction occurs, the rendering application 130 computes the direction vector 630 (t), the upper direction vector 632 (t_u), and the lower direction vector 628 (t_l) associated with the refracted ray cone 634. Specifically, the direction vector 630 (t) can be computed using Snell's law, based on the direction vector 604 (d) associated with the central ray 603 of the ray cone 600 and the refractive indices of the media on either side of the hit point 616 (P):

[0088] n1 sin θ1 = n2 sin θ2 (2)

[0089] where n1 and n2 are the refractive indices of the two media, θ1 is the angle of incidence of the central ray 603 of the ray cone 600, and θ2 is the angle of refraction of the central ray of the refracted ray cone 634. The upper direction vector 632 (t_u) can be computed using Snell's law based on: the upper direction vector 608 (d_u) associated with one side of the ray cone 600; the refractive indices n1 and n2 of the media on either side of the hit point 616 (P); and the rotated normal vector 626 (n_u), which is computed based on the curvature of the surface of object 610 at the hit point 616 (P). Specifically, the angle of refraction (θ2 in equation (2)) of the upper direction vector 632 (t_u) is computed relative to the rotated normal vector 626 (n_u). Similarly, the lower direction vector 628 (t_l) can be computed using Snell's law based on: the lower direction vector 606 (d_l) associated with the other side of the ray cone 600; the refractive indices n1 and n2 of the media on either side of the hit point 616 (P); and the rotated normal vector 622 (n_l), which is computed based on the curvature of the surface of object 610 at the hit point 616 (P), where the angle of refraction of the lower direction vector 628 (t_l) is computed relative to the rotated normal vector 622 (n_l).
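The three applications of Snell's law can be sketched in 2D as shown below. This is a hedged illustration: `refract_2d` is the standard vector form of Snell's law, not code taken from the disclosure, and the direction in which each normal is rotated by β (here −β for n_u and +β for n_l) is an assumption about orientation that the text leaves to the figure.

```python
import math


def rotate(v, angle):
    """Rotate a 2D vector counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def refract_2d(d, n, n1, n2):
    """Refract unit direction d at a surface with unit normal n facing the
    incident side (d . n < 0). Returns None on total internal reflection."""
    eta = n1 / n2
    cos_i = -(d[0] * n[0] + d[1] * n[1])
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None
    k = eta * cos_i - math.sqrt(1.0 - sin2_t)
    return (eta * d[0] + k * n[0], eta * d[1] + k * n[1])


def refract_cone_directions(d, alpha1, beta, n1, n2):
    """Return (t, t_u, t_l) in the 2D frame where the normal n is the y-axis."""
    n = (0.0, 1.0)
    d_u, d_l = rotate(d, +alpha1), rotate(d, -alpha1)
    n_u = rotate(n, -beta)  # assumption: upper-side normal tilts toward +x when convex
    n_l = rotate(n, +beta)  # assumption: lower-side normal tilts toward -x when convex
    return (refract_2d(d, n, n1, n2),
            refract_2d(d_u, n_u, n1, n2),
            refract_2d(d_l, n_l, n1, n2))


if __name__ == "__main__":
    d = (math.sin(math.radians(30.0)), -math.cos(math.radians(30.0)))
    print(refract_cone_directions(d, 0.02, 0.05, 1.0, 1.5))
```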

[0090] The rendering application 130 then computes the width of the refracted ray cone 634 based on the distances, measured from the hit point 616 (P) along directions orthogonal to the direction vector 630 (t), to the lines 631 and 629 defined by the upper direction vector 632 (t_u) and the lower direction vector 628 (t_l), respectively. In some embodiments, the width of the refracted ray cone 634 is computed as:

[0091] w = w_u + w_l (3)

[0092] where w_u is computed as the length, along a direction orthogonal to the direction vector 630 (t), from the hit point 616 (P) to the line 631 defined by the upper direction vector 632 (t_u), and w_l is computed as the length, along the opposite direction orthogonal to the direction vector 630 (t), from the hit point 616 (P) to the line 629 defined by the lower direction vector 628 (t_l). In this case, a line along the direction orthogonal to the direction vector 630 (t) can be intersected with the line 631 defined by the upper direction vector 632 (t_u), and a line along the opposite direction orthogonal to the direction vector 630 (t) can be intersected with the line 629 defined by the lower direction vector 628 (t_l), in order to compute the lengths w_u and w_l.
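The width computation of equation (3) can be sketched with two line intersections. The sketch assumes that line 631 passes through P_u with direction t_u, that line 629 passes through P_l with direction t_l, that P is the origin of the 2D frame, and that the "upper" orthogonal direction is t rotated by +90 degrees; these are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def cross(a, b):
    """z-component of the 2D cross product."""
    return a[0] * b[1] - a[1] * b[0]


def refracted_cone_width(t, p_u, t_u, p_l, t_l):
    """Width w = w_u + w_l measured through P = (0, 0), orthogonally to t."""
    q = (-t[1], t[0])  # assumption: upper side lies along t rotated by +90 degrees
    # Solve s*q = p_u + r*t_u for s (distance along +q to line 631).
    w_u = cross(p_u, t_u) / cross(q, t_u)
    # Solve -s*q = p_l + r*t_l for s (distance along -q to line 629).
    w_l = -cross(p_l, t_l) / cross(q, t_l)
    return w_u + w_l


if __name__ == "__main__":
    # Parallel edges refracted at 45 degrees: the width shrinks by cos(45 deg).
    c = math.sqrt(0.5)
    t = (c, -c)
    print(refracted_cone_width(t, (0.1, 0.0), t, (-0.1, 0.0), t))
```

A denominator of zero means an edge line is orthogonal to t, which would not occur for a well-formed cone, so the sketch leaves that case unhandled.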

[0093] Furthermore, the rendering application 130 can compute the half cone angle of the refracted ray cone 634, denoted α2, as half the angle between the upper direction vector 632 (t_u) and the lower direction vector 628 (t_l). It should be noted that the refracted ray cone 634 can expand or contract. In some embodiments, the half cone angle α2 is computed together with a sign that indicates whether the refracted ray cone 634 expands or contracts:

[0094]

[0095] In equation (4), the term that is part of the cross product in 2D indicates, by its sign, whether the refracted ray cone 634 expands or contracts.
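A signed half cone angle of this kind can be sketched as half the angle between t_u and t_l, with the sign taken from the 2D cross product of the two vectors. The exact form of equation (4) is not reproduced here; the operand order of the cross product below is an assumption chosen so that a positive result means the cone expands when t_u is rotated counter-clockwise from the central direction.

```python
import math


def signed_half_cone_angle(t_u, t_l):
    """Half the angle between unit vectors t_u and t_l, signed by the 2D
    cross product (positive = expanding under the assumed convention)."""
    cos_angle = max(-1.0, min(1.0, t_u[0] * t_l[0] + t_u[1] * t_l[1]))
    cross_z = t_l[0] * t_u[1] - t_l[1] * t_u[0]  # assumption: order (t_l, t_u)
    if cross_z > 0.0:
        sign = 1.0
    elif cross_z < 0.0:
        sign = -1.0
    else:
        sign = 0.0
    return 0.5 * math.acos(cos_angle) * sign


if __name__ == "__main__":
    a = 0.1
    expanding = ((math.sin(a), -math.cos(a)), (-math.sin(a), -math.cos(a)))
    print(signed_half_cone_angle(*expanding))          # positive: cone widens
    print(signed_half_cone_angle(*expanding[::-1]))    # negative: cone narrows
```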

[0096] Although the central ray 603, the upper ray 607, and the lower ray 605 are described above in conjunction with Figure 6 as being refracted, in some cases one or more of these rays can be totally internally reflected. As described above, total internal reflection occurs when (1) the angle of incidence of a ray with respect to the surface of an object is greater than the critical angle associated with the medium of the object and the surrounding medium, and (2) the ray is attempting to travel from an optically denser medium into an optically thinner medium. In some embodiments, when the central ray of a ray cone is totally internally reflected, the rendering application 130 generates a reflected ray cone instead of a refracted ray cone. The reflected ray cone can be generated in any technically feasible manner, including using well-known techniques. On the other hand, when the upper ray is totally internally reflected (but the central ray is not), the rendering application 130 generates a refracted ray cone according to the techniques described above, except that the upper direction vector 632 (t_u) of the refracted ray cone is computed as follows:

[0097]

[0098] Alternatively, n can be used instead of n_u in equation (5). Similarly, when the lower ray is totally internally reflected (but the central ray is not), the rendering application 130 can generate a refracted ray cone according to the techniques described above, except that the lower direction vector 628 (t_l) of the refracted ray cone is computed as follows:

[0099]

[0100] Alternatively, n can be used instead of n_l in equation (6).

[0101] Although perfect refraction is primarily described herein, the techniques disclosed herein can also be used for rough refraction when, for example, a microfacet-based bidirectional transmittance distribution function (BTDF) is used to generate random refraction directions based on surface roughness. In some embodiments, the half vector between the incident direction and the refracted direction can be used as the normal, since the half vector is the normal of the microfacet that refracts the ray. In this case, the half vector can be generated by a stochastic evaluation technique for the microfacet BTDF. In other embodiments, the half vector can be computed using the following equation:

[0102]

[0103] Figure 7 illustrates an approximation of a refracted ray cone, according to various embodiments. As shown, a refracted ray cone 704, which is generated using the approach described above in conjunction with Figure 6, approximates the refracted ray cone 702 that is formed by refracting multiple rays 710_i (referred to herein collectively as rays 710 and individually as a ray 710) of a ray cone 700. In particular, the refracted ray cone 704 grows at the same rate as the refracted ray cone 702. As a result, the refracted ray cone 704 can be used to approximate the footprint of the refracted ray cone 702 without changing the refraction direction of the central ray of the ray cone 700, which becomes the central ray of the refracted ray cone 704.

[0104] Illustratively, the geometry of the refracted ray cone 702 differs significantly from that of the ray cone 700, and the refracted ray cone 704 only approximates the geometry of the refracted ray cone 702. In some embodiments, the rendering application 130 performs isotropic texture filtering before refraction and anisotropic texture filtering after refraction to compensate for the approximation made by using refracted ray cones. For example, anisotropic filtering can be performed to compensate for inaccuracies in the refracted ray cone 704 relative to the refracted ray cone 702. In other embodiments, only isotropic texture filtering may be performed. In some embodiments, a texture can be sampled without performing texture filtering under certain circumstances. For example, texture filtering may not be performed before refraction when the first hit of a ray cone along a path is refractive.

[0105] Figure 8A illustrates an exemplary image 800 rendered using refracted ray cones and isotropic texture filtering, according to various embodiments. As shown, image 800 depicts a virtual 3D scene that includes glass objects, and image 800 is rendered from a viewpoint such that most of the scene is seen through the glass objects. Zoomed-in views 802 and 804 of two regions in image 800 are also shown.

[0106] Figure 8B illustrates an exemplary image 810 rendered using refracted ray cones and anisotropic texture filtering, according to various other embodiments. As shown, the exemplary image 810 is rendered by performing anisotropic texture filtering after refraction, as described above in conjunction with Figures 5 and 7. In some embodiments, anisotropic texture filtering can be performed after refraction to compensate for the approximation made by using refracted ray cones. Zoomed-in views 812 and 814 of two regions in image 810 are also shown.

[0107] Figure 8C illustrates an exemplary ground truth image 820, according to the prior art. The ground truth image 820 is generated by stochastically sampling the screen-space ray cone of each pixel, using 10K samples per pixel. Zoomed-in views 822 and 824 of two regions within the ground truth image 820 are also shown.

[0108] As shown in Figures 8A-8C, image 810, rendered using refracted ray cones and anisotropic texture filtering, is closer to ground truth image 820 than image 800, rendered using refracted ray cones and isotropic texture filtering. Furthermore, the result produced by anisotropic texture filtering is closer to the ground truth in the region of zoomed-in view 812 than in the region of zoomed-in view 814, because a ray cone models only a cone, whereas the region in zoomed-in view 814 requires more than a cone.

[0109] Figure 9A shows an exemplary image 900 rendered using refracted ray cones and isotropic texture filtering, according to various embodiments, while Figure 9B shows an exemplary image 910 rendered using full-resolution textures (mip0), according to the prior art. As shown, images 900 and 910 are rendered using 1000 samples per pixel (SPP). On the same hardware, image 900, rendered using refracted ray cones and isotropic texture filtering, can be rendered faster than image 910, rendered using full-resolution textures. For example, experience has shown that the ray cone tracing described above in conjunction with Figures 5-7, using isotropic texture filtering, is typically about 10% faster than differential ray tracing using isotropic texture filtering, and about 12% faster than differential ray tracing using anisotropic texture filtering. Furthermore, image 900 rendered using refracted ray cones and isotropic texture filtering is visually almost identical to image 910 rendered using full-resolution textures.

[0110] Figure 10 is a flowchart of method steps for calculating pixel color using a ray cone tracing technique that implements refracted ray cones, according to various embodiments. Although the method steps are described in conjunction with the systems of Figures 1-4, those skilled in the art will understand that any system configured to perform the method steps, in any order, falls within the scope of the present embodiments. Although described with respect to tracing a single ray cone, the method steps can be repeated to trace multiple ray cones when rendering an image.

[0111] As shown in the figure, method 1000 begins at step 1002, where the rendering application 130 traces a ray cone through the scene until the ray cone intersects with geometry within the scene at a hit point. Specifically, the ray cone can be traced into the scene via pixels in screen space until it intersects with a triangle in the geometry at the hit point.

[0112] In step 1004, the rendering application 130 determines whether the geometry surface at the hit point is textured. If the geometry surface at the hit point is textured, then in step 1006, the rendering application 130 causes the GPU's texture units to perform texture filtering based on one or more textures associated with the geometry surface. Any technically feasible texture filtering can be performed, such as isotropic or anisotropic texture filtering. In some embodiments, isotropic texture filtering is performed before refraction and anisotropic texture filtering is performed after refraction to compensate for defects in the refracted ray cone. In some embodiments, textures can be sampled without performing texture filtering in certain cases. For example, texture filtering cannot be performed before refraction when the first hit of the ray cone along the path is refraction.

[0113] In step 1008, the rendering application 130 receives filtered texture values from the GPU's texture units. The filtered texture values represent the texture color associated with the pixel in screen space through which the ray cone is traced in step 1002.

[0114] In step 1010, the rendering application 130 applies or accumulates the filtered texture values for the pixel through which the ray cone was traced in step 1002. The applied or accumulated filtered texture values contribute to the color of the pixel in the rendered image. As described, the rendered image can be, for example, an image or frame from a video game or movie, an image generated by an architecture or design application, or an image generated by any other application. Although this document describes the application or accumulation of anisotropically filtered texture values to pixels, in other embodiments, filtered texture values can be used in any technically feasible manner.

[0115] In step 1012, the rendering application 130 determines whether refraction has occurred. In some embodiments, determining whether refraction has occurred includes determining whether the angle of incidence formed by the intermediate ray of the ray cone and the surface of the object hit by the ray cone is greater than a critical angle associated with the medium of the object and the surrounding medium when the intermediate ray attempts to travel from an optically denser medium to an optically thinner medium. Beyond the critical angle, the light is totally internally reflected, rather than refracted. Determining whether refraction has occurred may further include determining whether the intermediate ray is reflected when it attempts to travel from an optically thinner medium to an optically denser medium.
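By way of illustration only, the critical-angle portion of the determination in step 1012 can be sketched as follows. The function name `refraction_occurs` and its parameters are hypothetical and are not taken from this disclosure. The sketch covers only total internal reflection when traveling from an optically denser medium to an optically thinner medium; it does not model the further case in which a ray is reflected when entering a denser medium.

```python
import math

def refraction_occurs(cos_incident: float, n_from: float, n_to: float) -> bool:
    """Return True if the ray can refract, False if it is totally internally reflected.

    cos_incident: cosine of the angle between the incident ray and the surface
                  normal, in the range [0, 1] (hypothetical parameter).
    n_from, n_to: refractive indices of the medium the ray leaves and the
                  medium it attempts to enter (hypothetical parameters).
    """
    if n_from <= n_to:
        # Thinner-to-denser transition: no critical angle exists.
        return True
    # Denser-to-thinner transition: sin(critical angle) = n_to / n_from.
    sin_critical = n_to / n_from
    sin_incident = math.sqrt(max(0.0, 1.0 - cos_incident * cos_incident))
    return sin_incident <= sin_critical
```

For glass to air (indices 1.5 and 1.0), the critical angle is roughly 41.8 degrees, so an incident angle of 30 degrees refracts while an incident angle of 60 degrees is totally internally reflected.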

[0116] If refraction occurs, in step 1014, the rendering application 130 generates a refracted ray cone. Figure 11 is a flowchart of method steps for generating a refracted ray cone in step 1014, according to various embodiments. Although the method steps are described in conjunction with the systems of Figures 1-4, those skilled in the art will understand that any system configured to perform the method steps, in any order, falls within the scope of the present embodiments.

[0117] As shown in the figure, in step 1102, the rendering application 130 calculates the direction of the intermediate ray of the refracted ray cone based on the direction of the intermediate ray of the incident ray cone. As described, in some embodiments, Snell's law can be used to calculate the direction of the intermediate ray of the refracted ray cone.
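As a minimal sketch of step 1102, assuming the vector form of Snell's law commonly used in ray tracers, the direction of the refracted intermediate ray might be computed as below. The function name `refract`, the tuple representation of vectors, and the convention that the normal faces against the incident direction are assumptions made for illustration.

```python
import math

def refract(d, n, eta):
    """Refract unit direction d at a surface with unit normal n.

    d:   unit incident direction, pointing toward the surface.
    n:   unit surface normal, facing against d (dot(d, n) <= 0).
    eta: ratio n_from / n_to of the refractive indices.
    Returns the unit refracted direction, or None on total internal reflection.
    """
    cos_i = -(d[0] * n[0] + d[1] * n[1] + d[2] * n[2])
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if k < 0.0:
        return None  # Beyond the critical angle: no refracted ray exists.
    c = eta * cos_i - math.sqrt(k)
    return tuple(eta * di + c * ni for di, ni in zip(d, n))
```

At normal incidence the direction is unchanged, and for a 45-degree incidence from air into glass (eta = 1/1.5) the tangential component of the result equals sin(45°)/1.5, as Snell's law requires.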

[0118] In step 1104, the rendering application 130 calculates a 2D coordinate system. As described, in some embodiments, the 2D coordinate system can be defined using the point of impact as the origin, the direction of the vector perpendicular to the surface of the object hit by the ray cone as one axis, and the direction of the vector that lies in the same plane as the middle ray of the ray cone and is tangent to the surface of the object as another axis.
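One way the 2D coordinate system of step 1104 could be constructed is sketched below: the normal serves as one axis, and the other axis is obtained by projecting the direction of the intermediate ray onto the tangent plane at the hit point. The helper names (`build_2d_frame`, `to_2d`) and the fallback used when the ray is parallel to the normal are assumptions for illustration, not part of this disclosure.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def build_2d_frame(d, n):
    """Return (x_axis, y_axis) as 3D unit vectors; the origin is the hit point.

    d: unit direction of the intermediate ray of the incident ray cone.
    n: unit surface normal at the hit point (used as the y axis).
    """
    # Project d onto the tangent plane: remove its component along n.
    proj = tuple(di - dot(d, n) * ni for di, ni in zip(d, n))
    if dot(proj, proj) < 1e-12:
        # Assumed fallback: d is parallel to n, so any tangent direction works.
        helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
        proj = tuple(hi - dot(helper, n) * ni for hi, ni in zip(helper, n))
    return normalize(proj), tuple(n)

def to_2d(v, frame):
    """Express a 3D vector lying in the frame's plane in 2D coordinates."""
    x_axis, y_axis = frame
    return (dot(v, x_axis), dot(v, y_axis))
```

Because the intermediate ray lies in the plane spanned by the two axes, its direction is represented exactly by the 2D coordinates returned from `to_2d`.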

[0119] In step 1106, the rendering application 130 calculates an upper hit point and the direction of one side of the refracted ray cone in the 2D coordinate system. In some embodiments, the rendering application 130 may first determine whether the upper ray associated with the corresponding side of the incident ray cone is refracted or totally internally reflected. If the upper ray is refracted, then the direction of one side of the refracted ray cone can be calculated based on the direction of the corresponding side of the incident ray cone and a rotation normal vector, as described above in conjunction with Figure 6. Furthermore, the rotation normal vector can be calculated based on the curvature of the object surface hit by the ray cone. On the other hand, if the upper ray undergoes total internal reflection, the direction of one side of the refracted ray cone can be calculated according to Equation 5.

[0120] In step 1108, the rendering application 130 calculates a lower hit point and the direction of the other side of the refracted ray cone in the 2D coordinate system. Similar to step 1106, in some embodiments, the rendering application 130 can determine whether the lower ray associated with the other corresponding side of the incident ray cone is refracted or totally internally reflected. If the lower ray is refracted, then the direction of the other side of the refracted ray cone can be calculated based on the direction of the other corresponding side of the incident ray cone and another rotation normal vector, as described above in conjunction with Figure 6. On the other hand, if the lower ray is totally internally reflected, then the direction of the other side of the refracted ray cone can be calculated according to Equation 6.

[0121] In step 1110, the rendering application 130 calculates the width associated with the refracted ray cone based on the hit point at which the intermediate ray of the incident ray cone hits the object, the direction of the intermediate ray of the refracted ray cone, and the rays associated with the directions of the two sides of the refracted ray cone. In some embodiments, the width of the refracted ray cone is calculated based on the distance from the hit point, along a direction orthogonal to the intermediate ray of the refracted ray cone, to the rays defined by the direction vectors associated with the two sides of the refracted ray cone, as described above in conjunction with Figure 6.
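The width calculation of step 1110 can be illustrated with the following sketch, which is one plausible reading of the description rather than the formula of Figure 6. It works in the 2D coordinate system with the hit point at the origin, intersects the line through the origin orthogonal to the refracted intermediate ray with each side ray, and takes the distance between the two intersections as the width. All names are hypothetical.

```python
def cross2(a, b):
    """2D cross product (z component of the 3D cross product)."""
    return a[0] * b[1] - a[1] * b[0]

def signed_offset(t, ray_origin, ray_dir):
    """Signed distance from the origin, along the direction orthogonal to t,
    to the line through ray_origin with direction ray_dir.

    t must be a unit vector so that the returned value is a distance.
    """
    p = (-t[1], t[0])  # t rotated by +90 degrees
    denom = cross2(p, ray_dir)
    if abs(denom) < 1e-12:
        raise ValueError("side ray is orthogonal to the intermediate ray")
    return cross2(ray_origin, ray_dir) / denom

def refracted_cone_width(t, upper_origin, upper_dir, lower_origin, lower_dir):
    """Width of the refracted ray cone at the hit point (assumed definition)."""
    upper = signed_offset(t, upper_origin, upper_dir)
    lower = signed_offset(t, lower_origin, lower_dir)
    return abs(upper - lower)
```

For example, with an intermediate ray pointing straight down and two parallel side rays starting half a unit to either side of the hit point, the sketch reports a width of one unit.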

[0122] In step 1112, the rendering application 130 calculates the spread angle associated with the refracted ray cone based on the directions of the two sides of the refracted ray cone. In some embodiments, the spread angle may be calculated as a half-cone angle together with a sign indicating whether the refracted ray cone is expanding or contracting, as described above in conjunction with Figure 6.
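A signed half-cone angle of the kind described in step 1112 might be computed as in the sketch below. The sign convention is an assumption: the upper side is taken to lie on the side of the intermediate ray obtained by rotating that ray by +90 degrees in the 2D coordinate system, so a positive result means the two sides diverge and a negative result means they converge.

```python
import math

def signed_half_angle(upper_dir, lower_dir):
    """Signed half-cone angle, in radians, between two 2D side directions.

    Positive when the sides diverge (expanding cone), negative when they
    converge (contracting cone), under the assumed orientation convention.
    """
    cross = lower_dir[0] * upper_dir[1] - lower_dir[1] * upper_dir[0]
    dot = lower_dir[0] * upper_dir[0] + lower_dir[1] * upper_dir[1]
    # atan2 yields the signed angle from lower_dir to upper_dir.
    return 0.5 * math.atan2(cross, dot)
```

With side directions tilted 0.05 radians to either side of a downward intermediate ray, the sketch returns +0.05 when the sides diverge and -0.05 when the two directions are swapped.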

[0123] Returning to Figure 10, if the rendering application 130 determines in step 1012 that refraction does not occur, then method 1000 continues to step 1016, where the rendering application 130 determines whether reflection occurs based on whether the surface of the object hit by the ray cone is reflective.

[0124] If no reflection occurs, method 1000 ends. Alternatively, if a reflection occurs, method 1000 continues to step 1018, where the rendering application 130 generates a reflected ray cone. The reflected ray cone can be generated in any technically feasible manner, including using well-known techniques.

[0125] Method 1000 then returns to step 1002, where the rendering application 130 traces the (reflected or refracted) ray cone through the scene until the ray cone intersects geometry in the scene at another hit point.
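The control flow of method 1000 can be summarized in the following sketch. It is a simplified illustration under stated assumptions: the `Hit` record, the callback parameters, the scalar color, the handling of a ray cone that misses all geometry, and the `max_bounces` limit are all hypothetical and are not specified in this disclosure. Texture filtering by the GPU's texture units is represented by a precomputed `texel` value.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Hit:
    textured: bool      # step 1004: is the surface at the hit point textured?
    refracts: bool      # step 1012: does refraction occur?
    reflects: bool      # step 1016: is the surface reflective?
    texel: float = 0.0  # stand-in for the filtered texture value (steps 1006-1008)

def shade_pixel(trace: Callable[[object], Optional[Hit]],
                cone: object,
                make_refracted: Callable[[object, Hit], object],
                make_reflected: Callable[[object, Hit], object],
                max_bounces: int = 8) -> float:
    color = 0.0
    for _ in range(max_bounces):
        hit = trace(cone)                      # step 1002
        if hit is None:                        # assumed: the cone left the scene
            break
        if hit.textured:                       # step 1004
            color += hit.texel                 # steps 1006-1010
        if hit.refracts:                       # step 1012
            cone = make_refracted(cone, hit)   # step 1014 (Figure 11)
        elif hit.reflects:                     # step 1016
            cone = make_reflected(cone, hit)   # step 1018
        else:
            break                              # no reflection: method ends
    return color
```

A scripted sequence of hits (a textured refractive surface, an untextured mirror, then a textured diffuse surface) accumulates the two texture contributions and terminates at the third hit.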

[0126] Although this document primarily describes refraction at surfaces, in some cases, the incident ray may be below the perturbed surface normal. To address such situations, in some embodiments, the incident vector associated with the ray may be clamped to be at most 90 degrees from the surface normal.
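One possible interpretation of this clamping is sketched below. It assumes the incident vector is expressed as a unit vector pointing from the hit point back toward the origin of the ray, and that clamping means projecting the vector onto the tangent plane whenever it is more than 90 degrees from the normal. The function name and the handling of the degenerate antiparallel case are assumptions.

```python
import math

def clamp_to_hemisphere(v, n):
    """Clamp unit vector v to be at most 90 degrees from unit normal n.

    v: unit vector from the hit point back toward the ray origin (assumed).
    n: unit (possibly perturbed) surface normal.
    """
    c = sum(a * b for a, b in zip(v, n))
    if c >= 0.0:
        return tuple(v)  # Already within 90 degrees of the normal.
    # Remove the component along n, leaving a vector exactly 90 degrees from n.
    t = tuple(a - c * b for a, b in zip(v, n))
    length = math.sqrt(sum(a * a for a in t))
    if length < 1e-12:
        raise ValueError("v is antiparallel to n; no unique clamped direction")
    return tuple(a / length for a in t)
```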

[0127] In summary, the disclosed technique provides an improved ray cone tracing technique that implements refracted ray cones. In the improved ray cone tracing technique, when a ray cone being traced through a virtual 3D scene hits the surface of geometry in the scene and is refracted, a refracted ray cone is generated by (1) calculating the direction of the intermediate ray and the directions of the upper and lower edges of the refracted ray cone in a 2D coordinate system; and (2) given such calculations, further calculating the width and spread angle of the refracted ray cone. The 3D refracted ray cone is then traced through the scene. Furthermore, isotropic texture filtering can be performed before generating the refracted ray cone, and anisotropic texture filtering can be performed using the refracted ray cone and any subsequent ray cones to determine the color of pixels in the rendered image.

[0128] The disclosed technique has at least one technical advantage over existing techniques in that it implements refracted ray cones, which can be used to render more realistic images of virtual scenes that include objects made of media that refract light. Furthermore, the disclosed technique uses ray cone tracing, which has a lower computational cost than many ray tracing techniques that can be used to trace refracted rays, such as differential ray tracing. These technical advantages represent one or more technical improvements over existing methods.

[0129] 1. In some embodiments, a computer-implemented method for rendering one or more graphical images includes: tracing a ray cone through a three-dimensional (3D) graphical scene; generating a refracted ray cone based on the ray cone and a two-dimensional (2D) coordinate system; and rendering the graphical image based on the refracted ray cone.

[0130] 2. The computer-implemented method as described in Clause 1, wherein generating the refracted ray cone comprises: calculating a first hit point and the direction of an intermediate ray associated with the refracted ray cone based on the direction of an intermediate ray associated with the ray cone; calculating a second hit point and the direction of a first side of the refracted ray cone in the 2D coordinate system based on the direction of a first side of the ray cone and a first rotation normal vector; and calculating a third hit point and the direction of a second side of the refracted ray cone in the 2D coordinate system based on the direction of a second side of the ray cone and a second rotation normal vector.

[0131] 3. The computer-implemented method as described in Clause 1 or 2, further comprising: calculating the first rotation normal vector and the second rotation normal vector based on the curvature of the object hit by the ray cone in the 3D graphics scene.

[0132] 4. The computer-implemented method as described in any one of Clauses 1-3, further comprising: calculating the width associated with the refracted ray cone based on the first hit point, a first ray associated with the direction of the first side of the refracted ray cone, and a second ray associated with the direction of the second side of the refracted ray cone.

[0133] 5. The computer-implemented method as described in any one of Clauses 1-4, further comprising: calculating an angle associated with the refracted ray cone based on the direction of a first side of the refracted ray cone and the direction of a second side of the refracted ray cone.

[0134] 6. The computer-implemented method as described in any one of Clauses 1-5, wherein rendering the graphic image includes performing one or more isotropic texture filtering operations based on the refracted ray cone.

[0135] 7. A computer-implemented method as described in any one of Clauses 1-6, wherein rendering the graphic image comprises: performing one or more isotropic texture filtering operations based on the ray cone; and performing one or more anisotropic texture filtering operations based on the refracted ray cone.

[0136] 8. The computer-implemented method as described in any one of Clauses 1-7, further comprising: determining that refraction occurs at the point of impact where the ray cone intersects with an object in the 3D graphics scene.

[0137] 9. A computer-implemented method as described in any one of Clauses 1-8, wherein the graphical image is rendered in association with a video game, film, or architectural or design application.

[0138] 10. In some embodiments, one or more non-transitory computer-readable media storing program instructions that, when executed by at least one processor, cause the at least one processor to perform steps including: tracing a ray cone through a three-dimensional (3D) graphics scene; generating a refracted ray cone based on the ray cone and a two-dimensional (2D) coordinate system; and rendering a graphics image based on the refracted ray cone.

[0139] 11. One or more non-transitory computer-readable media as described in Clause 10, wherein generating the refracted ray cone comprises: calculating a first hit point and the direction of an intermediate ray associated with the refracted ray cone based on the direction of an intermediate ray associated with the ray cone; calculating a second hit point and the direction of a first side of the refracted ray cone in the 2D coordinate system based on the direction of a first side of the ray cone and a first rotation normal vector; and calculating a third hit point and the direction of a second side of the refracted ray cone in the 2D coordinate system based on the direction of a second side of the ray cone and a second rotation normal vector.

[0140] 12. One or more non-transitory computer-readable media as described in Clause 10 or 11, the steps further comprising: calculating the first rotation normal vector and the second rotation normal vector based on the curvature of the object hit by the ray cone in the 3D graphics scene.

[0141] 13. One or more non-transitory computer-readable media as described in any one of Clauses 10-12, the steps further comprising: calculating the width associated with the refracted ray cone based on the first hit point, a first ray associated with the direction of the first side of the refracted ray cone, and a second ray associated with the direction of the second side of the refracted ray cone.

[0142] 14. One or more non-transitory computer-readable media as described in any one of Clauses 10-13, the steps further comprising: calculating an angle associated with the refracted ray cone based on the direction of a first side of the refracted ray cone and the direction of a second side of the refracted ray cone.

[0143] 15. One or more non-transitory computer-readable media as described in any one of Clauses 10-14, the steps further comprising: determining that at least one of an intermediate ray associated with the ray cone, a third ray associated with the direction of a first side of the ray cone, or a fourth ray associated with the direction of a second side of the ray cone is totally internally reflected.

[0144] 16. One or more non-transitory computer-readable media as described in any one of Clauses 10-15, the steps further comprising: calculating the 2D coordinate system based on an intermediate ray associated with the ray cone.

[0145] 17. One or more non-transitory computer-readable media as described in any one of Clauses 10-16, wherein rendering the graphic image includes performing one or more texture filtering operations based on the ray cone and the refracted ray cone.

[0146] 18. One or more non-transitory computer-readable media as described in any one of Clauses 10-17, wherein the one or more texture filtering operations comprise: one or more isotropic texture filtering operations performed before generating the refracted ray cone, and one or more anisotropic texture filtering operations performed after generating the refracted ray cone.

[0147] 19. In some embodiments, a system includes: one or more memories storing instructions; and one or more processors coupled to the one or more memories and configured, when executing the instructions, to: trace ray cones through a three-dimensional (3D) graphics scene; generate refracted ray cones based on the ray cones and a two-dimensional (2D) coordinate system; and render a graphics image based on the refracted ray cones.

[0148] 20. The system as described in Clause 19, wherein the one or more memories and the one or more processors are included in one or more computing systems that provide at least one virtualized environment or cloud computing environment.

[0149] Any and all combinations of any of the claim elements recited in any of the claims and/or any elements described in this application, in any fashion, fall within the scope of this disclosure and protection.

[0150] The descriptions of the various embodiments have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments.

[0151] Various aspects of this embodiment can be implemented as a system, method, or computer program product. Therefore, aspects of this disclosure can take the form of a completely hardware embodiment, a completely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects, which are generally referred to herein as “modules” or “systems.” Furthermore, aspects of this disclosure can take the form of a computer program product contained in one or more computer-readable media having computer-readable program code contained thereon.

[0152] Any combination of one or more computer-readable media may be used. A computer-readable medium can be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium can be, for example, but not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination thereof. More specific examples (not an exhaustive list) of computer-readable storage media will include the following: an electrical connection having one or more wires, a portable computer floppy disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable optical disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In the context of this document, a computer-readable storage medium can be any tangible medium that can contain or store programs for use by or associated with an instruction execution system, apparatus, or device.

[0153] Aspects of this disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this disclosure. It should be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine. When executed by the processor of the computer or other programmable data processing apparatus, the instructions enable the implementation of the functions/actions specified in the flowchart and/or block diagram block or blocks. Such processors can be, but are not limited to, general-purpose processors, special-purpose processors, or field-programmable gate arrays.

[0154] The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code, comprising one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions labeled in the blocks may not appear in the order indicated in the figures. For example, depending on the functions involved, two blocks shown consecutively may actually be executed substantially simultaneously, or sometimes these blocks may be executed in reverse order. It will also be noted that each block shown in the block diagrams and / or flowcharts, and combinations of blocks shown in the block diagrams and / or flowcharts, may be implemented by a system based on dedicated hardware or a combination of dedicated hardware and computer instructions that performs the specified function or action.

[0155] While the foregoing describes embodiments of this disclosure, other and further embodiments of this disclosure may be devised without departing from its essential scope, the scope of which is determined by the following claims.

Claims

1. A computer-implemented method for rendering one or more graphic images, the method comprising: tracing a ray cone through a three-dimensional (3D) graphics scene; calculating a two-dimensional (2D) coordinate system based on a normal to a surface at a first hit point where the ray cone intersects the surface in the 3D graphics scene, and by projecting a direction vector associated with the ray cone onto a tangent plane at the first hit point; generating a refracted ray cone based on the ray cone and the 2D coordinate system; and rendering a graphic image based on the refracted ray cone.

2. The computer-implemented method of claim 1, wherein generating the refracted ray cone comprises: calculating the first hit point and a direction of an intermediate ray associated with the refracted ray cone based on a direction of an intermediate ray associated with the ray cone; calculating, in the 2D coordinate system, a second hit point and a direction of a first side of the refracted ray cone based on a direction of a first side of the ray cone and a first rotation normal vector; and calculating, in the 2D coordinate system, a third hit point and a direction of a second side of the refracted ray cone based on a direction of a second side of the ray cone and a second rotation normal vector.

3. The computer-implemented method of claim 2, further comprising: calculating the first rotation normal vector and the second rotation normal vector based on a curvature of the surface.

4. The computer-implemented method of claim 2, further comprising: calculating a width associated with the refracted ray cone based on the first hit point, a first ray associated with the direction of the first side of the refracted ray cone, and a second ray associated with the direction of the second side of the refracted ray cone.

5. The computer-implemented method of claim 2, further comprising: calculating an angle associated with the refracted ray cone based on the direction of the first side of the refracted ray cone and the direction of the second side of the refracted ray cone.

6. The computer-implemented method of claim 1, wherein rendering the graphic image includes performing one or more isotropic texture filtering operations based on the refracted ray cone.

7. The computer-implemented method of claim 1, wherein rendering the graphic image comprises: performing one or more isotropic texture filtering operations based on the ray cone; and performing one or more anisotropic texture filtering operations based on the refracted ray cone.

8. The computer-implemented method of claim 1, further comprising: determining that refraction occurs at the first hit point.

9. The computer-implemented method of claim 1, wherein the graphic image is rendered in association with a video game, movie, or architectural or design application.

10. One or more non-transitory computer-readable media storing program instructions that, when executed by at least one processor, cause the at least one processor to perform the following steps: tracing a ray cone through a three-dimensional (3D) graphics scene; calculating a two-dimensional (2D) coordinate system based on a normal to a surface at a first hit point where the ray cone intersects the surface in the 3D graphics scene, and by projecting a direction vector associated with the ray cone onto a tangent plane at the first hit point; generating a refracted ray cone based on the ray cone and the 2D coordinate system; and rendering a graphic image based on the refracted ray cone.

11. The one or more non-transitory computer-readable media of claim 10, wherein generating the refracted ray cone comprises: calculating the first hit point and a direction of an intermediate ray associated with the refracted ray cone based on a direction of an intermediate ray associated with the ray cone; calculating, in the 2D coordinate system, a second hit point and a direction of a first side of the refracted ray cone based on a direction of a first side of the ray cone and a first rotation normal vector; and calculating, in the 2D coordinate system, a third hit point and a direction of a second side of the refracted ray cone based on a direction of a second side of the ray cone and a second rotation normal vector.

12. The one or more non-transitory computer-readable media of claim 11, wherein the steps further comprise: calculating the first rotation normal vector and the second rotation normal vector based on a curvature of the surface.

13. The one or more non-transitory computer-readable media of claim 11, wherein the steps further comprise: calculating a width associated with the refracted ray cone based on the first hit point, a first ray associated with the direction of the first side of the refracted ray cone, and a second ray associated with the direction of the second side of the refracted ray cone.

14. The one or more non-transitory computer-readable media of claim 11, wherein the steps further comprise: calculating an angle associated with the refracted ray cone based on the direction of the first side of the refracted ray cone and the direction of the second side of the refracted ray cone.

15. The one or more non-transitory computer-readable media of claim 11, wherein the steps further comprise: determining that at least one of the intermediate ray associated with the ray cone, a third ray associated with the direction of the first side of the ray cone, or a fourth ray associated with the direction of the second side of the ray cone is totally internally reflected.

16. One or more non-transitory computer-readable media as claimed in claim 10, wherein rendering the graphic image includes performing one or more texture filtering operations based on the ray cone and the refracted ray cone.

17. The one or more non-transitory computer-readable media of claim 16, wherein the one or more texture filtering operations comprise: one or more isotropic texture filtering operations performed before generating the refracted ray cone, and one or more anisotropic texture filtering operations performed after generating the refracted ray cone.

18. A system comprising: one or more memories that store instructions; and one or more processors coupled to the one or more memories and configured to: trace a ray cone through a three-dimensional (3D) graphics scene; calculate a two-dimensional (2D) coordinate system based on a normal to a surface at a first hit point where the ray cone intersects the surface in the 3D graphics scene, and by projecting a direction vector associated with the ray cone onto a tangent plane at the first hit point; generate a refracted ray cone based on the ray cone and the 2D coordinate system; and render a graphic image based on the refracted ray cone.

19. The system of claim 18, wherein the one or more memories and the one or more processors are included in one or more computing systems, the computing systems providing at least one of a virtualized environment or a cloud computing environment.