Simon’s Graphics Blog

This is really just a teaser post for my next update. I’ve not done much on traversal yet (hence the world of spheres), but I’ve made some progress on shading. Here’s a screenshot of a pure CUDA renderer left running for 20 seconds or so to get a nice smooth result:

CUDA Path Tracing


I’ve been doing a bit more GPU programming recently, so here are some things I found when writing CUDA programs. This all refers to the CUDA compiler in the recent 3.2RC, and is based on my experience with GTX 275 hardware. In particular, this advice may need to be tweaked for Fermi architecture GPUs, since I have yet to experiment with one.

Using Optix

Optix version 2.0 was released recently, so I gave it a go by plugging it into an existing multi-core path tracer. This path tracer can submit tens of thousands of ray queries as a batch, so it should be a good match for Optix and the GPU.

I needed a random number generator for a CUDA project, and had relatively few requirements:

  • It must have a small shared memory footprint
  • It must be suitable for Monte Carlo methods (i.e. have long period and minimal correlation)
  • It must allow warps to execute independently when generating random numbers
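To make these requirements concrete, here is a minimal sketch of one generator that would fit them. It is not necessarily the generator the post settles on. It uses Marsaglia's xorshift128, with four words of state per thread that need no shared memory and no cross-thread synchronisation. It is written as plain C++ so it can be run on the host, but the arithmetic is integer-only and should port to a CUDA device function. The names `Xorshift128`, `make_stream`, `next_u32` and `next_float`, and the hash-based seeding scheme, are assumptions made for illustration. Seeding streams this way does not guarantee that they are uncorrelated.

```cpp
#include <cassert>
#include <cstdint>

// Per-thread generator state: four 32-bit words, intended to live in registers.
struct Xorshift128 {
    std::uint32_t x, y, z, w;
};

// A commonly used 32-bit integer hash (attributed to Thomas Wang). Every step is
// invertible, so distinct inputs always give distinct outputs.
inline std::uint32_t wang_hash(std::uint32_t v) {
    v = (v ^ 61u) ^ (v >> 16);
    v *= 9u;
    v ^= v >> 4;
    v *= 0x27d4eb2du;
    v ^= v >> 15;
    return v;
}

// Hypothetical seeding scheme: hash four consecutive integers derived from the
// thread id. The four words are distinct, so the state is never all zero.
inline Xorshift128 make_stream(std::uint32_t thread_id) {
    const std::uint32_t base = thread_id * 4u;
    return { wang_hash(base), wang_hash(base + 1u),
             wang_hash(base + 2u), wang_hash(base + 3u) };
}

// One step of Marsaglia's xorshift128 (period 2^128 - 1 for non-zero state).
inline std::uint32_t next_u32(Xorshift128& s) {
    assert((s.x | s.y | s.z | s.w) != 0u);  // all-zero state would stay zero forever
    const std::uint32_t t = s.x ^ (s.x << 11);
    s.x = s.y;
    s.y = s.z;
    s.z = s.w;
    s.w = s.w ^ (s.w >> 19) ^ t ^ (t >> 8);
    return s.w;
}

// Uniform float in [0, 1), built from the top 24 bits so the result is exact.
inline float next_float(Xorshift128& s) {
    return static_cast<float>(next_u32(s) >> 8) * (1.0f / 16777216.0f);
}
```

Each thread would call `make_stream` once with its global index and then draw numbers with `next_float`. Because no state is shared, one warp never has to wait on another.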

I thought I’d have a go at implementing some path tracing in CUDA. Let’s start simple: a classical path tracer with explicit direct lighting. Lots of hacks:

  • No BVH yet, every ray tests the 30 triangles of the Cornell Box
  • Every surface is lambertian (so cosine weighted hemisphere sampling for spawning rays)
  • Hardcoded for a single area light (which the camera cannot see)
  • Uses copy-pasted Moller intersection test from CPU code
  • Random number generation got moved to a texture read (with the texture data updated CPU-side) to avoid absurd register counts
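The cosine weighted hemisphere sampling mentioned in the list can be sketched as follows. This is a standard construction (take a uniform point on the unit disc and project it up onto the hemisphere) and is not necessarily the exact code used in the renderer. It is plain C++ so it can be checked on the host. The names `Vec3`, `sample_cosine_hemisphere` and `cosine_hemisphere_pdf` are hypothetical, and the direction is returned in a local frame whose z axis is assumed to be the surface normal.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// Maps two uniform numbers u1, u2 in [0, 1) to a unit direction in the local
// frame (z = surface normal, an assumption of this sketch), distributed with
// density cos(theta) / pi over the hemisphere.
inline Vec3 sample_cosine_hemisphere(float u1, float u2) {
    assert(u1 >= 0.0f && u1 < 1.0f && u2 >= 0.0f && u2 < 1.0f);
    const float two_pi = 6.28318530717958647692f;
    const float r = std::sqrt(u1);   // radius of a uniform point on the unit disc
    const float phi = two_pi * u2;   // azimuth
    return { r * std::cos(phi), r * std::sin(phi), std::sqrt(1.0f - u1) };
}

// Density of the direction returned above, with respect to solid angle.
inline float cosine_hemisphere_pdf(const Vec3& d) {
    const float inv_pi = 0.31830988618379067154f;
    return d.z > 0.0f ? d.z * inv_pi : 0.0f;
}
```

For a lambertian surface the BRDF is albedo / pi, so the BRDF times the cosine term divided by this pdf reduces to the albedo. That cancellation is why this sampling scheme suits a renderer where every surface is lambertian.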

Here is a collection of papers/links on the topic of Metropolis Light Transport (MLT). The core principle of detailed balance that underpins the Metropolis-Hastings algorithm is extremely neat, and its application to light transport (in particular using Veach’s path integral formulation) is very aesthetically pleasing. This post doesn’t really go anywhere, just provides links for further reading.
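For reference, detailed balance can be written out as below. The notation is chosen here for illustration and is not taken from any particular paper: $f$ is the (unnormalised) target function, $T(x \to y)$ is the tentative transition density, and $a(x \to y)$ is the acceptance probability.

$$\begin{align*} f(x)\,T(x \to y)\,a(x \to y) &= f(y)\,T(y \to x)\,a(y \to x) \\ a(x \to y) &= \min\left(1, \frac{f(y)\,T(y \to x)}{f(x)\,T(x \to y)}\right) \end{align*}$$

Substituting the second line into the first, both sides reduce to $\min\bigl(f(x)\,T(x \to y),\, f(y)\,T(y \to x)\bigr)$, so the Metropolis-Hastings acceptance rule satisfies detailed balance by construction.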

At work I wrote a global illumination system from scratch. It used classical ray tracing for the direct lighting, and photon mapping with final gather for the indirect term. I use the past tense since we’ve now switched over to using lightcuts as the main renderer, which, due to the work of an awesome colleague, is giving us better results (and faster).

To complete the set, I thought I’d have a go at implementing a bidirectional path tracer, a full Veach, if you will…

The Cornell Box

This article presents an explanation of two techniques that can be used to perform DXT colour compression. They were designed during the development of an open source DXT compression library called squish.

This page has been updated from its 2004 original. Code for the function generator and rotations can now be found at

Spherical Harmonic Function Generator

The real-valued spherical harmonics can be defined as:

$$\begin{align*} Y^l_m(\phi, \theta) &= \Theta^l_{|m|}(\theta) \, \Phi_m(\phi) \\ \Theta^l_m(\theta) &= \sqrt{\frac{2l + 1}{4\pi}\frac{(l - m)!}{(l + m)!}}P^m_l(\cos\theta) \\ \Phi_m(\phi) &= \begin{cases} \sqrt{2}\cos{m\phi}, &m>0 \\ 1, &m=0 \\ \sqrt{2}\sin{|m|\phi}, &m<0 \end{cases} \end{align*}$$
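As a rough illustration of how this definition can be evaluated numerically, here is a small C++ sketch. It is not the generated code the page refers to. It takes the azimuthal factor $\Phi_m$ to depend on $\phi$, as its argument indicates, and it assumes that $P^m_l$ includes the Condon-Shortley phase $(-1)^m$. Without that phase, the terms with odd $m$ change sign. The function names `legendre`, `theta_term` and `sh` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Associated Legendre function P_l^m(x) for 0 <= m <= l and |x| <= 1, using the
// standard three-term recurrence. Includes the Condon-Shortley phase (assumption).
double legendre(int l, int m, double x) {
    assert(0 <= m && m <= l);
    double pmm = 1.0;  // becomes P_m^m = (-1)^m (2m-1)!! (1-x^2)^(m/2)
    if (m > 0) {
        const double s = std::sqrt((1.0 - x) * (1.0 + x));
        double odd = 1.0;
        for (int i = 1; i <= m; ++i) {
            pmm *= -odd * s;
            odd += 2.0;
        }
    }
    if (l == m) return pmm;
    double pmmp1 = x * (2.0 * m + 1.0) * pmm;  // P_{m+1}^m
    if (l == m + 1) return pmmp1;
    double pll = 0.0;
    for (int ll = m + 2; ll <= l; ++ll) {
        pll = (x * (2.0 * ll - 1.0) * pmmp1 - (ll + m - 1.0) * pmm) / (ll - m);
        pmm = pmmp1;
        pmmp1 = pll;
    }
    return pll;
}

// Theta^l_m(theta) from the definition above, for m >= 0.
double theta_term(int l, int m, double theta) {
    const double pi = 3.14159265358979323846;
    double ratio = 1.0;  // (l - m)! / (l + m)!
    for (int k = l - m + 1; k <= l + m; ++k) ratio /= k;
    return std::sqrt((2.0 * l + 1.0) / (4.0 * pi) * ratio)
         * legendre(l, m, std::cos(theta));
}

// Real spherical harmonic Y^l_m(phi, theta): phi is the azimuth, theta the polar angle.
double sh(int l, int m, double phi, double theta) {
    const int am = std::abs(m);
    const double t = theta_term(l, am, theta);
    if (m > 0) return t * std::sqrt(2.0) * std::cos(m * phi);
    if (m < 0) return t * std::sqrt(2.0) * std::sin(am * phi);
    return t;
}
```

For example, `sh(0, 0, phi, theta)` is the constant $1/(2\sqrt{\pi})$ for any angles, and `sh(1, 0, phi, theta)` is $\sqrt{3/(4\pi)}\cos\theta$.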