Using OptiX
OptiX version 2.0 was released recently, so I gave it a go by plugging it into an existing multi-core path tracer. This path tracer can submit tens of thousands of ray queries as a batch, so it should be a good match for OptiX and the GPU.
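Roughly, the host side of that integration looks like the sketch below. This is against the classic OptiX 2.x C++ wrapper (optixpp); `RayQuery`, `HitResult`, `trace_batch` and the buffer variable names are placeholders of mine, not the actual code from the path tracer, and the scene/programs are assumed to be set up elsewhere.

```cuda
// Host-side sketch (OptiX 2.x "optixpp" wrapper). RayQuery/HitResult and the
// names below are illustrative placeholders; the context is assumed to already
// have its geometry, acceleration structure and ray generation program set up.
#include <optixu/optixpp_namespace.h>
#include <cstring>

struct RayQuery  { float origin[3]; float tmin; float dir[3]; float tmax; };  // must match the device-side layout
struct HitResult { float t; int primId; };

void trace_batch(optix::Context ctx, const RayQuery* queries,
                 HitResult* results, unsigned count)
{
    // In a real integration these buffers would be created once and reused.
    optix::Buffer rays = ctx->createBuffer(RT_BUFFER_INPUT, RT_FORMAT_USER, count);
    rays->setElementSize(sizeof(RayQuery));
    ctx["ray_buffer"]->setBuffer(rays);

    optix::Buffer hits = ctx->createBuffer(RT_BUFFER_OUTPUT, RT_FORMAT_USER, count);
    hits->setElementSize(sizeof(HitResult));
    ctx["hit_buffer"]->setBuffer(hits);

    // map()/unmap() and launch() all block on the calling thread, which is
    // exactly the synchronization issue described further down.
    std::memcpy(rays->map(), queries, count * sizeof(RayQuery));
    rays->unmap();

    ctx->launch(/*entry point*/ 0, count);   // runs the ray generation program once per query

    std::memcpy(results, hits->map(), count * sizeof(HitResult));
    hits->unmap();
}
```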
I liked:
- Ease of use. Wow, this thing makes GPU ray tracing easy: I wrote a few tiny CUDA functions (see the sketch after this list), the runtime reported useful errors for my bugs, I fixed them, and it worked as expected!
- Net performance win. It improved the performance of the path tracer, but not by much (see below).
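For a sense of how small those CUDA functions are, here is roughly what the ray generation program looks like, using the classic OptiX device API. The structs, payload and buffer names are my own placeholders matching the host sketch above; the actual closest-hit and miss programs that fill in the payload are omitted.

```cuda
// Device-side sketch of an OptiX 2.x ray generation program.
// RayQuery/HitResult/PerRayData are illustrative placeholders.
#include <optix_world.h>

struct RayQuery   { float3 origin; float tmin; float3 dir; float tmax; };
struct HitResult  { float t; int primId; };
struct PerRayData { float t; int primId; };

rtDeclareVariable(rtObject, top_object, , );
rtDeclareVariable(unsigned int, launch_index, rtLaunchIndex, );
rtBuffer<RayQuery, 1>  ray_buffer;
rtBuffer<HitResult, 1> hit_buffer;

RT_PROGRAM void raygen()
{
    const RayQuery q = ray_buffer[launch_index];
    optix::Ray ray(q.origin, q.dir, /*ray type*/ 0, q.tmin, q.tmax);

    PerRayData prd;
    prd.t = -1.0f;      // "miss" sentinel; closest-hit/miss programs overwrite it
    prd.primId = -1;

    rtTrace(top_object, ray, prd);

    hit_buffer[launch_index].t = prd.t;
    hit_buffer[launch_index].primId = prd.primId;
}
```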
I disliked:
- Everything is synchronous. All OptiX calls seem to block until completion, so I couldn't find a way to pipeline memory transfers with GPU work within a single OptiX context (see the sketch after this list for the kind of overlap I wanted). Since my use case involved heavy interop between CPU and GPU, this was a big performance loss.
- No CUDA interop. There seems to be no support for using CUDA allocations in OptiX kernels, so in particular you can't use page-locked host memory to get rid of all those redundant (blocking) copies.
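For comparison, this is the kind of overlap plain CUDA gives you with streams and page-locked memory, and that I couldn't get through the OptiX API. `trace_kernel` and `trace_batches` are stand-ins of mine for whatever would do the actual ray queries; the rest is the standard double-buffered copy/compute overlap pattern (it needs a GPU with a copy engine to actually overlap).

```cuda
#include <cuda_runtime.h>

// Stand-in kernel: in reality this would be the ray-query work OptiX runs.
__global__ void trace_kernel(const float4* rays, float4* hits, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)
        hits[i] = rays[i];   // placeholder
}

void trace_batches(int batch, int numBatches)
{
    const size_t bytes = batch * sizeof(float4);
    float4 *h_rays[2], *h_hits[2], *d_rays[2], *d_hits[2];
    cudaStream_t stream[2];

    for (int s = 0; s < 2; ++s)
    {
        // Page-locked host memory is what makes cudaMemcpyAsync truly asynchronous.
        cudaHostAlloc(&h_rays[s], bytes, cudaHostAllocDefault);
        cudaHostAlloc(&h_hits[s], bytes, cudaHostAllocDefault);
        cudaMalloc(&d_rays[s], bytes);
        cudaMalloc(&d_hits[s], bytes);
        cudaStreamCreate(&stream[s]);
    }

    for (int b = 0; b < numBatches; ++b)
    {
        int s = b & 1;                      // ping-pong between the two streams
        cudaStreamSynchronize(stream[s]);   // wait for this slot's previous batch
        // ... fill h_rays[s] with the next batch of ray queries on the CPU ...
        cudaMemcpyAsync(d_rays[s], h_rays[s], bytes, cudaMemcpyHostToDevice, stream[s]);
        trace_kernel<<<(batch + 255) / 256, 256, 0, stream[s]>>>(d_rays[s], d_hits[s], batch);
        cudaMemcpyAsync(h_hits[s], d_hits[s], bytes, cudaMemcpyDeviceToHost, stream[s]);
        // Copies and kernel for batch b overlap with batch b-1 in the other stream.
    }
    for (int s = 0; s < 2; ++s)
        cudaStreamSynchronize(stream[s]);

    for (int s = 0; s < 2; ++s)
    {
        cudaStreamDestroy(stream[s]);
        cudaFree(d_rays[s]);     cudaFree(d_hits[s]);
        cudaFreeHost(h_rays[s]); cudaFreeHost(h_hits[s]);
    }
}
```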
In conclusion, I have mixed feelings about OptiX. I think it's a great tool for hobby projects or small demos, but I'd need async calls and much improved CUDA interop before using it for anything larger.