Tuesday, March 23, 2010

Floating Point Depthbuffers in OSG

I submitted a new demo to the Open Scene Graph project that shows how to set up a floating point depth buffer. The standard way to do this is to create a Frame Buffer Object and attach a depth texture with a floating point format. That won't work if you're doing multisampling antialiasing; it that case, you attach a "render buffer" that has the appropriate number of samples to the FBO. Open Scene Graph takes care of many of the details for you.

The demo shows how to set up an FBO with a floating point depth buffer, as well as the effects of "inverting" the depth test. Setting 1.0 to correspond to the near plane and 0.0 to the far plane turns out to have better behavior than the more usual way. Also, the demo allows the user to cycle between all the available multisampling modes, as well as between floating point and integer depth buffers, of course. If you can't wait for this to show up in the OSG source tree, get the patch yourself.

Monday, March 01, 2010

Vertex Cache Optimization for OSG

I implemented Tom Forsyth's Vertex Cache Optimization algorithm in Open Scene Graph (OSG). The quick summary is that indexed lists of triangles in a mesh are more efficient than traditional representations like triangle strips. GPUs keep a cache of the post-transform values for the most recently used vertices in a mesh. This can avoid a whole lot of work, such as rerunning vertex shaders. Tristrips don't make good use of this cache because they introduce a new vertex for every triangle and only reuse the last two vertices. On the other hand, a random mesh order isn't good either; the triangles in a mesh need to be in a good order to take proper advantage of the cache. Tom describes an efficient algorithm for optimizing a mesh for the vertex cache. It produces an order that is "oblivious" to the actual cache size i.e., it will work well on a range of hardware.


The algorithm does work as advertised. I tried it on the Happy Buddha model, which is a monster-size mesh with over a million triangles. On my machine the optimization reduces the draw time for this mesh by 38 percent, which is enormous. This is a special case because the mesh is so large and regular, and Your Milage May Vary, but it does seem like a worthwhile optimization for large meshes. I've submitted my code to the OSG project.