Rendering Volume Filling Triangles in OpenGL (with no buffers)

This is the promised follow-up to Rendering a Screen Covering Triangle in OpenGL (with no buffers), except this time the goal is to write a shader that accesses every location in a 3D texture (volume). We use the same screen covering trick as before to draw a triangle that covers a viewport matched to the X and Y dimensions of the volume, and we use instanced rendering to draw one such triangle for each layer in the Z-dimension.

The vertex shader looks the same as before with the addition of the instanceID.

flat out int instanceID;

void main()
{
	float x = -1.0 + float((gl_VertexID & 1) << 2);
	float y = -1.0 + float((gl_VertexID & 2) << 1);
	instanceID  = gl_InstanceID;
	gl_Position = vec4(x, y, 0, 1);
}

The fragment shader can then recover the voxel coordinates from gl_FragCoord and the instanceID.

flat in int instanceID;

void main()
{
	ivec3 voxelCoord = ivec3(gl_FragCoord.xy, instanceID);
	voxelOperation(voxelCoord);
}
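To convince yourself that this addressing touches every voxel exactly once, the same arithmetic can be mirrored on the CPU. The following is only a sketch in plain C, not part of the shader code: the names IVec3, voxel_coord and visits_each_voxel_once are made up for illustration, and it assumes the default OpenGL convention that gl_FragCoord.xy sits at pixel centres (column + 0.5, row + 0.5).

```c
#include <assert.h>
#include <stdlib.h>

/* Integer voxel coordinate, the CPU analogue of ivec3 (hypothetical helper type). */
typedef struct { int x, y, z; } IVec3;

/* gl_FragCoord.xy is assumed to sit at the pixel centre (col + 0.5, row + 0.5);
   converting to int truncates toward zero, which recovers column and row. */
static IVec3 voxel_coord(float frag_x, float frag_y, int instance_id)
{
    IVec3 v = { (int)frag_x, (int)frag_y, instance_id };
    return v;
}

/* Walk every fragment of every instance and count visits per voxel.
   Returns 1 if each voxel of a width x height x depth volume is hit exactly once. */
static int visits_each_voxel_once(int width, int height, int depth)
{
    assert(width > 0 && height > 0 && depth > 0);
    size_t total = (size_t)width * (size_t)height * (size_t)depth;
    int *visits = calloc(total, sizeof *visits);
    if (!visits)
        return 0;

    for (int layer = 0; layer < depth; ++layer)      /* one instance per Z slice */
        for (int row = 0; row < height; ++row)
            for (int col = 0; col < width; ++col) {
                IVec3 v = voxel_coord(col + 0.5f, row + 0.5f, layer);
                visits[((size_t)v.z * height + v.y) * width + v.x] += 1;
            }

    int ok = 1;
    for (size_t i = 0; i < total; ++i)
        if (visits[i] != 1)
            ok = 0;
    free(visits);
    return ok;
}
```

Calling voxel_coord(2.5f, 7.5f, 3) yields the voxel (2, 7, 3), which is what the fragment shader computes for the pixel in column 2, row 7 of the fourth instance.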

Much like drawing the single screen covering triangle, we set our viewport to the XY-dimensions of the volume, bind a junk VAO to keep certain graphics drivers happy, and call glDrawArraysInstanced with the Z-dimension of the volume as the instance count, so that we draw one triangle per slice of the volume.

glViewport(0, 0, width, height);
glBindVertexArray(junkVAO);
glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 3, depth);

The result looks something like the following:

[Figure: VolumeFillingTriangles]

This can be useful for quickly processing a volume. Initially, I used this as an OpenGL 4.2 fallback (instead of compute shaders) so that I could still use the NSight debugger, until I realized this approach was actually outperforming the compute shader. Of course, when to use compute shaders and how to use them effectively deserve a post of their own.

Rendering a Screen Covering Triangle in OpenGL (with no buffers)

This one has been on the backlog for ages now.  Anyway, this is an OpenGL adaptation of a clever trick that’s been around for quite a while and was described in DirectX terms by Cort Stratton (@postgoodism) in “An interesting vertex shader trick” on #AltDevBlogADay.

It describes a method for rendering a triangle that covers the screen with no buffer inputs.  All vertex and texture coordinate information is generated solely from the vertexID.  Unfortunately, because OpenGL uses a right-handed coordinate system while DirectX uses a left-handed one, the same vertexID transformation used for DirectX won’t work in OpenGL.  Basically, we need to reverse the order of the triangle vertices so that they are traversed counter-clockwise, as opposed to clockwise in the original implementation. So, after a bit of experimentation, I came up with the following adaptation for OpenGL:

void main()
{
	float x = -1.0 + float((gl_VertexID & 1) << 2);
	float y = -1.0 + float((gl_VertexID & 2) << 1);
	gl_Position = vec4(x, y, 0, 1);
}

This transforms the gl_VertexID as follows:

gl_VertexID=0 -> (-1,-1)
gl_VertexID=1 -> ( 3,-1)
gl_VertexID=2 -> (-1, 3)
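The table can be double-checked on the CPU. Below is a minimal sketch in plain C that repeats the shader's bit arithmetic and also confirms the winding order; Vec2, triangle_vertex and signed_area2 are hypothetical helper names used only for this illustration.

```c
#include <assert.h>

/* Clip-space position of one vertex (hypothetical helper type). */
typedef struct { float x, y; } Vec2;

/* CPU mirror of the vertex shader arithmetic; vertex_id plays the role of gl_VertexID. */
static Vec2 triangle_vertex(int vertex_id)
{
    assert(vertex_id >= 0 && vertex_id < 3);
    Vec2 p;
    p.x = -1.0f + (float)((vertex_id & 1) << 2); /* bit 0 set -> x moves from -1 to 3 */
    p.y = -1.0f + (float)((vertex_id & 2) << 1); /* bit 1 set -> y moves from -1 to 3 */
    return p;
}

/* Twice the signed area of triangle (a, b, c); a positive value means the
   vertices are ordered counter-clockwise. */
static float signed_area2(Vec2 a, Vec2 b, Vec2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}
```

Feeding vertices 0, 1 and 2 into signed_area2 gives a positive value, so the triangle is traversed counter-clockwise as intended.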

We can easily add texture coordinates to this as well:

out vec2 texCoord;

void main()
{
	float x = -1.0 + float((gl_VertexID & 1) << 2);
	float y = -1.0 + float((gl_VertexID & 2) << 1);
	texCoord.x = (x+1.0)*0.5;
	texCoord.y = (y+1.0)*0.5;
	gl_Position = vec4(x, y, 0, 1);
}

This gives positions varying from -1 to 1 and texture coordinates varying from 0 to 1 across the visible region of homogeneous clip space, exactly as OpenGL would expect, all without needing to create any buffers. All you have to do is make a single call to glDrawArrays and tell it to render 3 vertices:

glDrawArrays( GL_TRIANGLES, 0, 3 );

This draws a triangle that looks like the following:

[Figure: glScreenSpaceTriangle]
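Why a single oversized triangle is enough can be checked with a few lines of arithmetic. The sketch below, in plain C, describes the triangle (-1,-1), (3,-1), (-1,3) as the region x >= -1, y >= -1, x + y <= 2 and asserts that the four corners of the visible square fall inside it with texture coordinates 0 and 1. The function names are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Texture coordinate for a clip-space coordinate, as in the shader: (v + 1) * 0.5. */
static float clip_to_tex(float v)
{
    return (v + 1.0f) * 0.5f;
}

/* True if clip-space point (x, y) lies inside the triangle (-1,-1), (3,-1), (-1,3),
   i.e. the region x >= -1, y >= -1, x + y <= 2 (edges included). */
static bool covered_by_triangle(float x, float y)
{
    return x >= -1.0f && y >= -1.0f && x + y <= 2.0f;
}

/* Assert that the four corners of the visible clip-space square are covered and
   that the coordinates -1 and 1 map to texture coordinates 0 and 1. */
static void check_screen_coverage(void)
{
    const float corners[2] = { -1.0f, 1.0f };
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            assert(covered_by_triangle(corners[i], corners[j]));
            assert(clip_to_tex(corners[i]) == (float)i);
        }
}
```

The corner (1, 1) lies exactly on the hypotenuse, and the parts of the triangle outside the square (where the texture coordinates run up to 2) are clipped away.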

It’s surprising how often this comes in handy. In a later post I’ll describe how to adapt this trick to efficiently access the elements of a 3D texture.  It also amuses me greatly that Iñigo Quilez’s amazing demo/presentation “Rendering Worlds With Two Triangles” could actually be renamed “Rendering Worlds With One Triangle.”

Readings on physically-based rendering

Another nice collection of links and papers too valuable to lose among all my bookmarks.  This time on physically-based rendering, put together by Kostas Anagnostou (@thinkinggamer).

List maintained and updated over at his blog Interplay of Light:

http://interplayoflight.wordpress.com/2013/12/30/readings-on-physically-based-rendering/

Layered Reflective Shadow Maps for Voxel-based Indirect Illumination

So, a lot has happened. I completed my Doctorate, almost moved to Norway, but then ended up moving to Canada instead (Victoria, BC). I now work for the Advanced Technology Group at Intel, where I was fortunate enough to have the opportunity to assist a new colleague of mine, Masamichi Sugihara (@masasugihara), with his publication “Layered Reflective Shadow Maps for Voxel-based Indirect Illumination,” which has been accepted to HPG 2014.


Check out the preprint here

We introduce a novel voxel-based algorithm that interactively simulates both diffuse and glossy single-bounce indirect illumination. Our algorithm generates high quality images similar to the reference solution while using only a fraction of the memory of previous methods. The key idea in our work is to decouple occlusion data, stored in voxels, from lighting and geometric data, encoded in a new per-light data structure called layered reflective shadow maps (LRSMs). We use voxel cone tracing for visibility determination and integrate outgoing radiance by performing lookups in a pre-filtered LRSM. Finally we demonstrate that our simple data structures are easy to implement and can be rebuilt every frame to support both dynamic lights and scenes.

Hire Masamichi!

Due to some rather shortsighted reorganization, Masamichi is currently pursuing employment opportunities that will allow him either to stay in Canada or to return to Japan. If you are interested in hiring a top-notch graphics coder, please get in touch.

Realtime Global Illumination techniques collection

Martin Thomas (@0martint) has put together a very nice collection of links and papers for realtime global illumination techniques.

Check it out over at his blog:
http://extremeistan.wordpress.com/2014/05/11/realtime-global-illumination-techniques-collection/

Bindless textures can “store”

I don’t know how I missed this when Nvidia released NV_bindless_texture; I guess it was because all the samples I saw used bindless textures to demonstrate a ridiculous number of texture reads. But I realized when reading the recently released ARB_bindless_texture extension that they can also be used to “store,” or write, to a very large number of textures (using ARB_shader_image_load_store functionality). This finally gets rid of that extremely pesky MAX_IMAGE_UNITS limitation I’ve been complaining about. The only downside is that I can no longer run my program at home on my GTX 480.

AtomicCounters & IndirectBufferCommands

I’ve made use of Atomic Counters and Indirect Buffers in the past, but always in the most straightforward manner: create a dedicated buffer for the atomic counter and another for the Indirect Command Buffer, increment the counter in a shader, then write the Atomic Counter value into the Indirect Command Buffer using the Image API. That ends up with a shader that looks something like the one below.

#version 420

layout(location = 0) in ivec3 inputBuffer;

layout(r32ui, binding = 0) uniform uimageBuffer outputBuffer;
layout(r32ui, binding = 1) uniform uimageBuffer indirectArrayCommand;
layout(       binding = 0) uniform atomic_uint  counter; // named "counter" so it does not hide the built-in atomicCounter()

void main()
{
	// ...
	// do some stuff
	// ...

	if(someCondition == true)
	{
		//increment counter (returns the value held before the increment)
		int index = int(atomicCounterIncrement(counter));

		//store stuff in output buffer
		imageStore(outputBuffer, index, uvec4(someStuff));
	}

	memoryBarrier();

	//Store the atomic counter value to the count (the first element) of the DrawArraysIndirect command
	imageStore(indirectArrayCommand, 0, uvec4(atomicCounter(counter)));
}

This works fine, but one annoying thing about this approach is that it consumes an extra image unit (of the max 8 available). Fortunately, it turns out that it is unnecessary to create an extra atomic counter and perform the synchronization with the indirect draw command. It is possible to simply bind the appropriate element of the indirect draw buffer directly to the atomic counter.

// This binds the count element of the Indirect Array Command Buffer directly as an atomic counter in the shader
// (no need for copy from dedicated atomic counter)
glBindBufferRange(GL_ATOMIC_COUNTER_BUFFER,        // Target buffer is the atomic counter
                  0,                               // Binding point, must match the shader
                  IndirectArrayCommandBuffer_id,   // Source buffer is the Indirect Draw Command Buffer
                  0,                               // Byte offset: 0 for count, sizeof(GLuint) for primCount (instances), etc...
                  sizeof(GLuint));

This allows us to get rid of the indirect buffer’s image unit binding, which simplifies the shader as shown below. The main reason I’ve found to do this is to reduce the number of image units required by the shader, as it’s very easy to hit the limit of 8.

#version 420

layout(location = 0) in ivec3 inputBuffer;

layout(r32ui, binding = 0) uniform uimageBuffer outputBuffer;
layout(       binding = 0) uniform atomic_uint  counter; // backed directly by the count field of the indirect draw command

void main()
{
	// ...
	// do some stuff
	// ...

	if(someCondition == true)
	{
		//increment counter (returns the value held before the increment)
		int index = int(atomicCounterIncrement(counter));

		//store stuff in output buffer
		imageStore(outputBuffer, index, uvec4(someStuff));
	}
	}
}
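
For reference, the byte offsets passed to glBindBufferRange follow from the layout of the indirect draw command. The sketch below, in plain C, spells that layout out using the field names from the OpenGL 4.2 specification. It uses uint32_t in place of GLuint so it compiles without GL headers, and CommandField, counter_offset and append are hypothetical names introduced only for this illustration. The append function is a single-threaded CPU analogue of the shader above, not real atomic code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of the command read by glDrawArraysIndirect (OpenGL 4.2 field names).
   uint32_t stands in for GLuint so the sketch compiles without GL headers. */
typedef struct {
    uint32_t count;         /* number of vertices to draw           */
    uint32_t primCount;     /* number of instances                  */
    uint32_t first;         /* index of the first vertex            */
    uint32_t baseInstance;  /* first instance (reserved before 4.2) */
} DrawArraysIndirectCommand;

/* Fields that an atomic counter could be bound over (hypothetical enum). */
typedef enum { FIELD_COUNT, FIELD_PRIM_COUNT } CommandField;

/* Byte offset to pass as the offset argument of glBindBufferRange. */
static size_t counter_offset(CommandField field)
{
    assert(field == FIELD_COUNT || field == FIELD_PRIM_COUNT);
    return field == FIELD_COUNT ? offsetof(DrawArraysIndirectCommand, count)
                                : offsetof(DrawArraysIndirectCommand, primCount);
}

/* Single-threaded CPU analogue of the shader's append: the counter lives in the
   command's count field, so after all appends the command already holds the
   number of elements written and no copy is needed. */
static uint32_t append(DrawArraysIndirectCommand *cmd, uint32_t *out, uint32_t value)
{
    uint32_t index = cmd->count++;  /* like atomicCounterIncrement: yields the old value */
    out[index] = value;
    return index;
}
```

With this layout, binding the counter over count uses offset 0 and binding it over primCount uses offset sizeof(GLuint), which is 4 bytes.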