When using stream out with a geometry shader, it is common to want to take the results of one stream out pass and bind them as input to another pass. However, the geometry shader can emit a variable number of primitives, so without help the programmer must read the number of primitives written back from the GPU, causing a stall. To mitigate this problem, DrawAuto was introduced in Direct3D 10 and later added to OpenGL as glDrawTransformFeedback. DrawAuto keeps everything on the GPU: once the stream out buffer is bound to an input slot, the draw call is issued with the appropriate number of primitives filled in automatically.
Draw indirect takes this one step further by allowing a buffer to be used as the arguments to a draw call. For example, consider the following compute shader.
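RWBuffer<uint> args : register(u0);

[numthreads(8, 8, 1)] // thread group size is a placeholder
void CS(uint3 id : SV_DispatchThreadID)
{
    /* Perform some computation here */

    // Only the first thread writes the draw arguments.
    if (id.x == 0 && id.y == 0 && id.z == 0)
    {
        args[0] = 1000; // vertex count per instance
        args[1] = 1;    // instance count
        args[2] = 0;    // start vertex location
        args[3] = 0;    // start instance location
    }
}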
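This compute shader writes out 4 unsigned integers to a buffer from thread 0. These values represent the arguments for our draw call, in the same order DrawInstanced takes them: vertex count per instance, instance count, start vertex location, and start instance location.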
And on the CPU side of things:
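A minimal sketch of what the CPU side might look like, assuming a Direct3D 11 device and immediate context already exist. The names `DrawInstancedArgs`, `CreateArgsBuffer`, and `DrawFromArgs` are made up for illustration; the pieces that matter are the `D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS` flag on the buffer and the `DrawInstancedIndirect` call. Creating the unordered access view and dispatching the compute shader are left out, and the Direct3D part is guarded so the layout struct still compiles on other platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative CPU-side mirror of the four uints written by the compute shader.
// The field order matches the DrawInstanced parameter order.
struct DrawInstancedArgs {
    std::uint32_t vertexCountPerInstance; // args[0]
    std::uint32_t instanceCount;          // args[1]
    std::uint32_t startVertexLocation;    // args[2]
    std::uint32_t startInstanceLocation;  // args[3]
};
static_assert(sizeof(DrawInstancedArgs) == 16, "four tightly packed 32-bit values");

#ifdef _WIN32
#include <d3d11.h>

// Creates a GPU buffer that a compute shader can write through a UAV and
// that an indirect draw call can later read its arguments from.
HRESULT CreateArgsBuffer(ID3D11Device* device, ID3D11Buffer** outBuffer)
{
    assert(device != nullptr && outBuffer != nullptr);

    D3D11_BUFFER_DESC desc = {};
    desc.ByteWidth = sizeof(DrawInstancedArgs);
    desc.Usage     = D3D11_USAGE_DEFAULT;
    desc.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
    desc.MiscFlags = D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS; // marks the buffer as draw arguments
    return device->CreateBuffer(&desc, nullptr, outBuffer);
}

// Call after the compute shader that fills the buffer has been dispatched.
void DrawFromArgs(ID3D11DeviceContext* context, ID3D11Buffer* argsBuffer)
{
    // The second parameter is the byte offset of the arguments within the buffer.
    context->DrawInstancedIndirect(argsBuffer, 0);
}
#endif
```

The CPU never reads the buffer contents here; it only hands the buffer to the draw call, which is what avoids the readback stall.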
The result is the same as if DrawInstanced(1000, 1, 0, 0) had been called. How cool is that?
There are a few draw indirect calls: