Advanced Shader Memory Usage

Introduction

In addition to simple uniform variables and other data passed from one pipeline stage to the next, there are several other types of buffers that can be used by shader programs. The options differ in several ways including whether the buffer types are opaque and whether they are writable. The table below lists the types of buffers in the order in which they were incorporated into OpenGL.

Type	GLSL data type (g = blank, i, u, …)	Opaque?	If opaque, GLSL built-in functions used	Allowed access	First appeared
Image Texture	g`sampler`*	Y	`texture`*, `textureSize`, `texelFetch`	R	2.0
Texture Buffer Object (TBO)	g`samplerBuffer`	Y	`textureSize`, `texelFetch`	R	3.1
Image Data	g`image`*	Y	`imageLoad`, `imageStore`, `imageSize`, `imageAtomic`*	RW	4.2
Shader Storage Buffer Object (SSB)	`buffer`	N		RW	4.3

We have already studied and used sampler-based image data buffers for ordinary texture mapping. The others are all much more recent additions to OpenGL, and we will take an introductory look at those here.

Texture Buffer Objects (TBOs)

As we saw earlier in the course, ordinary image textures are accessed in shaders using the built-in texture* functions. Their data is sent to the GPU using glTexImage* and stored in a manner specific to their use as image-based textures.

TBOs are essentially a special type of 1D texture and can be used to store a 1D array of data. Among the differences from ordinary image textures is that the storage for the data is associated with a buffer object (created, as usual, using glGenBuffers). In fact, this is where the name "Texture Buffer Object" comes from: it is a (1D) texture associated with a buffer. As a result, they can be much larger than ordinary image textures. The maximum size is implementation-dependent (query GL_MAX_TEXTURE_BUFFER_SIZE, which is measured in texels) and on some hardware may be as large as a few gigabytes or more.

As can be seen in the table above, TBOs are declared in your shader program as a gsamplerBuffer (instead of a gsampler*), and they are accessed using texelFetch (instead of texture*). Unlike texture*, texelFetch does not do any filtering. In fact, texelFetch is given a single integer index, 0 ≤ index < textureSize(theSamplerBuffer), and hence the TBO is treated simply as a (large) one-dimensional read-only array.

Example: The image on the right was generated using TBOs to hold multiple attributes (some measured; some simulated) from Hurricane Isabel. The OpenGL code draws a single rectangle twice. The first time, it uses a shader program consisting of just a vertex and fragment shader that accesses a selected scalar field from the data set to determine color. Here the temperature scalar field is used, and colors vary from dark green (cold) to bright green (warm). There is no data over land, hence the white area depicting the eastern United States is simply left in the background color.

The second time the rectangle is drawn, we use a shader program consisting of all four shader types (vertex → tessellation → geometry → fragment) that accesses TBOs holding wind velocity vectors. The tessellation shader samples the vector field and outputs points with associated PVAs describing the position and wind velocity vector at each point. The geometry shader then actually creates the lines representing the vectors. Finally, the fragment shader colors the vector lines based on some attribute. In the image on the right, the vectors are colored by speed (i.e., the length of the velocity vector).

Importance of GPU buffers and their interaction with the shader programs: We require dynamic interpolation and resampling of the scalar and vector fields. This is done differently for the scalar and vector fields, but both require the use of GPU buffers and shader programs to be effective. Specifically:

For the scalar field, this interpolation and resampling happens in the fragment shader. Depending on the current zoom level, we may be viewing more or less detail from the scalar field. For each pixel, we determine the grid cell that contains it, and we compute its relative location within the cell. That allows us to calculate a temperature for the pixel from a weighted sum of the temperatures of the four surrounding cell vertices.
For the vector field, the tessellation control shader establishes a grid at the currently required resolution (specified in a uniform variable). Each tessellation evaluation shader instance then resamples the vector field at the location specified by its gl_TessCoord, in a manner analogous to that described for the scalar field in the fragment shader. It outputs a point that carries the interpolated vector, along with other PVAs, to the geometry shader, which turns each such point into a line segment representing the direction and length of the vector.
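
The weighted sum described for the scalar field is ordinary bilinear interpolation. The following is a minimal CPU-side C++ sketch of that arithmetic, not the shader code that produced the image. ScalarGrid, fetch, and sampleBilinear are hypothetical names, and the row-major layout of the flat array is an assumption. In a fragment shader, fetch would correspond to a texelFetch call on the TBO.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for a TBO holding one scalar per grid vertex.
// Assumed layout: row-major, values[row * nCols + col].
struct ScalarGrid {
    int nCols;
    int nRows;
    std::vector<float> values;

    // Plays the role of texelFetch: one integer index, no filtering.
    float fetch(int col, int row) const {
        return values[static_cast<std::size_t>(row) * nCols + col];
    }
};

// Bilinearly interpolate the field at fractional grid coordinates (x, y),
// with 0 <= x <= nCols - 1 and 0 <= y <= nRows - 1.
float sampleBilinear(const ScalarGrid& g, float x, float y) {
    assert(g.nCols >= 2 && g.nRows >= 2);

    // Lower-left vertex of the containing cell, clamped so that
    // col + 1 and row + 1 are always valid indices.
    int col = std::max(0, std::min(static_cast<int>(std::floor(x)), g.nCols - 2));
    int row = std::max(0, std::min(static_cast<int>(std::floor(y)), g.nRows - 2));

    // Relative location of the point within the cell, each in [0, 1].
    float fx = x - static_cast<float>(col);
    float fy = y - static_cast<float>(row);

    // Weighted sum of the four surrounding cell vertices.
    float bottom = (1.0f - fx) * g.fetch(col, row)     + fx * g.fetch(col + 1, row);
    float top    = (1.0f - fx) * g.fetch(col, row + 1) + fx * g.fetch(col + 1, row + 1);
    return (1.0f - fy) * bottom + fy * top;
}
```

For a 2×2 grid holding 0, 10, 20, 30, sampling at the cell center (0.5, 0.5) gives the average of the four vertex values, and sampling exactly at a vertex returns that vertex's value.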

This diagram illustrates the various buffers and variables used in the shaders that produced the image on the right.

Image Data

The primary new capability of this facility is that shaders can write to this type of object as well as read from it. That is, shader programs can actually modify the buffer data. Since shader programs run in a massively parallel fashion, race conditions are an issue whenever several invocations may update the same location. One way to handle them is the family of imageAtomic* routines in GLSL.

Shader Storage Buffer Objects (SSBs)

Shader Storage Buffer Objects remove the veil of "opaqueness". That is, these buffers can be accessed directly, without going through built-in GLSL functions. In addition, they can have named fields, just like a C/C++ struct or class. Like Image Data, they can be written as well as read. A different set of atomic functions, atomic*, can be used to avoid race conditions when writing SSBs.

Example: Several physical processes (CT scans, MRI, etc.) and/or simulation algorithms produce data sets characterized as a 3D array of numeric values. So-called voxel data sets can be imagined as in the diagram on the left. There is one numeric value in each subcube of this voxel grid. The value may be a scalar (e.g., temperature, absorbance, etc.), a vector (e.g., a velocity vector), or anything else.

The image on the right was generated using a ray tracing algorithm executing in the fragment shader that traces rays through the voxel grid. The data in this case correspond to the measured absorbance. That is, each subcube of the voxel grid has an absorbance in the range 0 ≤ a ≤ 255. Transfer functions are used to map values in the given range to renderable properties such as translucency and color.

The CPU OpenGL code simply draws the six faces of the cube. The fragment shader traces a ray from each pixel on a cube face through the cube. At each sample point along each ray, an interpolated voxel value is computed using an inverse distance weighting of the 6 subcube voxel data values of the subcube containing the current sample point. The interpolated value is used to get an alpha which is then accumulated while proceeding along the ray. The tracing of the ray stops when the alpha value gets sufficiently close to 1, or when the ray exits the back of the voxel grid, whichever comes first.
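
The accumulation and early-termination logic can be sketched on the CPU as follows. This is a minimal C++ illustration, not the shader used for the image. It assumes the common front-to-back compositing rule, in which each sample contributes its alpha scaled by the transparency still remaining; the source does not say which accumulation rule is used. RayResult, accumulateAlpha, and the cutoff parameter are hypothetical names. The list of per-sample alphas stands in for the values a transfer function would produce along one ray.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of marching along one ray (hypothetical helper type).
struct RayResult {
    float alpha;             // accumulated opacity in [0, 1]
    std::size_t samplesUsed; // number of sample points actually visited
};

// Accumulate opacity front to back along a ray.
// sampleAlphas: per-sample alpha values, ordered from ray entry to ray exit.
// cutoff: stop as soon as the accumulated alpha reaches this value
//         ("sufficiently close to 1").
RayResult accumulateAlpha(const std::vector<float>& sampleAlphas, float cutoff) {
    assert(cutoff > 0.0f && cutoff <= 1.0f);

    RayResult r{0.0f, 0};
    for (float a : sampleAlphas) {
        // Assumed front-to-back rule: only the remaining transparency
        // (1 - alpha so far) can still be covered by this sample.
        r.alpha += (1.0f - r.alpha) * a;
        ++r.samplesUsed;
        if (r.alpha >= cutoff) {
            break; // nearly opaque: later samples cannot contribute visibly
        }
    }
    // If the loop ran to completion, the ray exited the back of the grid.
    return r;
}
```

With four samples of alpha 0.5 and a cutoff of 0.8, the accumulated alpha goes 0.5, 0.75, 0.875, so the ray stops after the third sample and the fourth is never evaluated.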

The voxel data, numBytes bytes in all, was read into a CPU-side array called attrArray. We create a Shader Storage Buffer and send the data to the GPU as follows:

glGenBuffers(1, voxelGrid);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, voxelGrid[0]);

// EITHER:
glBufferStorage(GL_SHADER_STORAGE_BUFFER, numBytes, attrArray, 0); // Immutable; Requires at least OpenGL 4.4
// OR:
glBufferData(GL_SHADER_STORAGE_BUFFER, numBytes, attrArray, GL_STATIC_DRAW); // What we have been using to date; STATIC_DRAW = written once by the CPU, read by the GPU

int bindingPointIndex = 0; // layout (std430, binding = 0) buffer VoxelGrid
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, bindingPointIndex, voxelGrid[0]);

Since you as the GLSL programmer can directly access the buffer data as opposed to being forced to use GLSL built-in functions, you must specify something about the structure of the data buffer. The GLSL declaration of the voxel data buffer shown above being sent to the GPU is:

layout (std430, binding = 0)
buffer VoxelGrid
{
    int d[];
} voxelGrid;

Then in the body of the shader, the data is accessed as:

int index = …;
float oneDataVal = voxelGrid.d[index];
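
The computation of index is left unspecified above. Purely as an illustration, one common way to flatten a 3D grid into a 1D array is shown in this C++ sketch. The x-fastest ordering, the function name voxelIndex, and the dimension parameters nx, ny, nz are all assumptions and are not taken from the course code. The same integer arithmetic could be written in GLSL.

```cpp
#include <cassert>

// Hypothetical mapping from 3D voxel coordinates to a flat array index.
// Assumed ordering: x varies fastest, then y, then z.
int voxelIndex(int x, int y, int z, int nx, int ny, int nz) {
    assert(x >= 0 && x < nx);
    assert(y >= 0 && y < ny);
    assert(z >= 0 && z < nz);
    return (z * ny + y) * nx + x;
}
```

Under this assumed ordering, a 4×3×2 grid has 24 entries, the voxel at (0, 0, 0) maps to index 0, and the voxel at (3, 2, 1) maps to the last index, 23.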

Importance of the GPU buffer and its interaction with the fragment shader: It should be clear that each fragment shader invocation requires access to the complete voxel grid so that it can step along the ray that starts at the pixel on a cube face and proceeds through the voxel cube.