In an advanced application,
OpenGL’s order of operation and the pipeline nature of the system may be
important. Examples of such applications are those with multiple
contexts and multiple threads, or those sharing data between OpenGL and
other APIs such as OpenCL. In some cases, it may be necessary to
determine whether commands sent to OpenGL have finished yet and whether
the results of those commands are ready. OpenGL includes two commands to
force it to start working on commands or to finish working on commands
that have been issued so far. These are
glFlush();
and
glFinish();
There are subtle differences between the two. The first, glFlush,
ensures that any commands issued so far are at least placed into the
start of the OpenGL pipeline and that they will eventually be executed. glFinish,
on the other hand, ensures that all commands issued have been
fully executed and that the OpenGL pipeline is empty. The problem is that glFlush
doesn’t tell you anything about the execution status of the commands
issued, only that they will eventually be executed. And while glFinish does ensure that all of your OpenGL commands have been processed, it empties the OpenGL pipeline, causing a bubble and reducing performance, sometimes drastically.
Sometimes it may be necessary to know whether OpenGL has finished executing commands up to some point.
This is especially useful when you are sharing data between two
contexts or between OpenGL and OpenCL, for example. This type of
synchronization is managed by what are known as sync objects.
Like any other OpenGL object, they must be created before they are used
and destroyed when they are no longer needed. Sync objects have two
possible states: signaled and unsignaled.
They start out in the unsignaled state, and when some particular event
occurs, they move to the signaled state. The event that triggers their
transition from unsignaled to signaled depends on their type. The type
of sync object we are interested in is called a fence sync, and one can
be created by calling
GLsync sync = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
The first parameter is a token specifying the event we’re going to wait for. In this case, GL_SYNC_GPU_COMMANDS_COMPLETE
says that we want the GPU to have processed all commands in the
pipeline before setting the state of the sync object to signaled. The
second parameter is a flags field and is zero here because no flags are
relevant for this type of sync object. The glFenceSync function returns a new GLsync
object. As soon as the fence sync is created, it enters (in the
unsignaled state) the OpenGL pipeline and is processed along with all
the other commands without stalling OpenGL or consuming significant
resources. When it reaches the end of the pipeline, it is “executed”
like any other command, and this sets its state to signaled. Because of
the in-order nature of OpenGL, this tells us that any OpenGL commands
issued before the call to glFenceSync have completed, even though commands issued after the glFenceSync may not have reached the end of the pipeline yet.
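The in-order argument above can be made concrete with a toy model. The sketch below is not OpenGL code: it assumes a small fixed-size FIFO standing in for the pipeline, and the names toy_pipeline, toy_submit, and toy_execute are hypothetical, introduced only for illustration. It shows a fence becoming signaled once everything submitted before it has retired, while a later command is still pending.

```c
#include <stddef.h>

/* Toy model only: a fixed-size FIFO standing in for the OpenGL pipeline. */
enum { CMD_WORK, CMD_FENCE };

#define TOY_MAX_CMDS 16

typedef struct {
    int kind[TOY_MAX_CMDS];      /* CMD_WORK or CMD_FENCE, in submission order */
    int signaled[TOY_MAX_CMDS];  /* meaningful for CMD_FENCE entries only */
    size_t count;                /* commands submitted so far */
    size_t retired;              /* commands the pretend GPU has finished */
} toy_pipeline;

/* Append a command and return its index, or -1 if the queue is full. */
int toy_submit(toy_pipeline *p, int kind)
{
    if (p->count >= TOY_MAX_CMDS)
        return -1;
    p->kind[p->count] = kind;
    p->signaled[p->count] = 0;   /* fences enter the queue unsignaled */
    return (int)p->count++;
}

/* Retire up to n commands strictly in order; a fence signals when retired. */
void toy_execute(toy_pipeline *p, size_t n)
{
    while (n-- > 0 && p->retired < p->count) {
        if (p->kind[p->retired] == CMD_FENCE)
            p->signaled[p->retired] = 1;
        p->retired++;
    }
}
```

Submitting two work commands, a fence, and one more work command, then retiring three commands, leaves the fence signaled with the last command still in the queue, which mirrors the situation described above.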
Once the sync object has been
created (and has therefore entered the OpenGL pipeline), we can query
its state to find out if it’s reached the end of the pipeline yet, and
we can ask OpenGL to wait for it to become signaled before returning to
the application.
To determine whether the sync object has become signaled yet, call
glGetSynciv(sync, GL_SYNC_STATUS, sizeof(GLint), NULL, &result);
When glGetSynciv returns, result (which is a GLint) will contain GL_SIGNALED if the sync object was in the signaled state and GL_UNSIGNALED
otherwise. This allows the application to poll the state of the sync
object and use this information to potentially do some useful work while
the GPU is busy with previous commands. For example, consider the code
in Listing 1.
Listing 1. Working While Waiting for a Sync Object
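GLint result = GL_UNSIGNALED;
// Query the fence once before entering the loop.
glGetSynciv(sync, GL_SYNC_STATUS, sizeof(GLint), NULL, &result);
while (result != GL_SIGNALED) {
    // The GPU is still busy, so use the time on the CPU.
    DoSomeUsefulWork();
    // Poll the fence again.
    glGetSynciv(sync, GL_SYNC_STATUS, sizeof(GLint), NULL, &result);
}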
This
code loops, doing a small amount of useful work on each iteration until
the sync object becomes signaled. If the application were to create a
sync object at the start of each frame, the application could wait for
the sync object from two frames ago and do a variable amount of work
depending on how long it takes the GPU to process the commands for that
frame. This allows an application to balance the amount of work done by
the CPU (such as the number of sound effects to mix together or the
number of iterations of a physics simulation to run, for example) with
the speed of the GPU.
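One possible way to keep track of "the sync object from two frames ago" is a two-slot ring. The sketch below is an assumption about how an application might organize this, not something prescribed by OpenGL. The names fence_handle, fence_ring, and fence_ring_swap are hypothetical, and fence_handle is a plain void pointer standing in for GLsync so that the bookkeeping can be compiled without a GL context.

```c
#include <stddef.h>

typedef void *fence_handle;   /* stands in for GLsync, itself an opaque pointer */

typedef struct {
    fence_handle slots[2];    /* fences for the two most recent frames */
    unsigned long frame;      /* number of frames begun so far */
} fence_ring;

/* Store this frame's fence and return the fence created two frames ago.
 * Returns NULL during the first two frames, when no such fence exists. */
fence_handle fence_ring_swap(fence_ring *ring, fence_handle new_fence)
{
    size_t slot = ring->frame % 2;         /* frames n and n-2 share a slot */
    fence_handle old = ring->slots[slot];
    ring->slots[slot] = new_fence;
    ring->frame++;
    return old;
}
```

In a real frame loop, the new fence passed in would be the value returned by glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0). The returned handle, if not NULL, is the one to poll with glGetSynciv and then release with glDeleteSync, because each fence is used only once.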
To actually cause OpenGL to
wait for a sync object to become signaled (and therefore, for the
commands in the pipeline before the sync to complete), there are two
functions that you can use:
glClientWaitSync(sync, GL_SYNC_FLUSH_COMMANDS_BIT, timeout);
or
glWaitSync(sync, 0, GL_TIMEOUT_IGNORED);
The first parameter to both functions is the name of the sync object that was returned by glFenceSync. The second and third parameters to the two functions have the same names but must be set differently.
For glClientWaitSync, the second parameter is a bitfield specifying additional behavior of the function. The GL_SYNC_FLUSH_COMMANDS_BIT tells glClientWaitSync
to ensure that the sync object has entered the OpenGL pipeline before
beginning to wait for it to become signaled. Without this bit, there is a
possibility that OpenGL could watch for a sync object that hasn’t been
sent down the pipeline yet, and the application could end up waiting
forever and hang. It’s a good idea to set this bit unless you have a
really good reason not to. The third parameter is a timeout value in
nanoseconds to wait. If the sync object doesn’t become signaled within
this time, glClientWaitSync returns a status code to indicate so. glClientWaitSync won’t return until either the sync object becomes signaled or a timeout occurs.
There are four possible status codes that might be returned by glClientWaitSync. They are summarized in Table 2.
Table 2. Possible Return Values for glClientWaitSync
Status Returned by glClientWaitSync | Meaning |
---|---|
GL_ALREADY_SIGNALED | The sync object was already signaled when glClientWaitSync was called and so the function returned immediately. |
GL_TIMEOUT_EXPIRED | The timeout specified in the timeout parameter expired, meaning that the sync object never became signaled in the allowed time. |
GL_CONDITION_SATISFIED | The sync object became signaled within the allowed timeout period (but was not already signaled when glClientWaitSync was called). |
GL_WAIT_FAILED | An error occurred (such as sync not being a valid sync object), and the user should check the result of glGetError() to get more information. |
There are a couple of things
to note about the timeout value. First, while the unit of measurement is
nanoseconds, there is no accuracy requirement in OpenGL. If you specify
that you want to wait for one nanosecond, OpenGL could round this up to
the next millisecond or more. Second, if you specify a timeout value of
zero, glClientWaitSync will return GL_ALREADY_SIGNALED if the sync object was in a signaled state at the time of the call and GL_TIMEOUT_EXPIRED otherwise. It will never return GL_CONDITION_SATISFIED.
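Because two of the four status codes mean "signaled" and the timeout is expressed in nanoseconds, an application may find small helpers convenient. The following is a minimal sketch under stated assumptions: the helper names wait_result_is_signaled and timeout_ns_from_ms are hypothetical, and the enum values are repeated from the standard OpenGL headers only so that the fragment builds without them.

```c
#include <stdint.h>

/* Values as in the OpenGL headers; defined here only if those are absent. */
#ifndef GL_ALREADY_SIGNALED
#define GL_ALREADY_SIGNALED    0x911A
#define GL_TIMEOUT_EXPIRED     0x911B
#define GL_CONDITION_SATISFIED 0x911C
#define GL_WAIT_FAILED         0x911D
#endif

/* Collapse a glClientWaitSync status into: 1 = signaled,
 * 0 = still pending (timeout expired), -1 = error. */
int wait_result_is_signaled(unsigned int status)
{
    switch (status) {
    case GL_ALREADY_SIGNALED:
    case GL_CONDITION_SATISFIED:
        return 1;
    case GL_TIMEOUT_EXPIRED:
        return 0;
    default:                     /* GL_WAIT_FAILED or an unexpected value */
        return -1;
    }
}

/* Convert a timeout in milliseconds to the nanoseconds the API expects. */
uint64_t timeout_ns_from_ms(uint64_t ms)
{
    return ms * UINT64_C(1000000);
}
```

A caller might pass timeout_ns_from_ms(16) as the third argument to glClientWaitSync and feed the returned status to wait_result_is_signaled. When the result is -1, glGetError() is the place to look for details, as Table 2 notes.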
For glWaitSync,
the behavior is slightly different. The application won’t actually wait
for the sync object to become signaled, only the GPU will. Therefore, glWaitSync
will return to the application immediately. This makes the second and
third parameters somewhat irrelevant. Because the application doesn’t
wait for the function to return, there is no danger of hanging, and so
the GL_SYNC_FLUSH_COMMANDS_BIT is not
needed and would actually cause an error if specified. Also, the timeout
will actually be implementation dependent and so the special timeout
value GL_TIMEOUT_IGNORED is specified
to make this clear. If you’re interested, you can find out what the
timeout value used by your implementation is by calling glGetInteger64v with the GL_MAX_SERVER_WAIT_TIMEOUT parameter.
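Since the two wait functions accept different values for their second and third parameters, a debug build could check the arguments before making the call. This is only a sketch of such a check. The function names server_wait_args_ok and client_wait_flags_ok are hypothetical, and the two constants are repeated from the standard OpenGL headers so that the fragment builds without them.

```c
#include <stdint.h>

/* Values as in the OpenGL headers; defined here only if those are absent. */
#ifndef GL_SYNC_FLUSH_COMMANDS_BIT
#define GL_SYNC_FLUSH_COMMANDS_BIT 0x00000001
#endif
#ifndef GL_TIMEOUT_IGNORED
#define GL_TIMEOUT_IGNORED 0xFFFFFFFFFFFFFFFFull
#endif

/* glWaitSync: flags must be zero and timeout must be GL_TIMEOUT_IGNORED. */
int server_wait_args_ok(unsigned int flags, uint64_t timeout)
{
    return flags == 0 && timeout == GL_TIMEOUT_IGNORED;
}

/* glClientWaitSync: the only flag bit allowed is GL_SYNC_FLUSH_COMMANDS_BIT;
 * any timeout value is acceptable. */
int client_wait_flags_ok(unsigned int flags)
{
    return (flags & ~(unsigned int)GL_SYNC_FLUSH_COMMANDS_BIT) == 0;
}
```

Passing GL_SYNC_FLUSH_COMMANDS_BIT to server_wait_args_ok yields 0, matching the point above that the bit would cause an error if given to glWaitSync.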
You might be wondering, “What
is the point of asking the GPU to wait for a sync object to reach the
end of the pipeline?” After all, the sync object will become signaled
when it reaches the end of the pipeline, and so if you wait for it to
reach the end of the pipeline, it will of course be signaled. Therefore,
won’t glWaitSync just do nothing? This
would be true if we only considered simple applications that only use a
single OpenGL context and that don’t use other APIs. However, the power
of sync objects is harnessed when using multiple OpenGL contexts. Sync
objects can be shared between OpenGL contexts and between compatible
APIs such as OpenCL. That is, a sync object created by a call to glFenceSync on one context can be waited for by a call to glWaitSync (or glClientWaitSync) on another context.
Consider this. You can ask one
OpenGL context to hold off rendering something until another context has
finished doing something. This allows synchronization between two
contexts. You can have an application with two threads and two contexts
(or more, if you want). If you create a sync object in each context, and
then in each context you wait for the sync objects from the other
contexts using glClientWaitSync,
you know that when all of the functions have returned, all of those
contexts are synchronized with each other. Together with thread
synchronization primitives provided by your OS (such as semaphores), you
can keep rendering to multiple windows in sync.
An example of this type of
usage is when a buffer is shared between two contexts. The first context
is writing to the buffer using transform feedback, while the second
context wants to draw the results of the transform feedback. The first
context would draw using transform feedback mode. After calling glEndTransformFeedback, it immediately calls glFenceSync. Now, the application makes the second context current and calls glWaitSync
to wait for the sync object to become signaled. It can then issue more
commands to OpenGL (on the new context), and those are queued up by the
drivers, ready to execute. Only when the GPU has finished recording data
into the transform feedback buffers with the first context does it
start to work on the commands using that data in the second context.
There are also extensions and other functionality in APIs like OpenCL that allow asynchronous writes to buffers. You can use glWaitSync
to ask a GPU to wait until the data in a buffer is valid by creating a
sync object on the context that generates the data and then waiting for
that sync object to become signaled on the context that’s going to
consume the data.
Sync objects only ever go from
the unsignaled to the signaled state. There is no mechanism to put a
sync object back into the unsignaled state, even manually. This is
because a manual flip of a sync object can cause race conditions and
possibly hang the application. Consider the situation where a sync
object is created, reaches the end of the pipeline and becomes signaled,
and then the application sets it back to unsignaled. If another thread
tried to wait for that sync object but didn’t start waiting until after
the application had already set the sync object back to the unsignaled
state, it would wait forever. Each sync object therefore represents a
one-shot event, and every time a synchronization is required, a new sync
object must be created by calling glFenceSync.
Although it is always important to clean up after yourself by deleting
objects when you’re done with them, this is particularly important with
sync objects because you might be creating many new ones every frame. To
delete a sync object, call
glDeleteSync(sync);
This
deletes the sync object, although not necessarily immediately. Any thread
that is waiting for the sync object to become signaled will still wait
until it is signaled or that thread’s timeout expires, and the object will actually be deleted
once nobody’s watching it any more. Thus, it is perfectly legal to call glWaitSync followed by glDeleteSync even though the sync object is still in the OpenGL pipeline.