As we have already stated, Core OpenGL, or CGL
as it is usually called, is the lowest-level interface to OpenGL on
Mac OS X. Here we cover just a few quick but useful
recipes for using CGL in our Cocoa-based application. There may be Cocoa
equivalents to some of these, but the CGL version will also work with
your GLUT-based OpenGL programs or even a higher level third-party C++
framework you might choose to use. You can also use CGL exclusively to
create a full-screen context and render to it as needed, but as we’ve
just shown, this is no longer necessary with Snow Leopard.
All CGL functions we are interested in require the
current CGL context as one of the parameters. In any OpenGL application,
you can retrieve the current CGL context by calling CGLGetCurrentContext.
CGLContextObj CGLGetCurrentContext(void);
Sync Frame Rate
In our previous example program, SphereWorldFS, the
event loop ran at full speed, rendering as many frames per second as
possible. This is useful when performance testing your
rendering or processing code, as the frame rate is a simple
metric of just how fast your code can execute. In a shipping
application, there are two drawbacks to this, however. First, in
addition to excessive use of the GPU, you are also taking up all the
cycles on one of your CPU cores (at least!). If you consider that a
typical display refreshes 60 times per second, there is no real need
or purpose in displaying more than 60 frames per second. That excess GPU
power could be used to generate more sophisticated rendering effects,
or the CPU power could be used to improve other application processing
performance or perhaps add more detail or features to the application or
game.
Second, because the display only refreshes so many
times per second, rendering more frames per second than the display can
show causes tearing. Tearing occurs
when a buffer swap happens at any point other than the vertical retrace
of the screen. Essentially, you get two different frames displayed
on-screen at the same time. The old frame occupies the area of the
display above the current display refresh position, and the bottom of
the screen is then filled with the new buffer contents. This is
especially jarring when the view is moving horizontally in the scene. Figure 1 shows a typical tearing example, where the display briefly shows two different frames.
In a double-buffered application, such as our
previous full-screen example, the swap interval sets the number of
vertical retraces that should occur before the buffer swap occurs.
Setting this value to one forces no more than one frame per vertical
retrace, while setting it to two allows two vertical retraces between
buffer swaps. For example, if the swap interval were set to one and the
display refresh rate were 60 Hz (typical), you would get no more than
60 fps. With a swap interval of two, you'd get a maximum of 30 fps, and
so on. You set the swap interval with the CGL function CGLSetParameter.
GLint sync = 1;
CGLSetParameter(CGLGetCurrentContext(), kCGLCPSwapInterval, &sync);
Note that this does not “fix” the frame rate to equal the
refresh rate of the monitor. If your rendering, or your CPU code for that
matter, takes an excessive amount of time, you may get less than the full
refresh rate of your monitor. What you still gain, however, is that the
buffer swaps only occur between refreshes, thus eliminating the tearing
issue.
Increasing Fill Performance
Fill performance refers to the portion of rendering
time spent writing pixel data to the frame buffer. One easy way to
improve fill performance is
to simply render to a smaller window, or in the case of the full-screen
application such as a
game, to change the screen resolution to a smaller value. Before Snow
Leopard, it was not uncommon for a full-screen OpenGL game, for example,
to change the screen resolution before running, capture the display,
and so on. Now that we no longer need the display capturing solution, we
can make use of CGL’s ability to change the size of the back buffer
instead of changing the screen resolution. Making the back buffer
smaller than the front buffer yields the same fill performance benefit
without the need for a display mode change. The contents of the back
buffer are then automatically stretched to fill the entire display when
the buffer swap occurs.
To set the back surface size, we set the CGL parameter kCGLCPSurfaceBackingSize to the integer dimensions that we want. In addition, we must enable the kCGLCESurfaceBackingSize feature with CGLEnable. The following code shows how you would do this for a desired new size of newWidth x newHeight.
GLint dim[2] = {newWidth, newHeight};
CGLSetParameter(CGLGetCurrentContext(), kCGLCPSurfaceBackingSize, dim);
CGLEnable(CGLGetCurrentContext(), kCGLCESurfaceBackingSize);
Multithreaded OpenGL
The OpenGL driver does a significant amount of
processing of your rendering data before it eventually shows up on the
hardware for rendering. On OS X 10.5 or later, you can enable a
multithreaded OpenGL core that offloads some of these tasks to another
thread. On a multicore system, this can have a positive performance
impact. You can enable this feature by calling CGLEnable on the kCGLCEMPEngine flag.
CGLEnable(CGLGetCurrentContext(), kCGLCEMPEngine);
This does not always improve performance,
and in fact it can sometimes reduce performance! If your OpenGL code is
not hampered by CPU processing, for example, enabling the multithreaded
engine may have little to no effect on your rendering performance.
Likewise, if your rendering code makes many calls that produce pipeline
stalls (glGetFloatv, glGetIntegerv, glReadPixels, etc.), these stalls can
also interfere with this potential optimization.