1. Lighting Optimizations
If your app has highly tessellated geometry
(lots of tiny triangles), then lighting can negatively impact performance.
The following sections discuss ways to optimize lighting. Use your time
wisely; don’t bother with these optimizations without first testing if
you’re fill rate bound. Applications that are fill
rate bound have their bottleneck in the rasterizer, and lighting
optimizations don’t help much in those cases—unless, of course, the app is
doing per-pixel lighting!
There’s a simple test you can use to determine
whether you’re fill rate bound. Try modifying the parameters to
glViewport to shrink your viewport to a small size. If
this causes a significant increase in your frame rate, you’re probably
fill rate bound.
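The decision at the end of that test can be written down as a tiny helper. This is only a sketch: FrameRateSample, LooksFillRateBound, and the 1.25 speedup threshold are hypothetical names and values chosen for illustration, and you would gather the two frame rates yourself by timing frames before and after shrinking the viewport.

```cpp
#include <cassert>

// Hypothetical helper for the viewport-shrinking test. Both frame rates are
// measured by the caller; nothing here talks to OpenGL.
struct FrameRateSample {
    double fullViewportFps;   // measured with the normal glViewport size
    double smallViewportFps;  // measured after shrinking the viewport
};

// Returns true if shrinking the viewport sped things up "significantly."
// The default threshold of 1.25 is an assumption; tune it for your app.
bool LooksFillRateBound(const FrameRateSample& sample, double minSpeedup = 1.25)
{
    assert(sample.fullViewportFps > 0.0 && sample.smallViewportFps > 0.0);
    return sample.smallViewportFps / sample.fullViewportFps >= minSpeedup;
}
```

A jump from 30 to 58 frames per second would count as fill rate bound under this threshold, while a change from 30 to 31 would not.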
If you discover that you’re neither fill rate
bound nor CPU bound, the first thing to consider is simplifying your
geometry. This means rendering a coarser scene using fewer, larger
triangles. If you find you can’t do this without a huge loss in visual
quality, consider one of the optimizations in the following
sections.
Note:
If you’re CPU bound, try turning off Thumb
mode; it enables a special ARM instruction set that can slow you down.
Turn it off in Xcode by switching to a Device configuration and going to
Project→Edit Project Settings→Build Tab→Show
All Settings, and uncheck the “Compile for Thumb” option under the
Code Generation heading.
1.1. Object-Space Lighting
To specify an infinite light source, set the
W component of your light position to
zero. But what are those “certain circumstances” that we mentioned?
Specifically, your model-view matrix cannot have non-uniform scale
(scale that’s not the same for all three axes).
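To see why uniform scale matters, here is a minimal sketch of moving an infinite light's direction from eye space into object space on the CPU. It assumes the upper-left 3x3 of the model-view matrix is a rotation times a uniform scale, so multiplying by the transpose and renormalizing is enough. The Vec3 and Mat3 types and the function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Upper-left 3x3 of the model-view matrix, stored row-major: m[row][col].
struct Mat3 { float m[3][3]; };

Vec3 Normalize(Vec3 v)
{
    float length = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(length > 0.0f);
    return { v.x / length, v.y / length, v.z / length };
}

// Multiplies the direction by the transpose of the matrix, then renormalizes.
// For rotation times uniform scale this matches the inverse rotation; with
// non-uniform scale the result would be wrong.
Vec3 LightDirectionToObjectSpace(const Mat3& modelView, Vec3 eyeSpaceDir)
{
    const Vec3 d = {
        modelView.m[0][0] * eyeSpaceDir.x + modelView.m[1][0] * eyeSpaceDir.y + modelView.m[2][0] * eyeSpaceDir.z,
        modelView.m[0][1] * eyeSpaceDir.x + modelView.m[1][1] * eyeSpaceDir.y + modelView.m[2][1] * eyeSpaceDir.z,
        modelView.m[0][2] * eyeSpaceDir.x + modelView.m[1][2] * eyeSpaceDir.y + modelView.m[2][2] * eyeSpaceDir.z
    };
    return Normalize(d);
}
```

The payoff is that the light direction is transformed once per object rather than transforming every normal into eye space.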
1.2. DOT3 Lighting Revisited
If per-vertex lighting under OpenGL ES 1.1
causes performance issues with complex geometry or if it’s producing
unattractive results with coarse geometry, I recommend that you consider
DOT3 lighting. This technique leverages the texturing hardware to
perform a crude version of per-pixel lighting.
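If it helps to see what the texturing hardware computes, the following sketch emulates the GL_DOT3_RGB combine function on the CPU. The Color type and the function names are hypothetical; the arithmetic (bias each component by 0.5, take the dot product, scale by 4, clamp) follows the combiner's definition in OpenGL ES 1.1.

```cpp
#include <algorithm>
#include <cassert>

struct Color { float r, g, b; };  // components in [0, 1]

// Packs a unit vector into a color by mapping [-1, 1] to [0, 1], which is
// how normals are typically stored in a normal map.
Color EncodeVector(float x, float y, float z)
{
    return { x * 0.5f + 0.5f, y * 0.5f + 0.5f, z * 0.5f + 0.5f };
}

// CPU emulation of the GL_DOT3_RGB combiner: undo the 0.5 bias, take the
// dot product, scale by 4, and clamp to [0, 1].
float Dot3Combine(Color a, Color b)
{
    float d = 4.0f * ((a.r - 0.5f) * (b.r - 0.5f) +
                      (a.g - 0.5f) * (b.g - 0.5f) +
                      (a.b - 0.5f) * (b.b - 0.5f));
    return std::min(1.0f, std::max(0.0f, d));
}
```

Passing an encoded normal and an encoded light direction yields the diffuse factor for that texel: a normal pointing straight at the light gives 1, and one pointing away gives 0.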
1.3. Baked Lighting
The best way to speed up your lighting
calculations is to not perform them at all! Scenes with light sources
that don’t move can be prelit, or “baked in.” This can be accomplished
by performing some offline processing to create a grayscale texture
(also called a light map). This technique is
especially useful when used in conjunction with multitexturing.
As an added benefit, baked lighting can be
used to create a much higher-quality effect than standard OpenGL
lighting. For example, by using a raytracing tool, you can account for
the ambient light in the scene, producing beautiful soft shadows (see
Figure 1). One popular offline tool for this is
xNormal, developed by Santiago
Orgaz.
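When a grayscale light map is combined with a base texture through multitexturing, the usual choice is to multiply the two together. The sketch below shows that per-texel arithmetic on the CPU, assuming 8-bit channels and a modulate-style combine; the Rgb type and function names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Multiplies one 8-bit channel by an 8-bit light value, treating 255 as 1.0.
// Adding 127 before dividing rounds to the nearest integer.
std::uint8_t ModulateChannel(std::uint8_t base, std::uint8_t light)
{
    return static_cast<std::uint8_t>((base * light + 127) / 255);
}

// Applies a single grayscale light map texel to a base color texel.
Rgb ApplyLightMap(Rgb base, std::uint8_t light)
{
    return { ModulateChannel(base.r, light),
             ModulateChannel(base.g, light),
             ModulateChannel(base.b, light) };
}
```

A light value of 255 leaves the base color untouched, and 0 turns it black, which is why shadowed regions of the light map are simply painted darker.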
2. Texturing Optimizations
If your frame rate soars when you try disabling
texturing, take a look at the following list:
- Don’t use textures that are any larger than
  necessary. This is especially true when porting a desktop game; the
  iPhone’s small screen usually means that you can use smaller
  textures.
- Older devices have 24MB of texture memory;
  don’t exceed this. Newer devices have unified memory, so it’s less of
  a concern.
- Use a compressed or low-precision format if
  possible.
- Use texture atlases to reduce the number of
  bind calls.
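To get a feel for how much the format matters, here is a rough sketch that estimates the memory footprint of a single texture level. The TextureFormat enum and function names are hypothetical, and the estimate ignores mipmaps and any minimum-size rules the compressed formats impose.

```cpp
#include <cassert>
#include <cstddef>

// A few representative formats; the enum itself is invented for this sketch.
enum class TextureFormat { Rgba8888, Rgb565, Pvrtc4, Pvrtc2 };

std::size_t BitsPerPixel(TextureFormat format)
{
    switch (format) {
        case TextureFormat::Rgba8888: return 32;
        case TextureFormat::Rgb565:   return 16;
        case TextureFormat::Pvrtc4:   return 4;
        case TextureFormat::Pvrtc2:   return 2;
    }
    assert(false && "unknown texture format");
    return 0;
}

// Approximate size in bytes of one mip level (no mipmap chain included).
std::size_t TextureBytes(std::size_t width, std::size_t height, TextureFormat format)
{
    return width * height * BitsPerPixel(format) / 8;
}
```

By this estimate, a 1024x1024 texture takes 4MB as RGBA8888, 2MB as RGB565, and 512KB as 4-bit PVRTC, so six uncompressed 1024x1024 textures would already fill a 24MB budget.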
Another tip: it won’t help with frame rate, but
your load time might be improved by converting your image files into a
“raw” format like PVRTC or even into a C-array header file.
3. Culling and Clipping
Avoid telling OpenGL to render things that
aren’t visible anyway. Sounds easy, right? In practice, this guideline can
be more difficult to follow than you might think.
3.1. Polygon Winding
Consider something simple: an OpenGL scene
with a spinning, opaque sphere. All the triangles on the “front” of the
sphere (the ones that face the camera) are visible, but the ones in the
back are not. OpenGL doesn’t need to process the vertices on the back of
the sphere. In the case of OpenGL ES 2.0, we’d like to skip running a
vertex shader on back-facing triangles; with OpenGL ES 1.1, we’d like to
skip transform and lighting operations on those triangles.
Unfortunately, the graphics processor doesn’t know that those triangles
are occluded until after it performs the rasterization step in the
graphics pipeline.
So, we’d like to tell OpenGL to skip the
back-facing vertices. Ideally we could do this without any CPU overhead,
so changing the VBO at each frame is out of the question.
How can we know ahead of time that a triangle
is back-facing? Consider a single layer of triangles in the sphere; see
Figure 2. Note that the triangles have
consistent “winding”; triangles that wind clockwise are back-facing,
while triangles that wind counterclockwise are front-facing.
OpenGL can quickly determine whether a given
triangle goes clockwise or counterclockwise. Behind the scenes, the GPU
can take the cross product of two edges in screen space; if the
resulting sign is positive, the triangle is front-facing; otherwise,
it’s back-facing.
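The same test is easy to sketch on the CPU. The code below assumes screen-space positions with the Y axis pointing up and treats counterclockwise as the front-facing direction by default; Vec2 and the function names are made up for illustration, and a real GPU does this in hardware.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Z component of the cross product of edges (b - a) and (c - a).
// This is twice the signed area of the triangle: positive when a, b, c
// wind counterclockwise (with Y pointing up), negative when clockwise.
float SignedArea2(Vec2 a, Vec2 b, Vec2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// Degenerate (zero-area) triangles are treated as clockwise in this sketch.
bool IsFrontFacing(Vec2 a, Vec2 b, Vec2 c, bool frontIsCounterClockwise = true)
{
    bool counterClockwise = SignedArea2(a, b, c) > 0.0f;
    return counterClockwise == frontIsCounterClockwise;
}
```

Swapping any two vertices flips the sign of the cross product, which is why a consistently wound mesh can be classified without knowing anything else about it.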
Face culling is enabled like so:
glEnable(GL_CULL_FACE);
You can also configure OpenGL to define which
winding direction is the front:
glFrontFace(GL_CW); // front faces go clockwise
glFrontFace(GL_CCW); // front faces go counterclockwise (default)
Depending on how your object is tessellated,
you may need to play with this setting.
Use culling with caution; you won’t always
want it to be enabled! It’s mostly useful for opaque, enclosed objects.
For example, a ribbon shape would disappear if you tried to view it from
the back.
As an aside, face culling is useful for much
more than just performance optimizations. For example, developers have
come up with tricks that use face culling in conjunction with the
stencil buffer to perform CSG operations (constructive solid geometry).
This allows you to render shapes that are defined from the intersections
of other shapes.
3.2. User Clip Planes
User clip planes
provide another way of culling away unnecessary portions of a 3D scene,
and they’re often useful outside the context of performance
optimization. Here’s how you enable a clip plane with OpenGL ES
1.1:
void EnableClipPlane(vec3 normal, float offset)
{
    glEnable(GL_CLIP_PLANE0);
    GLfloat planeCoefficients[] = {normal.x, normal.y, normal.z, offset};
    glClipPlanef(GL_CLIP_PLANE0, planeCoefficients);
}
Alas, with OpenGL ES 2.0 this feature doesn’t
exist. Let’s hope for an extension!
The coefficients passed into
glClipPlanef define the implicit plane equation:
Ax + By + Cz + D = 0
One way of thinking about this equation is interpreting A, B, and C as the components of
the plane’s normal vector, and D as the plane’s offset from the origin
(when the normal has unit length, the magnitude of D is the distance from
the origin to the plane). The
direction of the normal determines which half of the scene to cull
away: points for which Ax + By + Cz + D is negative are clipped.
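To make the half-space idea concrete, here is a small sketch that evaluates the implicit form Ax + By + Cz + D for a point and reports whether a clip plane would discard it. The Vec3 and Plane types and the function names are hypothetical; the sketch assumes the point and the plane are expressed in the same coordinate space.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Coefficients in the same order that glClipPlanef expects them.
struct Plane { float a, b, c, d; };

// Evaluates Ax + By + Cz + D. With a unit-length normal, the result is the
// signed distance from the point to the plane.
float Evaluate(const Plane& plane, Vec3 p)
{
    return plane.a * p.x + plane.b * p.y + plane.c * p.z + plane.d;
}

// Points on the negative side of the plane are clipped; points on the plane
// itself, or on the side the normal points toward, are kept.
bool IsClipped(const Plane& plane, Vec3 p)
{
    return Evaluate(plane, p) < 0.0f;
}
```

For example, the plane {0, 1, 0, -2} keeps everything at or above y = 2 and discards the rest; negating all four coefficients keeps the other half instead.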
Older Apple devices support only one clip
plane, but newer devices support six simultaneous planes. The number of
supported planes can be determined like so:
GLint maxPlanes;
glGetIntegerv(GL_MAX_CLIP_PLANES, &maxPlanes);
To use multiple clip planes, simply add a
zero-based index to the GL_CLIP_PLANE0 constant:
void EnableClipPlane(int clipPlaneIndex, vec3 normal, float offset)
{
    glEnable(GL_CLIP_PLANE0 + clipPlaneIndex);
    GLfloat planeCoefficients[] = {normal.x, normal.y, normal.z, offset};
    glClipPlanef(GL_CLIP_PLANE0 + clipPlaneIndex, planeCoefficients);
}
This is consistent with working with multiple
light sources, which requires you to add a light index to the
GL_LIGHT0 constant.
3.3. CPU-Based Clipping
One way to cull geometry on the CPU is to use a
bounding volume hierarchy (BVH), a tree structure
for facilitating fast intersection testing. Typically the root node of a
BVH corresponds to the entire scene, while leaf nodes correspond to
single triangles, or small batches of triangles.
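Here is a minimal sketch of how such a tree might be traversed. It makes several assumptions that are not part of the description above: every node is bounded by a sphere, visibility is tested against a set of planes whose unit-length normals point toward the visible region, and each leaf stores the index of a draw batch. All type and function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Plane { float a, b, c, d; };          // unit-length normal assumed
struct Sphere { Vec3 center; float radius; };

struct BvhNode {
    Sphere bounds;  // encloses everything beneath this node
    int left;       // index of left child, or -1 if none
    int right;      // index of right child, or -1 if none
    int batch;      // leaf only: index of a triangle batch, else -1
};

float SignedDistance(const Plane& plane, Vec3 p)
{
    return plane.a * p.x + plane.b * p.y + plane.c * p.z + plane.d;
}

// True if the sphere lies entirely on the negative side of any plane.
bool IsOutside(const Sphere& sphere, const std::vector<Plane>& planes)
{
    for (const Plane& plane : planes) {
        if (SignedDistance(plane, sphere.center) < -sphere.radius)
            return true;
    }
    return false;
}

// Walks the tree from the given node, appending the batch index of every
// leaf whose bounds are not rejected. A rejected node prunes its subtree.
void CollectVisible(const std::vector<BvhNode>& nodes, int index,
                    const std::vector<Plane>& planes, std::vector<int>& visible)
{
    if (index < 0)
        return;
    assert(static_cast<std::size_t>(index) < nodes.size());
    const BvhNode& node = nodes[index];
    if (IsOutside(node.bounds, planes))
        return;
    if (node.batch >= 0) {
        visible.push_back(node.batch);
        return;
    }
    CollectVisible(nodes, node.left, planes, visible);
    CollectVisible(nodes, node.right, planes, visible);
}
```

The win comes from pruning: when a node near the root falls outside the planes, none of its descendants are tested, so large invisible portions of the scene cost only a handful of dot products.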