A Short Guide to 3D Graphics Performance testing

From Wikiid
Revision as of 03:05, 21 October 2010 by SteveBaker (Talk | contribs) (Created page with ' == Understand the system == In most 3D applications - whether under OpenGL, OpenGLES, WebGL or Direct3D, there are four principle places where speed bottlenecks can happen: # T…')

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Understand the system

In most 3D applications - whether under OpenGL, OpenGLES, WebGL or Direct3D, there are four principle places where speed bottlenecks can happen:

  1. The CPU - you have to calculate what meshes you're going to draw and set them up for rendering. This tends to be more or less a fixed cost per mesh.
  2. The transmission link between CPU and GPU which is a cost that depends on the number of vertices you send multiplied by the number of per-vertex attributes - plus the cost of updating textures and shaders that you might change during that frame.
  3. The GPU's vertex processor. This is the per-vertex processing cost of transforming/lighting your vertex data - without shaders, the cost roughly depends on the number of vertices times the number of lights you have turned on - with shaders, the cost roughly depends on the number of vertices times the complexity of your shader.
  4. The GPU's pixel/fragment processor. This is the per-pixel cost for pixels that pass clipping. The cost roughly depends on the number of pixels you draw onto the screen times the number of textures you use and/or the complexity of your fragment shader.

Find the bottleneck

If your application is running slower than you'd hoped - then you need to establish which of these four things is the problem.

GPU Pixel processing

Pixel processing time is easy to understand - reduce the size of the window you're rendering to (keeping everything else the same). If your program goes faster in rough proportion to the area of the window (height x width) - then pixel processing is the bottleneck.

CPU processing

If you have eliminated that then since CPU time generally doesn't depend on the number of vertices you draw, then you can (just as a test) keep rendering to a tiny window (to more or less eliminate pixel processing costs) and deliberately halve the number of triangles in each mesh. If your application's performance increases by roughly a factor of two then you were obviously not limited by the CPU's per-mesh costs - if your performance hardly changes - then probably you're drawing too many objects or doing too much per-mesh work in the CPU and you need to improve your code somehow.

GPU Vertex transmission/processing

Figuring out whether the transmission costs (2) or the GPU's vertex processing (3) costs are your problem is tricky. But since both depend mostly on number of vertices, you probably don't need to.

Eliminate the bottleneck

If CPU time is the culprit

You either need to optimize your code so that other CPU time-sinks are reduced (eg make physics, collision, AI, etc faster) - or you need to improve your field-of-view culling so you draw fewer meshes that are off-screen - or you need to reduce the number of meshes in your art (eg by combining multiple parts of an object into a single object using tricks like texture atlassing).

In my experience, the last of these is the first thing most people should be looking at.

If GPU vertex transmission/processing is the culprit

Then your meshes are too complex or you have too many (fixed function) light sources or your vertex shader is too complex. Use level-of-detail to reduce the number of vertices in meshes that are further from the eyepoint. Consider doing occlusion queries to reduce the number of meshes you draw. Optimize light source culling.

If GPU pixel processing is the culprit

Simplify your pixel shaders. Consider doing a depth-only pass before your 'beauty' pass. Can you reduce the resolution of your textures? Can you draw to a smaller window? Can you make better use of approximate front-to-back rendering order?