OpenGL Skeletal Animations
Skeletal Animation Data Storage
Videogame character animations are typically done using skeletal animation. This means taking a mesh and assigning portions of it to rigid bone objects, similar to the way skin and muscle tissue attach to your real-life bones. As the bones are transformed, the corresponding parts of the mesh are transformed as well. These transforms are applied to every single vertex in the mesh, potentially every frame, so the way they are stored is critical to reasonable performance. Character meshes frequently have upwards of 100,000 vertices - modern graphics processors have no problem handling this math, so long as the data is stored efficiently.

While in a piece of software like Blender these bones have a physical shape, they are merely representations of transformations. If a bone rotates, all of the vertices attached to that bone rotate the same way, about the same origin. Bones can also translate and scale. Just like other transformations in 3D graphics, these can be represented as a 4x4 matrix (mat4 in GLSL), also referred to as an affine transformation matrix. Because our bones represent transformations, we don't have to do anything with the physical bone object itself. Instead, we can calculate the transformation of each bone and apply it to our vertices in the vertex shader. These transforms must be updated every time we want to advance the animation state of our skeletal mesh - typically every frame, unless the animation is paused or in an otherwise 'static' state. We need to be careful not to upload too much data this way, or it could result in an unnecessary performance cost. On the other hand, our vertex data is only buffered once, so we can comfortably expand the vertex format to include extra attributes - for instance, bone weight values.
(Diagram: the vertex shader receives each bone transform, B1 through Bn, as a uniform input.)
Although the degree depends on the driver implementation, binding a uniform to the current OpenGL state or updating a uniform buffer is much faster than re-uploading all of the transformed vertices. This also has the benefit of letting us use the GPU to apply our bone transformations to the vertices - the necessary matrix multiplication is essentially a collection of floating point multiplications and additions. CPUs can do the math, but GPUs are designed and optimized for running many of these operations simultaneously.
Now that our 'bones' are in the GPU and accessible by the vertex shader, we need to actually know which vertices to apply the bone transformations to, and more importantly, decide how we want to store this data. Vertex shaders operate per vertex, as the name suggests, so we flip our idea around. Instead of trying to find out which vertices to apply the bone transforms to, we can find out which bones affect each vertex. That way, we can leverage the parallel processing power of the GPU!
In my example mesh, the complete skeleton has 50 bones, and 3,691 unique vertices. Using a simple vertex layout, each vertex is represented by 8 floats:
- Position - 3 floats (vec3)
- Normal - 3 floats (vec3)
- Texture coordinates - 2 floats (vec2)
layout(location = 0) in vec3 position;
layout(location = 1) in vec3 normal;
layout(location = 2) in vec2 texCoord;
In GLSL, a float has a fixed size of 4 bytes, so we have 32 bytes per vertex. In total, that's about 118 kB. Not bad, although this is a pretty simple mesh. As a Wavefront *.obj, the mesh file is about 500 kB - it contains quite a few other pieces of data. Now, we just add the weight each bone applies to the vertex, as a float, to our vertex data. Each bone needs a float, so we need 50 floats per vertex for this skeleton... increasing our vertex data size to 232 bytes! For the entire mesh, we are now up to 856 kB. That's too much data for such a simple mesh. For reference, Kratos, the main character of God of War (2018), was made of about 80,000 polygons. Depending on how the model was laid out, that's roughly 1 to 3 unique vertices per polygon - vertices can be shared, after all! Taking the middle estimate, 160,000 vertices, this scheme would take up about 37 MB of our VRAM.

Fortunately, this can be pared down quite a bit. If we limit the number of bones that can influence a vertex, we end up saving a lot of space. After all, even if a skeleton has 50 bones, not all of them are going to affect the same area of the mesh - for any given vertex, maybe only one or two apply any kind of transform. With this method, we also need to record which bones the weights belong to: one index per influence. In Unreal Engine 4, the default limit on bone influences is 8. Following this pattern, every vertex contains 8 additional floats for weights, and 8 additional integers for indices. This lands us at a much more reasonable 64 additional bytes of vertex data - totaling 96 bytes per vertex. We can save further by packing our index values - we don't have 2^32 - 1 (roughly 4 billion) bones, so most of that index range would go unused. Limiting the bone index to an unsigned byte still gives us the range [0, 255], and shrinks our vertex data to 72 bytes. Internally, our vertex data looks something like this:
(Diagram: vertex layout - position, normal, texture coordinates, 8 bone weights, 8 bone indices.)
In GLSL, the corresponding input declarations are:
layout(location = 0) in vec3 position;
layout(location = 1) in vec3 normal;
layout(location = 2) in vec2 texCoord;
layout(location = 3) in vec4 boneWeight[2]; // 8 weights, packed as two vec4s (locations 3-4)
layout(location = 5) in ivec4 boneIndex[2]; // 8 indices, packed as two ivec4s (locations 5-6)
uniform mat4 boneTransforms[50];
Note that an array attribute consumes one location per element, which is why the weights are packed into two vec4s spanning locations 3-4 and the indices into two ivec4s spanning locations 5-6 - a float[8] array would have occupied eight locations on its own. Also, even though the buffer stores each index as a single byte, GLSL still sees a 4-byte integer: integer attributes must be configured with glVertexAttribIPointer, which exposes the unsigned byte data to the shader as ints.
Our VAO should be set up like this in C++:
// 16 floats (position, normal, texCoord, 8 weights) + 8 unsigned bytes (indices)
const GLsizei stride = sizeof(float) * 16 + sizeof(unsigned char) * 8; // 72 bytes
glEnableVertexAttribArray(0); // position
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, stride, (void*)0);
glEnableVertexAttribArray(1); // normal
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, stride, (void*)(sizeof(float) * 3));
glEnableVertexAttribArray(2); // texCoord
glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, stride, (void*)(sizeof(float) * 6));
// Bone weights: two vec4 attributes (locations 3 and 4) - a single attribute
// can have at most 4 components
glEnableVertexAttribArray(3);
glVertexAttribPointer(3, 4, GL_FLOAT, GL_FALSE, stride, (void*)(sizeof(float) * 8));
glEnableVertexAttribArray(4);
glVertexAttribPointer(4, 4, GL_FLOAT, GL_FALSE, stride, (void*)(sizeof(float) * 12));
// Bone indices: two ivec4 attributes (locations 5 and 6); glVertexAttribIPointer
// keeps the values as integers instead of converting them to floats
glEnableVertexAttribArray(5);
glVertexAttribIPointer(5, 4, GL_UNSIGNED_BYTE, stride, (void*)(sizeof(float) * 16));
glEnableVertexAttribArray(6);
glVertexAttribIPointer(6, 4, GL_UNSIGNED_BYTE, stride, (void*)(sizeof(float) * 16 + sizeof(unsigned char) * 4));
The bone transformation values can be stored in either a Uniform Buffer Object, or simply bound for each separate skeletal mesh.
#version 400
layout(location = 0) in vec3 position;
layout(location = 1) in vec3 normal;
layout(location = 2) in vec2 texCoord;
layout(location = 3) in vec4 boneWeight[2]; // 8 weights, locations 3-4
layout(location = 5) in ivec4 boneIndex[2]; // 8 indices, locations 5-6
uniform mat4 model;
uniform mat4 viewProjection;
uniform mat4 boneTransforms[50];
void main()
{
    vec3 wPosition = (model * vec4(position, 1.0)).xyz;
    vec3 transformedPosition = vec3(0.0); // accumulator must start at zero
    for (int i = 0; i < 8; i++)
    {
        float weight = boneWeight[i / 4][i % 4];
        if (weight > 0.01)
        {
            int index = boneIndex[i / 4][i % 4];
            transformedPosition += (boneTransforms[index] * vec4(wPosition, 1.0)).xyz * weight;
        }
    }
    gl_Position = viewProjection * vec4(transformedPosition, 1.0);
}
It's worth noting that the final vertex position is a weighted average of the bone-transformed positions, so the bone weights must sum to 1.0, or the result will be incorrect. Your 3D animation software should ensure this automatically, but otherwise, you can normalize the weights when you perform the initial load. In this example, bone weights under 0.01 are discarded.