Runtimes 2 - Data oriented design for interpolating keyframes
The first decision is how to store keyframes (poses). There is an intentional difference between the point of view from the toolset, and the point of view from in-game code.
Defining by structure (animation -> bone -> property -> timeline) makes the most conceptual sense, so Basis works on this level. This runs counter to the way a CPU is built to process data. Basis also supports a keyframe-based workflow (animation -> keyframes -> bones) for simpler situations or for traditional frame-based animations. This is closer to what we want in the SDK.
The SDK/runtimes should be designed around data processing rather than object structure. There are many complex systems in a game, and they are all fighting over a limited allocation of milliseconds per frame. Factors to consider:
- Parallelizability
- SIMD
- Cache utilization
- Memory usage
- Branching
On the object level, let's store a collection of animations, each with a collection of poses/keyframes. Within these keyframes will be an implicitly defined bone structure, based on data. Here is the important detail: skeletons will not usually be very big. Some skeletons may have more than 100 bones, but this will not be the typical case. Many will have only a handful of bones.
How can we best represent this bone structure so that interpolating between 2 of them is efficient?
Parallelization, if used, will therefore happen at the object level. Multithreading an inner loop that may process only 10 or 20 bones, repeated for a hundred objects at 60 fps, is not ideal: the per-task overhead would outweigh the work being split.
Switching between SIMD and non-SIMD code can be a significant source of latency, and our arrays of data will be very small, so designing the interpolation stage around SIMD processing is not a great idea.
Cache utilization will be optimized by grouping together data that is processed together. Because SIMD is not an optimal choice, pure SoA (structure of arrays) is not necessary, and instead a pose can be a structure of arrays of bones' property groups. Keyframes will be used in 2 pipeline stages: interpolation, and skeleton transformation prior to skinning. So, data will be interpolated by association: properties that are processed together are stored together. A bone holds a lot of information which is separable (collision, skin, pre-computed curve indices for each component, parent, transform, color, etc).
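As a rough illustration of this layout, here is a minimal C++ sketch. All type and field names (BoneTransform, BoneScale, BoneColor, Pose, Animation) are hypothetical and chosen only for this example; they are not the SDK's actual types, and the choice of fields per group is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical property groups: each holds values that are interpolated together.
struct BoneTransform { float length; float angle; };    // typically 8 bytes
struct BoneScale     { float x; float y; };             // typically 8 bytes
struct BoneColor     { std::uint8_t r, g, b, a; };      // kept out of the scale group

// One keyframe: a structure of arrays, one array per property group.
// Index i refers to the same bone in every array, so the bone structure
// is implied by position rather than stored as objects.
struct Pose {
    std::vector<BoneTransform> transforms;
    std::vector<BoneScale>     scales;
    std::vector<BoneColor>     colors;

    explicit Pose(std::size_t count)
        : transforms(count), scales(count), colors(count) {}

    std::size_t boneCount() const { return transforms.size(); }
};

// An animation is a collection of poses that all share one bone count.
struct Animation {
    std::size_t bones;
    std::vector<Pose> keyframes;

    explicit Animation(std::size_t count) : bones(count) {
        assert(count > 0);
    }

    // Appends a zero-initialized keyframe sized for this animation's skeleton.
    Pose& addKeyframe() {
        keyframes.emplace_back(bones);
        return keyframes.back();
    }
};
```

Each interpolation pass then only touches the one array it needs, for example transforms, without dragging colors or scales through the cache.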
Interpolating transform data, for example, could loop through arrays of length/angle pairs for pose 1, pose 2, and the output pose, as well as an array of blend values, because each bone will typically use different curves. Because each array is read sequentially, L1 cache lines should serve this well, though the first access to each of these 4 memory blocks will likely miss the cache and stall.
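The transform pass described above could look something like the following sketch. The names BoneTransform and interpolateTransforms are assumptions made for this example, and the angle is blended linearly without any wrap-around handling, which a real runtime would probably need.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical transform group: the two values interpolated together.
struct BoneTransform { float length; float angle; };

// Blends pose `a` toward pose `b`, one bone at a time.
// blend[i] is the already-evaluated curve value for bone i
// (0 gives pose a, 1 gives pose b). Four contiguous blocks are walked
// in lockstep: a, b, blend, and out.
void interpolateTransforms(const std::vector<BoneTransform>& a,
                           const std::vector<BoneTransform>& b,
                           const std::vector<float>& blend,
                           std::vector<BoneTransform>& out)
{
    assert(a.size() == b.size() && a.size() == blend.size());
    out.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float t = blend[i];
        // Plain linear interpolation; angle wrap-around is ignored here.
        out[i].length = a[i].length + (b[i].length - a[i].length) * t;
        out[i].angle  = a[i].angle  + (b[i].angle  - a[i].angle)  * t;
    }
}
```

The loop body has no branches and no per-bone indirection, so the only expected stalls are the initial misses on each of the four blocks.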
Interpolating curves for the blend arrays will happen in a group. Scale-x/scale-y group together. Color is interpolated as its own data: although it fits with the scaling group because it affects skinning, adding it would create an awkwardly sized structure. Position interpolation (for roots) will happen in a group.
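One possible reading of the blend pass is sketched below: for a given property group, each bone's pre-computed curve is evaluated at the current normalized time and written into that group's blend array. The Curve enum, its four easing formulas, and the function names are all assumptions for illustration, and the sketch uses one curve per bone per group rather than one per component.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical curve set; the actual curves used by the toolset are not specified here.
enum class Curve : std::uint8_t { Linear, EaseIn, EaseOut, SmoothStep };

// Maps normalized time t in [0, 1] to a blend value in [0, 1].
float evaluateCurve(Curve curve, float t)
{
    switch (curve) {
        case Curve::EaseIn:     return t * t;
        case Curve::EaseOut:    return t * (2.0f - t);
        case Curve::SmoothStep: return t * t * (3.0f - 2.0f * t);
        case Curve::Linear:     break;
    }
    return t;
}

// Fills the blend array for one property group (for example the scale group).
// curves[i] is the curve assigned to bone i for this group.
void fillBlends(const std::vector<Curve>& curves, float t, std::vector<float>& blends)
{
    assert(t >= 0.0f && t <= 1.0f);
    blends.resize(curves.size());
    for (std::size_t i = 0; i < curves.size(); ++i) {
        blends[i] = evaluateCurve(curves[i], t);
    }
}
```

The resulting blend array is what an interpolation loop for that group would consume, so curve evaluation and value interpolation stay as two simple passes over small arrays.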
Other non-interpolating properties such as parent, skin id, and so on, can be reviewed at a later stage of design. For now we can group them together.
Memory usage is going to be a bit higher by storing all bones in every keyframe. If a walking animation has a nice breathing motion which only requires 2 keys with curves in the chest, but does something more complex with its legs and arms since it's walking, memory could be better utilized with animations being built from bone timelines. However, because the typical case will have many bones keying together, the extra memory required will be negligible.
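To get a feel for the trade-off, here is a back-of-the-envelope sketch comparing the two layouts. Everything in it is an assumption made for the example: the function names, the idea that each dense keyframe carries one float timestamp, and the idea that each timeline key carries its own float timestamp. Real formats will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense layout: every keyframe stores every bone, plus one timestamp per keyframe.
std::size_t denseBytes(std::size_t bones, std::size_t keyframes, std::size_t bytesPerBone)
{
    assert(bytesPerBone > 0);
    return keyframes * (bones * bytesPerBone + sizeof(float));
}

// Timeline layout: each bone stores only its own keys, each with its own timestamp.
std::size_t timelineBytes(const std::vector<std::size_t>& keysPerBone, std::size_t bytesPerBone)
{
    assert(bytesPerBone > 0);
    std::size_t total = 0;
    for (std::size_t keys : keysPerBone) {
        total += keys * (bytesPerBone + sizeof(float));
    }
    return total;
}
```

Under these assumptions, how the comparison comes out depends almost entirely on how many bones key together: per-key bookkeeping eats into the timeline layout's savings when most bones share keys, while a skeleton where only one or two bones are busy clearly favors timelines.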
Branching should not be much of a consideration here. Virtual functions and inheritance are not an issue, and due to the separation of data we can just process each block confidently. The only real question is how to handle object roots, because they can be attached to and detached from other bones. However, we may get away with branched processing of roots because their data is already isolated, and branch prediction will succeed nearly all of the time because roots will typically not be attached to anything.
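A sketch of what branched root processing might look like follows. The Vec2 and Root types, the -1 sentinel for "detached", and treating attachment as a simple positional offset are all simplifying assumptions for this example; a real attachment would likely also inherit rotation and scale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { float x; float y; };

// Hypothetical root record. attachedBone is an index into another object's
// world-space bone positions, or -1 when the root is detached (the common case).
struct Root {
    Vec2 position;
    int  attachedBone;
};

// Resolves each root to a world position. Root data lives in its own array,
// so the single branch below is the only one in the pass.
void resolveRoots(const std::vector<Root>& roots,
                  const std::vector<Vec2>& boneWorld,
                  std::vector<Vec2>& out)
{
    out.resize(roots.size());
    for (std::size_t i = 0; i < roots.size(); ++i) {
        Vec2 p = roots[i].position;
        if (roots[i].attachedBone >= 0) {  // rare path: root follows a bone
            const std::size_t bone = static_cast<std::size_t>(roots[i].attachedBone);
            assert(bone < boneWorld.size());
            p.x += boneWorld[bone].x;
            p.y += boneWorld[bone].y;
        }
        out[i] = p;
    }
}
```

Because the attached case is expected to be rare, the branch should be highly predictable, and the cost of the occasional misprediction is confined to this one small pass.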
By looking at each factor, the ideal data structure has emerged on its own. Keyframes will contain arrays of bone property groups (transform, scaling, and so on), packed and aligned on reasonable memory boundaries.