2024-01-06

Implementing Better Collisions In My 2D Platformer

4500 words, 20 minute read

What Is A Collision?

Game worlds are (more often than not) representations of objects in some physical space. Between frames of rendering, we simulate the evolution of the state of the world over some period of time and render a visual representation of that state to the screen.

Entities in the world can interact in many ways, but the most common interaction vector is likely a collision. A collision between two entities occurs when some pair of shapes defining the volumes of two objects in space intersect. This raises two distinct challenges. How can we detect that two entities are in collision? And how should we “resolve” that collision?

The challenge of collision detection is more technical in nature. For complex arbitrary shapes in 3D space, reliably detecting collisions can be expensive. Furthermore, a naive implementation would need to check collisions between every possible pair of colliding bodies, which quickly becomes untenable when large numbers of colliders exist in the game world. In the end there is always a compromise that must be made in order to detect collisions “well enough” without incurring large performance costs. Of course, the question of what “well enough” is depends entirely on the kind of game being made. This is something that should be defined early in the projects development, much in the same way that graphical budgets are imposed early to guide the development of visual assets.

Rocket League - Psyonix

In Rocket League the physics simulation runs at 120Hz for improved precision in collision detection. Running at a lower speed would result in unpredictable resolution of the impact between players and the ball, rendering the game unplayable. As a competitive online game, this places certain requirements on the network layer to be able to pass all that data between the server and client effectively.

Luckily for me, I’m creating a simple 2D platformer, and processors have come a long way since the NES. My game will not feature a large number of dynamic colliders, and for the few bodies that will require collision checks, I am more than happy to pay an increased computational cost for reliable and precise collision detection.

The second question, of collision resolution, is as much a design challenge as a technical one. In reality, there are well established laws of classical mechanics that govern how two objects should react to a collision. Some games take the approach of implementing a full physics simulation, where colliding bodies have associated physics materials that influence their collision response. In our case thats clearly overkill so we will take the alternative route, we can use physics and linear algebra to help us resolve collisions in a way that feels natural and satisfying to the player, while also servicing the core mechanics of the game at hand.

Axis-Aligned Bounding Boxes

In order to make life easier for ourselves, we will treat all colliders in our 2D space as Axis-Aligned Bounding Boxes (AABB). This introduces the unfortunate constraint that we cannot have rotating bodies, but affords a lot in the way of optimisation and algorithmic simplicity. Here I will be restricting the discussion to collisions in 2D space, but one upside of using AABBs is that all of the maths and algorithms here should generalise easily into 3D space. Also, while AABBs are unsuitable for primary collision detection in many games, many optimisation and culling techniques rely on a lot of the same underlying maths that I will present here.

Axis-Aligned Bounding Box

For any entity in our 2D game world we can draw an axis-aligned bounding box (AABB) around it.

/* "maths/geometry.hpp" */

struct AABB
{
    f32 x;
    f32 y;
    f32 w;
    f32 h;
};

/* "physics/collisions.hpp" */

bool Physics2D::OverlapAABB(AABB a, AABB b)
{
    bool colliding = true;

    // Horizontal overlap check.
    colliding &= (a.x < b.x + b.w && a.x + a.w > b.x) || (a.x + a.w > b.x && a.x < b.x + b.w);
    
    // Vertical overlap check.
    colliding &= (a.y < b.y + b.h && a.y + a.h > b.y) || (a.y + a.h > b.y && a.y < b.y + b.h);

    return colliding;
}

In my game, bounding boxes are defined by a pair of Vec2s defining the location of the bottom left corner and dimensions of the box. Given this definition it is easy to check if two bounding boxes overlap. This is fast and simple, but does not yield any information about the overlap, only whether or not they are overlapping. If we want to be able to resolve collisions we need some more information. One could decide to extend this method to yield the rectangle defining the intersection of the two bounding boxes. A common approach to resolving these collisions is to move the dynamic body along the shortest path that reduces the size of this overlap to zero. Again, this is relatively fast and can work well for many games, but there are a few big problems with this method of collision detection and response.

Example of resolving an overlap between two AABBs

One way of resolving an AABB overlap is to take the shortest side of the overlap rectangle, and move one or both entities along that direction to close the overlap.

Aliasing in Discrete Collision Detection

The root problem is a result of the discrete nature of the collision detection. Consider for a second the example of a high-frequency audio recording, sampled at a low bitrate. The highest frequency present in a recording can be thought of as defining a characteristic time between “features” of the data. When we sample the audio at a frequency below twice this largest frequency, aliasing will occur, this is the loss of information as smaller features of the data are not captured by a coarse grain measurement. In general to avoid aliasing we must sample at twice the highest frequency present in the waveform. This is known as the Nyquist theorem, and its application go far beyond just audio recording and transmission.

Example showing how undersampling a signal leads to loss of information

By sampling a wave at too low a frequency we can lose information, here the peaks of the original wave are not correctly captured in the second case.

In the context of our physics simulation we can think of the framerate of our game as the sampling frequency of some signal. What is the information present in this signal? Well it’s the positions and velocities of our bodies, and crucially, information about which bodies are colliding at any given time. Given our games framerate is likely limited in some capacity, it is then clear that aliasing will occur if the objects in our world are moving fast enough. In our case aliasing would result in us incorrectly deducing the direction in which we can resolve a collision, or even worse, missing a collision entirely!

Example showing how undersampling our objects positions leads to loss of information

If we imagine two objects moving along a line, over time their position on the line traces out a curve. The objects collide when those curves cross. Here we can see that undersampling the curves results in at least one pair of missed collisions.

These issues can be alleviated somewhat by reducing the maximum speed at which objects in your world can move, imposing minimum requirements on the dimensions of colliders, or by increasing the sampling rate of the physics simulation. In general, the vast majority of games will rely on discrete collision detection for the majority of their collisions, and as such will need to utilise some or all of these tactics. For important colliders such as player capsules and vehicles, many games will instead pay a performance price and turn to continuous collision detection techniques to achieve far superior detection.

Example of how aliasing leads to incorrect or missed collisions

Here we can see that a fast moving object can tunnel directly through another if the sampling rate of our physics simulation is too low. Similarly while we might still observe a collision in many cases, due to how far through the target collider the object has moved we might incorrectly deduce how to resolve the collision.

Continuous Detection to the Rescue!

The key issue with discrete collisions is that we are checking the states of the objects at each point in time, and if the objects are not directly in collision in any two consecutive moments, we can only conclude that a collision did not occur. There is an alternative approach which can fix some of these issues, which is to somehow simulate moving the object between the two discrete points and checking if a collision might have occured at any point along that path. This is not perfect by any means, and is computationally far more demanding than discrete collision detection, but can massively reduce the chances of missed collisions and tunneling phenomena.

There are two main approaches to continuous collision detection: Speculative & Sweeping algorithms. Speculative approaches work by computing an AABB that encapsulates both the initial and final positions of the object. This bounding box is then used to compute a superset containing all contacts that could occur between the object and other bodies along its path. These contacts can then be used as constraints which can tell the physics engine how to prevent any occurrences of tunneling. On the other hand, Sweeping algorithms cast the shape of the collider along a ray between the initial and final positions and calculate the points along that ray at which the shape comes into or out of collision. This is the approach I have decided to implement for the player collisions in my game. For further information about these two approaches and some of the advantages and drawbacks associated with them, this entry in the Unity manual is a great primer.

Swept AABB Bounding Boxes: Raycasting

The principle of swept AABB collision detection is simple. We use the velocity of an object to determine where it will be on the next frame and draw a Ray from its initial to final position. We then iterate over each other collider in the scene and cast the bounding box of our object along that Ray, searching for points at which the two shapes intersect. The earliest time of intersection represents the point of contact between the two objects. We can also use some simple algebra to extract the contact normal, which will help our collision resolution. The first step in implementing this system is to define a Ray.

Diagram showing the definition of a Ray

A Ray is simply a straight line that extends infinitely far in each direction.

A Ray is simply defined by a point origin $\vec O$ and a direction $\vec D$. It is an infinitely long line extending in both directions. Since the Ray is a one dimensional object, we can parametrise any point on the Ray by a single float, which we call $t$.

$$ \mathcal{\vec R}(t) = \vec O + t\vec D$$

We colloquially call $t$ the time parameter. Algorithms that attempt to find the first intersection of a Ray with another object are thus known as “time of intersection” algorithms.

/* "maths/geometry.hpp" */

struct Ray
{
    Vec2 Origin;
    Vec2 Direction;

    Ray(Vec2 origin, Vec2 direction)
        : Origin(origin), Direction(direction) 
    {
        Direction.NormaliseInPlace();
    }

    static Ray FromTo(Vec2 from, Vec2 to)
    {
        return Ray(from, to - from);
    }
};

Given a Ray and an AABB, we now want to be able to deduce if and where an intersection occurs. This process is known as raycasting. The algorithm I will use goes like this: We first decompose the Ray into its two components. We calculate the time at which the $x$ value of the Ray passes the $x$ values of the left and rightmost edges of the bounding box, and label these $t_{near, x}$ and $t_{far, x}$ respectively. We also do the same for the $y$ components, checking against the top and bottom edges. We should take care to reorder the near and far values here if necessary, such that $t_{near} < t_{far}$.

Diagram showing a pair of Raycasts against an AABB

We can determine the near and far intersection distances of each component of a Ray with the minima and maxima of an AABB in those dimensions. From these values it is possible to deduce whether or not a collision has occurred.

For example, if we position the left and right edges of the rectangle at $x_{left}$ and $x_{right}$, we find $$ t_{near, x} = Min\left(\frac{x_{left} - O_{x}}{D_{x}}, \frac{x_{right} - O_{x}}{D_{x}}\right) $$

In my implementation I bundle the near and far times for each component into a pair of Vec2s for brevity, and make sure to check for any infinities arising from a rogue divide-by-zero.

/* "physics/collision.hpp" */

bool Physics2D::Raycast(Ray ray, AABB target)
{
    Vec2 targetPos = {target.x, target.y};

    Vec2 tNear = (targetPos - ray.Origin) / ray.Direction;
    Vec2 tFar = (targetPos + Vec2{target.w, target.h} - ray.Origin) / ray.Direction;

    if (std::isnan(tNear.x) || std::isnan(tNear.y)) { return false; }
    if (std::isnan(tFar.x) || std::isnan(tFar.y)) { return false; }

    if (tNear.x > tFar.x) { std::swap(tNear.x, tFar.x); }
    if (tNear.y > tFar.y) { std::swap(tNear.y, tFar.y); }

    /* ... */
}

Okay, so now it’s time to interpret these values. Get out a piece of paper and draw a series of parallel rays across a bounding box. It is clear that for a Ray to intersect our bounding box, two conditions must be met.

$$\begin{aligned} t_{near, x} &< t_{far, y} \\ t_{near, y} &< t_{far, x} \end{aligned}$$

You can convince yourself that these hold for any Ray direction and origin by drawing out similar diagrams. Note that it is also vital that we have ordered the near and far times appropriately at this point. If an intersection has occured, we can get the two times of intersection by taking the maxima and minima of the near and far times respectively. There is an edge case to be mindful of here, if $t_{far} < 0$ this implies both intersection points lie in the Ray’s past, and as such we can conclude no collision occured. This kind of situation might arise in the frames following an elastic collision, where the colliding objects move away in opposite directions. In code the following checks can be performed easily.

/* "physics/collision.hpp" */

bool Physics2D::Raycast(Ray ray, AABB target)
{
    /* ... */

    if (tNear.x > tFar.y || tNear.y > tFar.x) { return false; }

    f32 tHitNear = std::max(tNear.x, tNear.y);
    f32 tHitFar = std::min(tFar.x, tFar.y);

    if (tHitFar < 0.0f) { return false; }

    return true;
}

Great! We can now cast a Ray against a bounding box and determine if an intersection occurs. However if we are to resolve collisions, we will likely need more information than just whether or nor a collision occured. Luckily for us, we have everything we need to determine the hit location, distance, and contact normal. The first two are simple, obtained by passing hit times back into the Ray equation to yield the location of the Ray at which the intersections occur. Obtaining the contact normal takes a little more work, but is simplified greatly by the fact that we are dealing with AABBs. All we have to do to figure out the contact normal is to determine which of the four edges of the bounding box contains the near intersection point. We can then store the hit info and return via an out parameter. Finally we also pass a maximum distance in to allow us to ignore ray intersections past a certain distance. The final raycasting procedure looks like this:

/* "maths/geometry.hpp" */

struct RaycastHit
{
    Vec2 Point;
    Vec2 Normal;
    f32 Distance;
};

/* "physics/collision.hpp" */

bool Physics2D::Raycast(Ray ray, AABB target, RaycastHit& outHit, f32 maxDistance)
{
    // Grab the near and far intersection times for the rays components.

    Vec2 targetPos = {target.x, target.y};

    Vec2 tNear = (targetPos - ray.Origin) / ray.Direction;
    Vec2 tFar = (targetPos + Vec2{target.w, target.h} - ray.Origin) / ray.Direction;

    if (std::isnan(tNear.x) || std::isnan(tNear.y)) { return false; }
    if (std::isnan(tFar.x) || std::isnan(tFar.y)) { return false; }

    if (tNear.x > tFar.x) { std::swap(tNear.x, tFar.x); }
    if (tNear.y > tFar.y) { std::swap(tNear.y, tFar.y); }

    // Now interpret results.

    if (tNear.x > tFar.y || tNear.y > tFar.x) { return false; }

    f32 tHitNear = std::max(tNear.x, tNear.y);
    f32 tHitFar = std::min(tFar.x, tFar.y);

    if (tHitFar < 0.0f) { return false; }

    // Fill out hit info.

    outHit.Point = ray.Origin + ray.Direction * tHitNear;
    outHit.Distance = tHitNear;

    if (tNear.x > tNear.y)
    {
        if (ray.Direction.x < 0.0f)
        {
            outHit.Normal = {1.0f, 0.0f};
        }
        else 
        {
            outHit.Normal = {-1.0f, 0.0f};
        }
    }
    else if (tNear.x < tNear.y)
    {
        if (ray.Direction.y < 0.0f)
        {
            outHit.Normal = {0.0f, 1.0f};
        }
        else 
        {
            outHit.Normal = {0.0f, -1.0f};
        }
    }

    return (tHitNear < maxDistance);
}

Swept AABB Bounding Boxes: Boxcasting

Now that we have built out the mathematical foundations of raycasting, we can turn our attention to the challenge of casting an AABB along a Ray and checking for collisions. It might seem like a difficult challenge at first, and like me you might initially think of doing things like a Raycast from each corner of the bounding box, but there is a very cheap solution we can find if we exploit the fact that we have restricted ourselves to using AABBs. Imagine placing two bounding boxes in contact and holding one still while we slide the second around the first, maintaining contact between the two as we do so. If we follow the center of the moving shape, we find that it traces out an expanded bounding box. It turns out that if we want to detect collisions between two AABBs it suffices to calculate the expanded AABB and peform a raycast from the center of our dynamic body against that expanded shape. I was really blown away at the elegance and simplicity of this idea when I first saw it, and to make things even better it’s dead simple to implement in code.

/* "maths/geometry.hpp" */

bool Physics2D::AABBcast(AABB source, AABB target, Vec2 direction, RaycastHit& outHit, f32 maxDistance)
{
    Ray ray = Ray({source.x + source.w/2.0f, source.y + source.h/2.0f}, direction);
    target.x -= source.w/2.0f;
    target.y -= source.h/2.0f;
    target.w += source.w;
    target.h += source.h;
    return Raycast(ray, target, outHit, maxDistance);
}

Diagram showing a boxcast via a raycast against an expanded AABB

By calculating the expanded AABB of the target by adding half the size of our objects AABB in each direction, we can perform a raycast to determine if and when a collision will occur.

That’s it, we now have everything we need to detect and characterise collisions between moving bodies in our game world! I’ll now take some time to explore what we can do with all that useful collision information we collected, and then outline some potential optimisations we can make to stop this eating away at our precious framerate.

How I Choose To Resolve Collisions In My Game

There are many choices to be made when determining how to resolve a collision. In reality, we have laws of motion such as the conservation of momentum and energy which dictate the initial and final states of objects in collision. If we wanted to, we could assign a coefficient of restitution and mass to our colliders and do all the fun maths to get semi-realistic collisions working. In many games though, there is simply no need for this, a lot of 2D games which do simulate bouncing will assume perfectly elastic collisions and simply reflect the bodies velocity across the contact normal. In my case, as of right now I have no need to simulate bouncing, my attention is firmly pointed at the character controller. There are only a few key things I’d like my collision resolution to do.

The character should never catch or snag on geometry. Collisions should feel consistent, fair, and reliable.
The character should retain the component of their velocity pointing along the contact surface to give a sliding motion.
The sliding motion should have the correct velocity, it should “feel right”.

Okay, maybe that last one is a bit hand-wavey, but you get the point. The key here is that the player should collide with walls, floors, ceilings, and carry on sliding along the surface with some appropriate velocity. Luckily for us, we have the contact normal, and the power of linear algebra on our side.

What we want to do is alter the players velocity such that it moves up to surface, and slides along to a point in line with where it would have been if it were allowed to pass through. This is colloquially known as move-and-slide. The first thing we need is a vector that points along our surface. Since we are in 2D and our normals are axis-aligned this is easy, we just swap the components of the normal vector. (Note this algebra is not applicable for arbitrary rotated surfaces)

$$ \vec S = (n_y, n_x)$$

We then take this vector and calculate its dot product with the players initial velocity, giving us a final sliding speed in the direction of that vector. Finally, we multiply the speed by the surface vector, giving a final velocity to apply to the player. This only works because we have AABBs, and it is not exactly physically realistic, but it works really well for my purposes and gives a satisfying fluid feel to the character movement along surfaces.

$$ \vec v_f = \vec S \cdot (\vec S \cdot \vec v_i) $$

Looking Forward: Broad-Phasing & Quadtrees

For my purposes, the solution and implementation outlined here works really well, but other games have other requirements, and the naive version outlined here would certainly start to cause performance issues if the number of colliding entities grew too large. In general there are two ways to optimise code, the first is to go into the functions being called a bunch of times in our tight loops and start tinkering, finding ways to reduce the numbers of expensive mathematical operations we are doing for each collision. We could even rewrite some areas of our maths library to make use of SIMD instruction sets. The second thing we can do, which should always be the first thing to pay attention to, is to try and figure out if we are doing any unnecessary work that can be cut entirely.

Consider a situation where we have a game world that exists in length scales of meters, and the players movement speed is on the order of 10$ms^{-1}$, running at a solid 30 fps, we can be fairly confident that the distances covered by the player in a given frame are on the order of 1$m$ or less. Therefore we can be confident that the player is not going to be in collision with a large number of objects in the game world in any given frame. One way to confidently discard these potential collisions without performing our expensive raycasting algorithm is to do a cheaper “broad-phase” collision check first. In our case we can simply draw a bounding box around the players current position and next position, then check for any overlaps between that expanded AABB and the objects in our scene. We then know that we only need to perform the expensive collison checks for objects that were captured by the broad-phase check.

This broad phase collision checking is a good start, but for very large numbers of entities even this can end up being a fairly expensive procedure. We still need to check the expanded bounding box of every dynamic object against every collider in the scene! This has an algorithmic complexity that scales as $O(N^2)$, not good! How can we confidently discard some objects from even this broad phase check such that we can reduce the alogrithmic complexity of our simulations and improve scalability for large numbers of entities? One possible approach is to build and maintain special datastructures called spatial accelaration structures which can allow us to rapidly query distances between objects in the game world, and confidently reason about which pairs of objects will not possibly come into contact in a given frame. One such example of this is the quadtree, a simple tree like structure which recursively evenly subdivides space in our game world into smaller and smaller chunks.

To construct a quadtree for our game world, we start by defining a root node of the tree, which has an associated bounding box, in this case it encloses our entire game world. We then split that bounding box into 4 equally sized children, and attach them as child nodes of the root. We then do the same, recursively, until we reach some desired depth in the tree. In order to fill our quadtree with useful data, we then iterate over all of the objects in our game world and inspect their AABB. We start at the root node and walk down the tree. At each layer we check for overlaps between our entities AABB and the AABB of the children nodes. There are two possibilities: If the entity AABB lies entirely within the AABB of any child, we proceed down that edge and carry on until the other outcome occurs, where the entity AABB overlaps more than one of the child bounding boxes. In this case we can attach this entities bounding box to the current node of the tree. Once we have filled out our quadtree with the bounding boxes of all entities in our game world, we can now make use of the relational information encoded in its structure to speed up our collision detection.

For any given entity, since it is entirely enclosed within the bounding box $A$ of its associated node within the quadtree, we can confidently say that the object will not come into contact with objects in adjacent branches of the tree. This is because they lie entirely within a different bounding box $B$ which does not overlap $A$ at all. It is then clear to see that for a given entity, we only need to include entities which exist on the same node or a descendant of that node in the broad-phase collision check.

Building and maintaining this quadtree is not free, you need to store more data, which incurs a memory cost, and you need to constantly update the tree structure between frames, which can be very costly. This is especially true where we have moving objects whose expanded bounding boxes must constantly be recalculated between frames. Tree structures are great for when you need to encode relationships between data and query those relationships quickly, but they are undoubtedly expensive to traverse, requiring pointer lookups at every step. The advantage they offer now however, is that we have a way to reduce the algorithmic complexity of our broad-phase check to $O(log(N))$. For many simple games the expense of maintaining a quadtree is simply not worth it and might even incur a performance penalty, but for any game that needs to simulate collisions between a large number of entities they are incredibly powerful.

Conclusion

We’ve discussed what collisions are, and why they are important in games. We’ve learnt about the both collision detection and resolution, and have discussed some of the considerations that must be made during the development of any game. We learned about some of the pitfalls associated with discrete collision detection, and have seen how the more expensive approach of continuous detection can resolve some of those issues. We briefly touched on one approach to collision resolution, and took some time to explore potential optimisation techniques, namely the use of broad-phasing and spatial accelaration structures. I plan to post more articles like this one as I move further along with development of my game. Source code is publicly available on GitHub for anyone to view, if you have any questions or feedback feel free to drop me an email at samhaskell@proton.me.