Intro

One ongoing area of side projects for me has been game development. It's an area I've long found interesting, and while the only games I've completed so far have been small Ludum Dare entries finished under deadline pressure, I usually have a few larger side projects kicking around.

In general for these side projects, I tend to write a lot of the lower level code myself, mostly because it's something I enjoy learning about. Part of this means that I can't just rely on a framework or engine rendering my game correctly, and instead need some way to test it. Unit tests clearly aren't enough here: while they can verify that I'm calling the functions I expect, they can't verify that this results in something cohesive being rendered to the screen.

Inspired by tools like Percy from my day job in web development, I decided to take the approach of comparing captured screenshots against reference screenshots.

For some reason, I was under the impression that Factorio had used a similar approach and detailed it in one of their FFF blogs, but while researching this blog post, I was unable to track it down. They certainly perform integration testing, but it appears screenshots are not involved.

Getting the screenshots

The first step in this challenge was to obtain the pixel data of the currently rendered game. The game in question uses OpenGL, and luckily, OpenGL provides a function to read back the current framebuffer: glReadPixels.

For this example, I'm using Rust, and the libraries I use to wrap OpenGL don't provide a safe wrapper around the low-level API, so I need to duck into unsafe code.

use std::os::raw::c_void;

use image::DynamicImage;

#[derive(Debug)]
enum XrayError {
    CaptureError,
}

type XrayResult<T> = Result<T, XrayError>;

fn capture_image(x: i32, y: i32, width: u32, height: u32) -> XrayResult<DynamicImage> {
    // Allocate an rgba8 image to act as the destination buffer.
    let mut img = DynamicImage::new_rgba8(width, height);
    unsafe {
        // We just created the image as rgba8, so this is always Some.
        let pixels = img.as_mut_rgba8().unwrap();
        let pixel_buffer_ptr = pixels.as_mut_ptr() as *mut c_void;
        // OpenGL expects signed dimensions.
        let height = height as i32;
        let width = width as i32;

        // Request tightly packed rows, with no padding between them.
        gl::PixelStorei(gl::PACK_ALIGNMENT, 1);
        gl::ReadPixels(
            x, y, width, height,
            gl::RGBA, gl::UNSIGNED_BYTE,
            pixel_buffer_ptr);

        // Map any OpenGL error to our Rust error type.
        let error_code = gl::GetError();
        if error_code != gl::NO_ERROR {
            return Err(XrayError::CaptureError);
        }
    }

    Ok(img)
}

Obtaining an image buffer

So, let's break this function down. The first step is to create a buffer to hold the pixel data read back from OpenGL.

let mut img = DynamicImage::new_rgba8(width, height);
let pixels = img.as_mut_rgba8().unwrap();
let pixel_buffer_ptr = pixels.as_mut_ptr() as *mut c_void;

We start by creating a DynamicImage of the appropriate size. This enum is the main image type in the image crate. Specifically, we're going with rgba format pixels, simply because it makes some of the maths easier later.

The next step is to get the image buffer for our newly created image. Since our pixels are in rgba format, we use the as_mut_rgba8 call. This returns an Option: if you were handling arbitrary images, the pixels could be in a different format, and you would need a more expensive conversion to get them into this one. However, as the image was created on the previous line, we can simply unwrap it.
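
If you were handling an arbitrary DynamicImage, that more expensive path might look something like the following sketch (ensure_rgba8 is a hypothetical helper, assuming a reasonably recent version of the image crate):

use image::{DynamicImage, RgbaImage};

fn ensure_rgba8(img: DynamicImage) -> RgbaImage {
    match img {
        // Already in rgba8 format: take the buffer without converting.
        DynamicImage::ImageRgba8(buf) => buf,
        // Any other pixel format: perform the more expensive conversion.
        other => other.to_rgba8(),
    }
}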

If OpenGL were a Rust library, then we could stop there. However, since it's a low level API and its bindings are defined in C, we need to obtain a void * for the buffer to pass to OpenGL. We can use the as_mut_ptr() method to get a pointer to the buffer, which we then cast to the appropriate type for OpenGL. (Similarly, our image API works in terms of unsigned integers, but OpenGL's API is defined in terms of signed integers, so we need to convert our unsigned dimensions.)
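
As an aside, if you wanted those conversions to fail loudly rather than silently wrap on overly large dimensions, a stricter sketch using TryFrom might look like this (to_gl_size is a hypothetical helper, not part of the original code):

use std::convert::TryFrom;

// Returns our capture error instead of wrapping if the dimension
// doesn't fit in a signed 32-bit integer.
fn to_gl_size(value: u32) -> XrayResult<i32> {
    i32::try_from(value).map_err(|_| XrayError::CaptureError)
}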

Reading pixel data from OpenGL

The next two calls are where we interact with OpenGL itself.

gl::PixelStorei(gl::PACK_ALIGNMENT, 1);
gl::ReadPixels(x, y, width, height, 
               gl::RGBA, gl::UNSIGNED_BYTE, 
               pixel_buffer_ptr);

The first call, to glPixelStorei, configures the PACK_ALIGNMENT, which describes how pixel rows are laid out in memory. In short, a value of 1 means there is no padding between rows in the image, so the raw image data is stored contiguously without any special treatment. Some image formats and optimisation techniques rely on rows being aligned to specific byte boundaries, but the image crate expects an unpadded buffer, so we tell OpenGL to produce its data in that layout.
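
To make the padding concrete, here's a small illustrative calculation (row_stride is a hypothetical helper for demonstration, not an OpenGL call):

/// The stride of a pixel row under a given pack alignment: OpenGL
/// rounds each row up to the next multiple of the alignment.
fn row_stride(width: u32, bytes_per_pixel: u32, pack_alignment: u32) -> u32 {
    let unpadded = width * bytes_per_pixel;
    (unpadded + pack_alignment - 1) / pack_alignment * pack_alignment
}

fn main() {
    // A 3-pixel-wide RGB row is 9 bytes; OpenGL's default alignment
    // of 4 pads it to 12, while an alignment of 1 leaves it at 9.
    assert_eq!(row_stride(3, 3, 4), 12);
    assert_eq!(row_stride(3, 3, 1), 9);
}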

The next call is the actual call to glReadPixels. The first four parameters are simply an offset and the dimensions of the area to capture, relative to the bottom left of the framebuffer. The fifth and sixth parameters describe the format the pixels are written in: RGBA means each pixel has all four components, and UNSIGNED_BYTE means each component is an unsigned number in the range 0-255. Effectively, this gives us a buffer of 32-bit RGBA image data.

The final parameter is the pointer to the buffer where the image data ultimately gets written.
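
Putting this together, a usage sketch might look like the following (window_width and window_height stand in for however your windowing library reports the framebuffer size, and the caller is assumed to return XrayResult):

// Capture the entire framebuffer, starting from the bottom left.
let screenshot = capture_image(0, 0, window_width, window_height)?;

// For instance, write it out for later comparison; save infers the
// image format from the file extension.
screenshot.save("screenshot.png").expect("failed to write screenshot");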

let error_code = gl::GetError();
if error_code != gl::NO_ERROR {
    return Err(XrayError::CaptureError);
}

Error reporting in OpenGL is done through the glGetError() function, which returns either the constant GL_NO_ERROR or an error code if an error occurred. We map this to a more Rust-like error handling scheme by returning an Err if the underlying API reported an error code, or an Ok(DynamicImage) if the call succeeded.
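
One caveat worth knowing about: an OpenGL implementation can record several error flags at once, and each glGetError call returns (and clears) only one of them. If you wanted to be thorough, a variant that drains the whole queue might look like this sketch (drain_gl_errors is hypothetical, not part of the code above):

unsafe fn drain_gl_errors() -> XrayResult<()> {
    let mut saw_error = false;
    // Each GetError call returns and clears a single flag, so loop
    // until the queue is empty to stop stale errors leaking into
    // later checks.
    while gl::GetError() != gl::NO_ERROR {
        saw_error = true;
    }
    if saw_error {
        Err(XrayError::CaptureError)
    } else {
        Ok(())
    }
}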