Part 3: Compression formats
Inexperienced developers may be tempted to save textures in a compressed format like jpg to save filesize, believing this increases performance. I'll say it again:
the filesize of your texture has no bearing on how much VRAM it takes up.
The jpg format is also lossy, meaning the way it saves filesize is to throw away data from the image. It does this by discarding fine detail, particularly colour detail, within small blocks of pixels. You can never get this data back after saving; it's gone forever. By saving in jpg, you permanently degrade your image. Never use it for textures.
Other developers may be tempted by the png format as it offers superior quality whilst maintaining a relatively small filesize. It also supports alpha channels (transparency), which jpg does not. Png is a lossless format, meaning data is never thrown away; instead, clever algorithms encode and decode the image data at either end. This requires extra processing but results in a smaller filesize. Png is a popular format for artists to supply textures in because it is lossless, but it's not ideal for game engines because of that extra processing.
In truth, there is only one format you should be saving your textures in: DirectDraw Surface, otherwise known as dds. It is the native format for DirectX, meaning any other image format has to be converted to dds in order to be displayed. Jpg goes in, dds comes out. Png goes in, dds comes out. You can actually see the effects of this conversion process in action in Game Guru. Here's a comparison between an asset using png textures vs. using dds. Despite dds being a lossy format like jpg, the details appear sharper when using dds because no conversion from png to dds has to happen at load time.
Comparison of PNG vs. DDS
The tool you use to save dds files matters. Nvidia do a free DDS plugin for Photoshop which is widely considered the best tool out there in terms of the quality of the textures it produces. If you don't own Photoshop, a decent free option is Paint.net, which is what I've used here. Regardless of the tool you use, you'll quickly realise dds comes in a huge variety of different flavours and compression types, so it's vital we consider which one is best for which type of texture.
To see how each compression format affects a texture, we'll use this stress-testing texture I made.
The stress-tester texture has graduated coloured bands and high detail areas designed to find compression artefacts
The dds format is lossy: it compresses your texture by chopping it up into 4x4 blocks, finding the two most contrasting colours in each block, and assuming the other fourteen pixels all have values somewhere on a straight line between those two outliers. To stress-test this, our test texture has a band of tightly-packed multicoloured pixels which are impossible to arrange on a straight line from one colour to another.
A pattern impossible to replicate with 4x4 block compression
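To make the idea concrete, here is a minimal Python sketch of this kind of 4x4 block compression. It is a simplification rather than how any real encoder works: the function names are made up for illustration, the endpoints are simply the two most different pixels in the block, and it assumes BC1's layout of two endpoints plus two in-between colours per block.

```python
from itertools import combinations

def dist2(a, b):
    """Squared distance between two RGB colours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode_block(pixels):
    """Approximate a 4x4 block (16 RGB tuples) with a 4-colour palette.

    Simplified: the endpoints are the two most different pixels in the block.
    Real encoders search for better endpoints and store them as 5:6:5 colours.
    """
    assert len(pixels) == 16
    c0, c1 = max(combinations(pixels, 2), key=lambda pair: dist2(*pair))
    # Two endpoints plus two colours spaced evenly along the line between them.
    palette = [
        c0,
        c1,
        tuple((2 * a + b) // 3 for a, b in zip(c0, c1)),
        tuple((a + 2 * b) // 3 for a, b in zip(c0, c1)),
    ]
    # Each pixel keeps only the index of its nearest palette colour.
    indices = [min(range(4), key=lambda i: dist2(p, palette[i])) for p in pixels]
    return palette, indices

def decode_block(palette, indices):
    """Rebuild the 16 pixels from the palette and the per-pixel indices."""
    return [palette[i] for i in indices]

red, green, blue, yellow = (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)

two_colours = [red] * 8 + [blue] * 8
print(decode_block(*encode_block(two_colours)) == two_colours)  # True

four_colours = [red, green, blue, yellow] * 4
print(decode_block(*encode_block(four_colours)) == four_colours)  # False
```

A block containing only two colours survives intact, but a block of four unrelated colours cannot, because red and green do not sit on the line between blue and yellow. That is exactly the weakness the multicoloured band in the stress-test texture is designed to expose.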
BC1
BC1 is the oldest format and offers the most compression and the least strain on the GPU. It stores its colours at a bit-depth of 16 rather than 24. Instead of using 8 bits per channel, BC1 uses 5:6:5 encoding: instead of 256 possible values in each channel, there are 32 for red (2^5), 64 for green (2^6), and 32 for blue (2^5). The lack of available colours causes a noticeable 'step' from one shade to the next, which is called “banding”.
Colour banding in DDS BC1
Because there are more available shades of green than of red and blue, you'll also find BC1 textures struggle to replicate greys well, often producing green or purple tinted pixels.
Green and purple artefacts are common in many DDS formats
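The effect of 5:6:5 is easy to reproduce. The short Python sketch below squashes an 8-bit colour down to 5:6:5 and expands it back again. The function name is made up for illustration, and it assumes simple truncation on the way down and bit replication on the way back up; real encoders and GPUs may round slightly differently.

```python
def to_565_and_back(rgb):
    """Quantise an 8-bit RGB colour to 5:6:5, then expand it back to 8 bits.

    Assumes truncation when quantising and bit replication when expanding.
    """
    r, g, b = rgb
    r5, g6, b5 = r >> 3, g >> 2, b >> 3  # keep the top 5, 6 and 5 bits
    return (
        (r5 << 3) | (r5 >> 2),  # 5 bits back to 8
        (g6 << 2) | (g6 >> 4),  # 6 bits back to 8
        (b5 << 3) | (b5 >> 2),  # 5 bits back to 8
    )

# A neutral mid grey comes back with slightly more green than red or blue.
print(to_565_and_back((100, 100, 100)))  # (99, 101, 99)

# Count how many distinct shades survive in the red and green channels.
reds = {to_565_and_back((v, 0, 0))[0] for v in range(256)}
greens = {to_565_and_back((0, v, 0))[1] for v in range(256)}
print(len(reds), len(greens))  # 32 64
```

The grey that went in with equal channels comes out with green a couple of steps away from red and blue, which is where the tinting comes from, and the shade counts match the 32 and 64 described above.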
It also totally falls apart when dealing with complex colour blocks.
Well...I guess that means our stress-test works
BC1 does support an alpha channel, but only a 1-bit one: a pixel is either fully opaque or fully transparent, with no way to have smooth transitions between the two. This can still be useful for instances where you need a clear separation between opaque and transparent, such as on foliage and wire fences.
1-bit alpha channels in DDS BC1
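As a rough illustration of what a 1-bit alpha channel does to a soft edge, here is a tiny Python sketch. The function name and the threshold of 128 are assumptions for the example; export tools may pick the cut-off differently or let you set it yourself.

```python
def to_1bit_alpha(alpha, threshold=128):
    """Collapse an 8-bit alpha value to fully transparent (0) or fully opaque (255)."""
    return 255 if alpha >= threshold else 0

# A soft edge fading from transparent to opaque becomes a hard cut.
soft_edge = [0, 40, 100, 160, 220, 255]
print([to_1bit_alpha(a) for a in soft_edge])  # [0, 0, 0, 255, 255, 255]
```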
BC1 stores each 4x4 block as two 16-bit colours plus a 2-bit index for each of the sixteen pixels. That is 64 bits per block, which averages out at just 4 bits per pixel, so BC1 offers the greatest saving in terms of VRAM at the cost of outright quality. If your texture doesn't have too much grey in it, doesn't need a complex alpha channel, and isn't too tightly-detailed, BC1 is a viable option.
512 x 512 x 4 = 1,048,576 bits
1,048,576 / 8 = 131,072 bytes or 128kB VRAM
BC4
BC4 is a grayscale format that offers the same benefits of filesize and VRAM usage as a BC1 texture but with significantly better quality. Being grayscale, it only has to store one channel of data, not three, allowing us to have the full complement of 256 shades of grey. BC4 is ideally suited to grayscale textures like _metalness, _AO, and _gloss, but it does not support alpha channels.
BC4 is excellent for grayscale textures
BC3
BC3 is essentially a combination of BC1 and BC4. It stores RGB colours in the same 5:6:5 encoding as BC1, but stores an alpha channel in full 8-bit grayscale like a BC4. BC3 is a popular choice for any texture that requires a full alpha channel in order to simulate semi-transparency, or graduations. Each 4x4 block holds a 64-bit colour block plus a 64-bit alpha block, which works out at 8 bits per pixel.
512 x 512 x 8 = 2,097,152 bits
2,097,152 / 8 = 262,144 bytes or 256kB VRAM
BC3: The best of both worlds?
BC5
BC5 is sometimes recommended for tangent-space normal maps. It uses two grayscale BC4-style channels to store X and Y information, with Z being reconstructed later in the pixel shader itself. The upside of this is increased fidelity in the normal map (compared to BC1 or BC3 where you don't have the full range of 256 colours). The downside is the extra memory required to load the texture, since it stores two 64-bit blocks for every 4x4 pixels (8 bits per pixel), and the additional processing required in the pixel shader.
512 x 512 x 8 = 2,097,152 bits
2,097,152 / 8 = 262,144 bytes or 256kB VRAM
'Signed' or 'unsigned' defines whether the stored values cover the range -1 to 1 or 0 to 1
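The Z reconstruction works because a normal is a unit vector, so x² + y² + z² = 1 and Z can be worked out from the other two. In a game this happens in the pixel shader; the Python sketch below only illustrates the maths. The function name is made up for the example, and it assumes unsigned storage where 0 to 255 maps onto -1 to 1.

```python
import math

def reconstruct_normal(x_byte, y_byte):
    """Rebuild a unit normal from two stored 8-bit channels.

    Assumes unsigned storage, where 0..255 maps to -1..1.
    """
    x = x_byte / 255.0 * 2.0 - 1.0
    y = y_byte / 255.0 * 2.0 - 1.0
    # A unit vector satisfies x^2 + y^2 + z^2 = 1.
    # The clamp stops compression error pushing the square root below zero.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)

# The usual "flat" normal map colour points almost straight out of the surface.
print(reconstruct_normal(128, 128))
```

Because Z is always taken as the positive root, this only works for tangent-space normal maps, where the normal never points into the surface.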
BC6 and BC7
These two newer formats are only supported by D3D11-level hardware (DirectX 11 and above), and only became an option to us Game Guru-ers since the DX11 update in 2018. These exotic and complicated formats use sophisticated techniques to pick the best compression method for each 4x4 block instead of just applying one to the entire image. The specifics are a little too complex to go into here (plus I'm not sure I fully understand them myself!) but I will link to a fantastic blog post at the end of this guide where you can do further reading if interested (it's where I learned most of this!).
BC7 compression is designed to take advantage of DirectX11 hardware
BC6 and BC7 do better with our stress test than BC1 and BC3, but it's important to remember we're still dealing with a lossy format so inevitably the results won't be perfect. In most real-world scenarios, BC7 will produce the best results, almost indistinguishable from a png original. BC6 is designed for storing floating point data as used in HDR images. Since Game Guru doesn't support HDR textures, you can probably ignore this format.
Stress-test comparison
The main downside to BC6 and BC7 is the increased encoding time required to save these textures in the first place. Paint.net offers different compression speeds, with slower being more accurate but taking longer. Both formats use 128 bits for every 4x4 block, or 8 bits per pixel, making them twice as costly as BC1 in terms of both filesize and VRAM usage.
512 x 512 x 8 = 2,097,152 bits
2,097,152 / 8 = 262,144 bytes or 256kB VRAM
Despite the increase in quality, surprisingly very few AAA games use more than a handful of BC7 textures. This may be partly down to their increased filesize and memory footprint, but it also may have something to do with compatibility on older consoles that don't support the format.
B8G8R8
You'll notice that none of the block compression methods so far give a 100% accurate rendition of the png stress-test image. Both the Nvidia Photoshop plugin and Paint.net also feature various uncompressed formats of dds, which are less commonly used because of their large filesize. If you really want 100% accuracy, your best bet is B8G8R8, which uses full 8-bit colour channels to save an uncompressed 24-bit colour image.
B8G8R8: I don't know why the channels are swapped around and not R8G8B8
512 x 512 x 24 = 6,291,456 bits
6,291,456 / 8 = 786,432 bytes or 768kB VRAM
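The sums in this part all follow the same pattern: width times height times bits per pixel, divided by 8 to get bytes. If you want to try other resolutions, a small Python helper like the one below does the arithmetic. The function name is made up for the example, and it only counts the base image, nothing else the engine may store alongside it.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Memory for one image: width x height x bits per pixel, divided by 8."""
    return width * height * bits_per_pixel // 8

size = texture_bytes(512, 512, 24)
print(size, size // 1024)  # 786432 768

# The same uncompressed format at 2048 x 2048, in kB.
print(texture_bytes(2048, 2048, 24) // 1024)  # 12288
```

Doubling the width and height quadruples the memory, which is why resolution and compression need to be considered together.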
Whilst it smashes the competition in terms of accuracy, the lack of compression triples the filesize compared to a BC3 or BC7 image. Over the course of an entire game, this will add up. I don't recommend using this method.
But seriously though, which format do I pick?
Now that you're aware of the different compression formats available, the question is which one do you use? This is really dependent on the texture itself but, in general, I would stick to the following rules:
PBR:
_color (no complex alpha needed, and not much greys/muted colours) = BC1
_color (complex alpha needed, or lots of greys/muted colours) = BC7
_normal = BC1
_gloss = BC4
_metalness = BC4
_ao = BC4
_emissive = BC1
_illumination = BC1
_surface = BC1 (Game Guru Max only)
DNS:
_D (no complex alpha needed, and not much greys/muted colours) = BC1
_D (complex alpha needed, or lots of greys/muted colours) = BC3
_N = BC1
_S = BC4
_I = BC1
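If you batch-convert textures with a script, the rules above fit neatly into a lookup table. This Python sketch is only one way to lay it out; the names pick_format and complex_alpha_or_greys are made up for the example, and the table contains nothing beyond the two lists above.

```python
# Colour maps fall back to BC1 unless they need complex alpha or have lots of greys.
COLOUR_MAPS = {"_color": "BC7", "_D": "BC3"}

# Every other texture type always uses the same format.
FIXED_FORMATS = {
    # PBR
    "_normal": "BC1", "_gloss": "BC4", "_metalness": "BC4", "_ao": "BC4",
    "_emissive": "BC1", "_illumination": "BC1", "_surface": "BC1",
    # DNS
    "_N": "BC1", "_S": "BC4", "_I": "BC1",
}

def pick_format(suffix, complex_alpha_or_greys=False):
    """Return the suggested dds format for a texture suffix such as '_color'."""
    if suffix in COLOUR_MAPS:
        return COLOUR_MAPS[suffix] if complex_alpha_or_greys else "BC1"
    return FIXED_FORMATS[suffix]

print(pick_format("_color"))                              # BC1
print(pick_format("_color", complex_alpha_or_greys=True)) # BC7
print(pick_format("_D", complex_alpha_or_greys=True))     # BC3
print(pick_format("_gloss"))                              # BC4
```

You still have to decide by eye whether a colour map counts as "complex", so treat the output as a starting point rather than a verdict.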
It's always a balancing exercise between texture quality and GPU memory requirements, and this has to be decided on a case-by-case basis. But this is also why you need to optimise both resolution and compression, as it buys you more VRAM to play with for your more important assets. Saving a few megabytes on background props might allow you to spend a few more megabytes on textures closer to the player.
Hopefully this in-depth look at texture optimisation will save you lots of performance and enable your games to run more smoothly on a wider variety of hardware.
Further reading (and sources):
Understanding BCn Texture Compression Formats
http://www.reedbeta.com/blog/understanding-bcn-texture-compression-formats/
Torque 3D texture compression
http://docs.garagegames.com/torque-3d/official/content/documentation/Artist%20Guide/Formats/TextureCompression.html
DDS Files and DXT Compression
https://w3dhub.com/forum/topic/417101-dds-files-and-dxt-compression/
Steam hardware survey – December 2020
https://store.steampowered.com/hwsurvey/Steam-Hardware-Software-Survey-Welcome-to-Steam