CMSC 435/634 Assignment 5

Before you start

This assignment is another extension to the "GLapp" framework. The changes we will make are orthogonal to either of the previous two assignments, so you can use your assn4 solution, your assn3 solution, or even the starter GLapp code as a basis. No matter which you use, be sure to do all of your work for this assignment in the "GLapp" directory of your repository so we can find it. If you want to start from an older version, be sure to use "git checkout commitID GLapp" to get the previous version without impacting your prior assignment history.

The Assignment

For this assignment, you will implement DXT1 texture compression for the color texture files as you load them. Do not use this compression for the normal or props files. This game developer blog post uses D3D naming conventions for the various texture compression formats, but still does a pretty good job of explaining them. Khronos (the organization that makes the OpenGL specification) has a page with some relatively explicit technical details.

You process each 4x4 block of the image independently. For the purposes of this assignment, you can assume our texture dimensions are always divisible by 4 (no partial blocks). For each block, you will find two representative colors, which you encode into 16 bits each using 5 bits for red, 6 green, and 5 blue. Then you'll have two bits for each texel in the block, representing color 0, color 1, 1/3 of the way between the two, and 2/3 of the way between the two. That makes a total of 64 bits or 8 bytes of data for each block.
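As a minimal sketch of the block walk and the 5:6:5 quantization described above, assuming a tightly packed 8-bit RGB image stored row by row: the names RGB8, gatherBlock, quantizeChannel, and toRGB565 are hypothetical helpers, not part of GLapp.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct RGB8 { uint8_t r, g, b; };

// Copy the 4x4 block at block coordinates (bx, by) out of a tightly packed
// RGB image (assumption: 3 bytes per pixel, no row padding).
std::array<RGB8, 16> gatherBlock(const std::vector<uint8_t>& pixels,
                                 int width, int bx, int by) {
    std::array<RGB8, 16> block;
    for (int y = 0; y < 4; ++y)
        for (int x = 0; x < 4; ++x) {
            std::size_t i = 3 * (std::size_t(by * 4 + y) * width + bx * 4 + x);
            block[4 * y + x] = RGB8{pixels[i], pixels[i + 1], pixels[i + 2]};
        }
    return block;
}

// Reduce one 8-bit channel to `bits` bits, rounding to nearest.
inline unsigned quantizeChannel(uint8_t v, int bits) {
    unsigned maxq = (1u << bits) - 1u;
    return (v * maxq + 127u) / 255u;
}

// 5 bits red (high), 6 bits green, 5 bits blue (low).
inline uint16_t toRGB565(RGB8 c) {
    return uint16_t((quantizeChannel(c.r, 5) << 11) |
                    (quantizeChannel(c.g, 6) << 5) |
                     quantizeChannel(c.b, 5));
}
```

Looping by from 0 to height/4 - 1 and bx from 0 to width/4 - 1 visits every block exactly once, since the dimensions are assumed to be multiples of 4.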

Note that the data format assumes little-endian byte ordering (the same as Intel and Apple processors), but the byte ordering diagrams on the Khronos page are big-endian. Following those diagrams literally complicates things, especially if you know you'll only be running on little-endian CPUs. For us, you can pack the colors with a much more straightforward combination of shifts and ORs (I'm calling the representative colors k0 and k1 here, for key colors, to try to avoid some confusion with all of the things named "r" or "c"):

uint64_t k0 = (k0red<<(5+6)) | (k0green << 5) | k0blue;
uint64_t k1 = (k1red<<(5+6)) | (k1green << 5) | k1blue;
uint64_t block = k0 | (k1 << 16) | (codes << 32); // codes: the 32 selection bits, held in a 64-bit type
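One way to build the codes value and store the finished block might look like the sketch below. It assumes the texel at (x, y) in the block occupies the two bits starting at bit 2*(4*y+x) of the 32 selection bits, which matches the layout in the Khronos description. The function names are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Assemble the 32 selection bits from sixteen 2-bit DXT codes given in
// row-major order: texel i = 4*y + x lands at bits 2*i and 2*i + 1.
inline uint64_t packCodes(const std::array<uint8_t, 16>& code) {
    uint64_t bits = 0;
    for (int i = 0; i < 16; ++i)
        bits |= uint64_t(code[i] & 3u) << (2 * i);
    return bits;
}

// Same layout as the three lines above, with explicit 64-bit widening.
inline uint64_t packBlock(uint16_t k0, uint16_t k1, uint64_t codes) {
    return uint64_t(k0) | (uint64_t(k1) << 16) | (codes << 32);
}

// Write the block least-significant byte first. On a little-endian CPU this
// is the same as copying the uint64_t directly, but it also works elsewhere.
inline void storeBlock(uint64_t block, uint8_t* out) {
    for (int i = 0; i < 8; ++i)
        out[i] = uint8_t(block >> (8 * i));
}
```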

The selection of representative colors makes a big difference in image quality. A good first choice is to place them along the principal eigenvector of the covariance matrix of the block's colors:

\[ M= \left(\begin{matrix}
\sum{(r_i-\bar{r})(r_i-\bar{r})} & \sum{(r_i-\bar{r})(g_i-\bar{g})} & \sum{(r_i-\bar{r})(b_i-\bar{b})} \\
\sum{(g_i-\bar{g})(r_i-\bar{r})} & \sum{(g_i-\bar{g})(g_i-\bar{g})} & \sum{(g_i-\bar{g})(b_i-\bar{b})} \\
\sum{(b_i-\bar{b})(r_i-\bar{r})} & \sum{(b_i-\bar{b})(g_i-\bar{g})} & \sum{(b_i-\bar{b})(b_i-\bar{b})}
\end{matrix}\right) \]

Here the sums run over the texels $i$ of the block, and $\bar{r}$, $\bar{g}$, $\bar{b}$ are the components of the block's average color. The eigenvector corresponding to the largest eigenvalue of this matrix is the direction of greatest variance (sometimes called the principal direction or principal axis). This is relatively easy to compute using the power iteration algorithm. There are many sources online that describe this algorithm, but the basic idea is to start with an initial guess for the vector, $\hat{a}_0$, then do several iterations of $\hat{a}_{i+1}=\mathrm{normalize}(M\hat{a}_i)$. This will quickly converge to the principal direction, $\hat{a}$, especially if you start with a good initial guess (e.g. $(1,0,0)$, $(0,1,0)$, or $(0,0,1)$ corresponding to the longest dimension of the bounding box of the colors in the block). You also only need a reasonably close approximation to $\hat{a}$, so you do not need to iterate to full convergence.
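The power iteration described above can be sketched roughly as follows. The type aliases, function names, and the fixed count of 8 iterations are assumptions made for illustration. The early exit for a zero-length result (a block where every texel has the same color) is one possible way to avoid dividing by zero.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<Vec3, 3>;
using Block = std::array<Vec3, 16>;  // 16 texel colors as (r, g, b)

// Average color of the block.
Vec3 blockMean(const Block& c) {
    Vec3 mean = {0.f, 0.f, 0.f};
    for (const Vec3& p : c)
        for (int k = 0; k < 3; ++k) mean[k] += p[k] / 16.f;
    return mean;
}

// The matrix M from the text: sums of products of deviations from the mean.
Mat3 blockCovariance(const Block& c, const Vec3& mean) {
    Mat3 M{};  // value-initialized to all zeros
    for (const Vec3& p : c)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                M[j][k] += (p[j] - mean[j]) * (p[k] - mean[k]);
    return M;
}

// Unit axis along the longest side of the colors' bounding box.
Vec3 initialGuess(const Block& c) {
    Vec3 lo = c[0], hi = c[0];
    for (const Vec3& p : c)
        for (int k = 0; k < 3; ++k) {
            lo[k] = std::min(lo[k], p[k]);
            hi[k] = std::max(hi[k], p[k]);
        }
    int best = 0;
    for (int k = 1; k < 3; ++k)
        if (hi[k] - lo[k] > hi[best] - lo[best]) best = k;
    Vec3 a = {0.f, 0.f, 0.f};
    a[best] = 1.f;
    return a;
}

// A few rounds of a <- normalize(M a).
Vec3 principalAxis(const Mat3& M, Vec3 a, int iterations = 8) {
    for (int it = 0; it < iterations; ++it) {
        Vec3 b;
        for (int j = 0; j < 3; ++j)
            b[j] = M[j][0] * a[0] + M[j][1] * a[1] + M[j][2] * a[2];
        float len = std::sqrt(b[0] * b[0] + b[1] * b[1] + b[2] * b[2]);
        if (len < 1e-12f) break;  // flat block: keep the current guess
        for (int j = 0; j < 3; ++j) a[j] = b[j] / len;
    }
    return a;
}
```

A typical call would be principalAxis(blockCovariance(c, mean), initialGuess(c)) with mean = blockMean(c).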

However you compute $\hat{a}$, you can project each texel in the block onto a line in this direction through the average color of the block,

\[ p_i=(\vec{c}_i-\bar{c})\cdot\hat{a} \]

Use the min and max projections on this axis as the representative colors.

\[ \vec{k}_0 = \bar{c} + p_\mathit{min}\hat{a} \]
\[ \vec{k}_1 = \bar{c} + p_\mathit{max}\hat{a} \]

Scaling the projection of each texel on this axis will give you a number between 0 and 3 to use for the selection bits, and adding some randomness between the colors can help to avoid some banding artifacts.

\[ s_i=\left\lfloor 3\,\frac{p_i-p_\mathit{min}}{p_\mathit{max}-p_\mathit{min}} + \mathit{randFloat} \right\rfloor \]

Pack the $\vec{k}_0$, $\vec{k}_1$, and the $s_i$ into a 64-bit integer. Be careful about the order of $\vec{k}_0$ and $\vec{k}_1$, since swapping them will change the palette. Also, be sure to remap the selection bits to the DXT order. In DXT, the sequential colors from $\vec{k}_0$ to $\vec{k}_1$ are given by bit codes 00, 10, 11, 01, whereas the order from the $s_i$ equation is 00, 01, 10, 11.
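The projection, selection, and remapping steps could be sketched as below. Several details are assumptions rather than requirements of the assignment: randFloat is taken to be a callable returning values in [0, 1), the result is clamped to 0..3 as a guard, a block with zero range gets selection 0 everywhere, and the endpoint ordering fix assumes the four-color mode requires the first 16-bit color to be numerically greater than the second. All names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>
#include <utility>

using Vec3 = std::array<float, 3>;
using Block = std::array<Vec3, 16>;

struct Endpoints {
    Vec3 k0, k1;                   // unquantized representative colors
    std::array<uint8_t, 16> code;  // 2-bit DXT codes, row-major
};

template <class Rand>
Endpoints chooseEndpoints(const Block& c, const Vec3& mean, const Vec3& axis,
                          Rand&& randFloat) {
    std::array<float, 16> p;
    for (int i = 0; i < 16; ++i) {
        p[i] = 0.f;
        for (int k = 0; k < 3; ++k) p[i] += (c[i][k] - mean[k]) * axis[k];
    }
    float pmin = *std::min_element(p.begin(), p.end());
    float pmax = *std::max_element(p.begin(), p.end());

    Endpoints e;
    for (int k = 0; k < 3; ++k) {
        e.k0[k] = mean[k] + pmin * axis[k];
        e.k1[k] = mean[k] + pmax * axis[k];
    }
    // s = 0,1,2,3 walks from k0 to k1; DXT stores that walk as 00,10,11,01.
    static const uint8_t dxtCode[4] = {0, 2, 3, 1};
    float range = pmax - pmin;
    for (int i = 0; i < 16; ++i) {
        float t = range > 0.f ? (p[i] - pmin) / range : 0.f;
        int s = int(std::floor(3.f * t + randFloat()));
        s = std::min(std::max(s, 0), 3);
        e.code[i] = dxtCode[s];
    }
    return e;
}

// After quantizing to 5:6:5 (q0 from k0, q1 from k1), make sure q0 > q1.
inline void orderForFourColorMode(uint16_t& q0, uint16_t& q1,
                                  std::array<uint8_t, 16>& code) {
    if (q0 < q1) {
        std::swap(q0, q1);
        for (uint8_t& s : code) s = uint8_t(s ^ 1u);  // 0<->1 and 2<->3
    } else if (q0 == q1) {
        code.fill(0);  // single color: code 0 always selects the first color
    }
}
```

Passing a lambda that always returns 0.5 turns the dithering into plain rounding, which can make the output easier to check by hand while debugging.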

Once all of the blocks have been constructed, instead of glTexImage2D, you will load the texture data using glCompressedTexImage2D, with texture format GL_COMPRESSED_RGB_S3TC_DXT1_EXT. The glGenerateMipmap helper function won't work on these textures, so just load level 0 and tell OpenGL this is the only level with glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, 0).
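A sketch of the upload, under a few assumptions: the blocks are stored in a std::vector<uint64_t> in row-major block order, the host CPU is little-endian so the vector's bytes can be handed to OpenGL directly, and your OpenGL header (whichever loader GLapp uses) has been included before this code. The preprocessor guard exists only so the size helper can be compiled on its own. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// DXT1 stores 8 bytes per 4x4 block.
inline std::size_t dxt1ByteCount(int width, int height) {
    return std::size_t((width + 3) / 4) * std::size_t((height + 3) / 4) * 8;
}

#ifdef GL_COMPRESSED_RGB_S3TC_DXT1_EXT
// Assumes tex came from glGenTextures and blocks.size() matches the image.
inline void uploadDXT1(GLuint tex, int width, int height,
                       const std::vector<uint64_t>& blocks) {
    glBindTexture(GL_TEXTURE_2D, tex);
    glCompressedTexImage2D(GL_TEXTURE_2D, 0,
                           GL_COMPRESSED_RGB_S3TC_DXT1_EXT,
                           width, height, 0,
                           GLsizei(blocks.size() * sizeof(uint64_t)),
                           blocks.data());
    // Only level 0 exists, so tell OpenGL not to look for more.
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, 0);
}
#endif
```

The imageSize argument has to equal the byte count that dxt1ByteCount returns for the same width and height, or OpenGL will reject the call with an error.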

634 only

Create a texture manager that handles all texture file loading and compression. It should track the file name of each image, and only load and compress it if it has not already been seen. For new textures, use glGenTextures to allocate a new ID to hold the texture, but for already loaded textures, return the ID of the previously loaded copy. You will want to move the texture ID generation and image loading out of the Object class. I recommend using a std::map from file name to texture ID.
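One possible shape for such a manager is sketched below. To keep the example independent of OpenGL, the actual work (glGenTextures, file loading, compression, upload) is assumed to live in a loader callback supplied by the caller. The class name, its interface, and the file names in the usage note are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

class TextureManager {
public:
    // The loader creates the texture for a file and returns its ID.
    using Loader = std::function<unsigned(const std::string&)>;

    explicit TextureManager(Loader loader) : loader_(std::move(loader)) {}

    // Return the ID for this file, loading it only on first request.
    unsigned get(const std::string& file) {
        auto it = ids_.find(file);
        if (it != ids_.end()) return it->second;  // already loaded
        unsigned id = loader_(file);
        ids_.emplace(file, id);
        return id;
    }

    std::size_t size() const { return ids_.size(); }

private:
    Loader loader_;
    std::map<std::string, unsigned> ids_;
};
```

With this arrangement, two objects that both ask for "brick-color.ppm" receive the same ID, and the loader runs once.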

Extra Credit

There are three extra credit options at 5 points each, for a total of 15 points of possible extra credit if you do all three. You are only eligible for extra credit if you submit on the original due date.

1. Generate your own mip map levels by averaging each 2x2 block of texels in one level to compute a texel in the next level, then compress each level independently. You can choose a GL_TEXTURE_MAX_LEVEL that stops once any dimension is 4 (so you still don't need to handle partial blocks).

2. Support 1-bit transparency with GL_COMPRESSED_RGBA_S3TC_DXT1_EXT (note the RGBA instead of RGB in the format name). Get finer gradations in transparency by randomly choosing opaque or transparent for each texel proportional to the alpha value, as was done for the color palette selection.

   Since PPM files are only RGB, you will need to support an alternate image file format with transparency. The uncompressed TGA format is pretty simple, and supports transparency. I have provided one set of files (fence-*) that includes a tga color file. Copy these into your data directory to use. You only need to support enough of the specification to read this color TGA file (i.e. only uncompressed RGBA).

   You will need to change the fragment shader to see the transparency. Change the ColorTexture lookup to keep the alpha channel, and include a test to discard the fragment based on that value (e.g. if (color.a < 0.5) discard;).

3. Normals and non-color data like our "props" maps often need higher quality than provided by the DXT1 compression, and their channels are not well enough correlated to use a single pair of representative colors and set of texel bits for all channels. Both the normal map and props maps can actually be stored in just two channels, using one of the RG compression methods. The third channel of the normal map can be reconstructed in the shader, and we are not even using the third component of the props map. Implement RG compression for these, including whatever shader changes are necessary.
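For the mip map option above, the 2x2 averaging step might be sketched like this, assuming a tightly packed 8-bit image with even dimensions. The function names are hypothetical, and maxMipLevel stops as soon as the next level would no longer have dimensions that are multiples of 4.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Average each 2x2 group of texels into one texel of the next mip level.
std::vector<uint8_t> downsample2x2(const std::vector<uint8_t>& src,
                                   int width, int height, int channels = 3) {
    int w2 = width / 2, h2 = height / 2;
    std::vector<uint8_t> dst(std::size_t(w2) * h2 * channels);
    for (int y = 0; y < h2; ++y)
        for (int x = 0; x < w2; ++x)
            for (int c = 0; c < channels; ++c) {
                auto at = [&](int sx, int sy) {
                    return unsigned(
                        src[(std::size_t(sy) * width + sx) * channels + c]);
                };
                unsigned sum = at(2 * x, 2 * y) + at(2 * x + 1, 2 * y) +
                               at(2 * x, 2 * y + 1) + at(2 * x + 1, 2 * y + 1);
                dst[(std::size_t(y) * w2 + x) * channels + c] =
                    uint8_t((sum + 2) / 4);  // round to nearest
            }
    return dst;
}

// Largest level whose dimensions are still multiples of 4.
inline int maxMipLevel(int width, int height) {
    int level = 0;
    while (width >= 8 && height >= 8 &&
           (width / 2) % 4 == 0 && (height / 2) % 4 == 0) {
        width /= 2;
        height /= 2;
        ++level;
    }
    return level;
}
```

Each level produced this way can then be compressed on its own and uploaded with its level number in place of 0.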

Tips

It can be helpful to temporarily modify the application to render just a single textured object, and modify the vertex shader to compute the screen coordinates from texture coordinates to be able to focus on just the texture and what your compression is doing to it. Further, you can modify the texture coordinate to screen coordinate transform to zoom in on one block, and modify the fragment shader or application code to use point sampling instead of bilinear filtering on the texture while debugging your compression.

Add a function to reload textures and set up a key or keys so you can switch back and forth between options while the program is running. For example, switch between your compression code and the uncompressed texture, or between a couple of compression options.

As a first step toward compressing, try using the average color of the block for both representative colors. The color bit codes won't matter, and you'll be able to debug any byte order and layout issues. Then try that color as the first representative color, with (0,0,0) for the other color, and all selection bits 00. Then you can try different choices for selection bits to see each color from the palette between black and the average color of the block.

As a next step, compute the actual representative colors, and go through a similar set of tests to debug both the color choice and selection bits. This will also help ensure your representative color ordering is correct, since if it is swapped, a selection bit value of 11 always gives you black.
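When debugging the tests above, a small CPU-side decoder can show what palette a packed block should produce. The sketch below follows the usual DXT1 rule: four-color mode when the first 16-bit color is numerically greater than the second, otherwise a midpoint plus black. The 5:6:5 expansion by bit replication and the integer division are approximations, so a GPU may differ by a small amount. The names are hypothetical.

```cpp
#include <array>
#include <cstdint>

struct RGB8 { uint8_t r, g, b; };

// Expand 5:6:5 to 8 bits per channel by replicating the high bits.
inline RGB8 expand565(uint16_t q) {
    unsigned r = (q >> 11) & 31u, g = (q >> 5) & 63u, b = q & 31u;
    return RGB8{uint8_t((r << 3) | (r >> 2)),
                uint8_t((g << 2) | (g >> 4)),
                uint8_t((b << 3) | (b >> 2))};
}

// Weighted average (wa*a + wb*b) / (wa + wb), per channel.
inline RGB8 mix(RGB8 a, RGB8 b, int wa, int wb) {
    int d = wa + wb;
    return RGB8{uint8_t((wa * a.r + wb * b.r) / d),
                uint8_t((wa * a.g + wb * b.g) / d),
                uint8_t((wa * a.b + wb * b.b) / d)};
}

// Palette indexed by the 2-bit DXT code.
inline std::array<RGB8, 4> decodePalette(uint64_t block) {
    uint16_t q0 = uint16_t(block & 0xFFFFu);
    uint16_t q1 = uint16_t((block >> 16) & 0xFFFFu);
    RGB8 c0 = expand565(q0), c1 = expand565(q1);
    std::array<RGB8, 4> pal;
    pal[0] = c0;
    pal[1] = c1;
    if (q0 > q1) {                 // four-color mode
        pal[2] = mix(c0, c1, 2, 1);
        pal[3] = mix(c0, c1, 1, 2);
    } else {                       // three-color mode: code 11 is black
        pal[2] = mix(c0, c1, 1, 1);
        pal[3] = RGB8{0, 0, 0};
    }
    return pal;
}

// 2-bit code stored for texel (x, y) of the block.
inline int texelCode(uint64_t block, int x, int y) {
    return int((block >> (32 + 2 * (4 * y + x))) & 3u);
}
```

Decoding a block with the two colors swapped shows the black entry at code 11, the same symptom described above.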

What's missing

There are several features and details that we skip over in this assignment:

We don't handle partial blocks. For images (or mip levels) with sizes that are not multiples of 4, you pad the block, but ignore the extra padding pixels when choosing your representative colors.
We never use the second mode of the RGB DXT format, which has one bit code for black, and the other three for representative color 0, color 1, or halfway between them. This can produce better results for some blocks with black pixels, since you can ignore those and get a better set of representative colors for the remaining pixels. This is especially useful for texture atlases, where it is not uncommon to have black pixels surrounding each region.
Similarly, we never use the second mode for the RG format, which reserves two bit codes for 0 and 1.
While endpoints of the principal direction are generally a reasonable choice for representative colors, for best results you might want to use this as a starting point for a nonlinear optimization to minimize the block error.
Some games use a variable bit-rate supercompression on top of the fixed bit-rate GPU-renderable textures to further reduce their size for network transmission, for storage on disk, and in some cases for loading to the GPU.

What to turn in

Do your development in the GLapp directory so we can find it. Also include an assn5.txt file at the top level of your repository telling us about your assignment. Tell us what works, and what does not. Also tell us what (if any) help you received from books, web sites, or people other than the instructor and TA. Especially if you do the third extra credit, I expect that you will have to use external sources to discover the correct set of OpenGL calls to use.

Turn in this assignment electronically by pushing your source code to your class git repository by 11:59 PM on the day of the deadline and tagging the commit assn5. See the assn0 project description if you accidentally tag the wrong commit and need to retag.

You must make multiple commits along the way with useful checkin messages. We will be looking at your development process, so a complete and perfectly working assignment submitted in a single checkin one minute before the deadline will NOT get full credit. Individual checkins of complete files one at a time will not count as incremental commits. Do be sure to check in all of your source code, but no build files, log files, generated images, zip files, libraries, or other non-code content.

If you attempt any extra credit, you must describe it in your assn5.txt. Do not expect the TA to scour your commit messages or code to figure out you did an extra credit and grade it for you.

CMSC 435/634: Introduction to Computer Graphics

Assignment 5
Texture
Due April 26, 2026
