Maelstrom for Gameboy Advance - Technical Documentation
Contents
Introduction
What is "Tempest"?
Brief
Implementation
Data structures
Functions
Improvements for the future
Credits
The following contains the design and implementation documentation for the game Maelstrom, based on Tempest. It covers the initial brief and requirements of the game, as well as an explanation of how the C code works. Maelstrom is a work in progress, and will have more features added to it in the following months.
Tempest is a classic arcade game in which the player is presented with a 3D view down the length of a hollow tunnel. The player controls a ship which is attached to the tunnel's inner surface, and must destroy and evade enemies coming toward them up the tunnel. Once a certain score is reached, the player warps through the tunnel to the next, where they face a new design of tunnel and new enemies.
To create an updated version of the classic arcade game "Tempest", but expanding on the original to use true 3D wireframe display to allow the player's view of the tunnel to change as they move around its outside edge. The first phase is to implement the basic capability to move our ship around the tunnel and have our view of the tunnel change in response to this, and the generation and attack of enemies and their destruction. See also section: "Improvements for the future".
In order to define more accurately the project requirements, the following decisions were made regarding the functionality of the game.
Only the interior of the tunnel will be viewed during normal gameplay. This is to enable significant simplifications to be made regarding the depth sorting of objects in the scene. By making this restriction, we can guarantee our tunnel will not occlude other objects in the scene (including ourselves and our enemies), and therefore can always be drawn first.
Although lines are clipped in 2D screen space, within the limitations of the game it was not believed that clipping lines in 3D space was necessary, since the tunnel would never be close enough or angled enough to allow any polygon to intersect our viewplane.
Objects were restricted to appearing only on the surface of the tunnel. Doing so allows them to be positioned using only a reference to the tunnel polygon on which they reside, and their distance along that polygon.
The following information needs to be maintained by the game:
These contain the 2D co-ordinates of each of the points on the cross section of the tunnel. The game extrudes these into a three dimensional model of the tunnel when each tunnel is needed. The numpoints part of the Tunnel structure indicates how many points there are in the tunnel cross section, and therefore how many elements in crossec contain useful data.
These contain the three dimensional points of the tunnel generated from the 2D cross section above. This information only needs to be stored for the tunnel currently in use.
This array holds the indices of pairs of vertices which form the lines that make up the tunnel.
Contains the 3D coordinates of the tunnels after they have been rotated to form our view of the tunnel.
Contains the two dimensional coordinates of the points which will actually be drawn on the screen. These are derived after carrying out the perspective divide (which moves vertices towards the vanishing point by dividing by their depths, or Z ordinates), then dropping the now unnecessary Z ordinate. The tunnel can now be completely drawn as a 3D image projected on a 2D screen by drawing lines between each pair of eyevertex points as they are listed in the tline array.
These structures hold the geometric data for objects that appear on the tunnel's surface. Individual instances of objects will reference one of these models.
This structure holds data relating to an individual instance of object on the tunnel, including ourselves. The Valid element states whether this object is actually a valid instance, or if it has been deleted. Polygon contains the tunnel polygon this instance currently resides on. Desired position states which polygon we are attempting to get to. Depth is the distance down the tunnel, while speed is the rate at which we move along it. Model refers to which of the models in the structure above we actually draw when we render this object, while the Viewvertex and Eyevertex elements contain the data pertaining to view and eye transformations. This is because although the model used will not change, the conditions under which it is viewed differ for each instance, and hence it is necessary to store the information here on a per-instance basis, rather than in the model structure.
This array holds the angle of rotation and x and y translations needed to place an object on any of the tunnel's surfaces. These are pre-calculated at the start of every level.
Our view of the tunnel changes as our ship's position on the surface of the tunnel changes. This array is pre-calculated at the start of every level, and dictates the angles by which to rotate and shift the tunnel for each of our possible positions on its surface.
These tables provide fixed-point lookup values for sin, cos and inverse tan, removing the need for the inclusion of a maths library. The values within are integer due to the Gameboy Advance's extremely poor floating point performance. The scaling factor applied to these arrays is divided out at the end of the calculation in which they are used. Each uses a factor which is a power of two to allow divides to be performed by bit-shifting, significantly increasing speed.
This function takes the supplied Tunnel cross section and extrudes it into the required 3D points and lines. Generating the points is simply a matter of using the cross section to generate a further two sets of points, which will define the near and far ends of the tunnel, by copying the cross sectional values into an array twice with two different sets of depths (Z ordinates) each time. Lines are defined between two adjacent points on the tunnel's near side, then from the rightmost nearside line to the matching point on the far edge, then finally from this point back to the left to the other matched point. This creates three lines for each of the faces of the tunnel. This prevents each of the interior lines being drawn twice: the left hand line of each face is not drawn, and instead gets filled in by the right interior line of the polygon preceeding it. It is possible to make this optimisation only because we know that we will always draw all the faces of the tunnel. Because of this, we also don't need to group the lines up into polygons, or have a normal for each face; the lines are simply drawn as lines, without considering visibility.
Takes the vertices of the tunnel in world space (thistunnel.tvertex[][3]) and rotates them by the appropriate Yaw and Pitch values (These values are generated by the function Precalcview(Tunnel))
The calculations performed to generate the viewvertices are simply multiplied out rotation matrices as found in most computer graphics books, such as 3D Computer Graphics by Alan Watt. The bitshifts at the end of the statements remove the shifting done on the trigonometry lookup tables.
After this function, we now have the rotated tunnel in 3D space.
Translates the tunnel away from the viewpoint
This divides the X and Y ordinates of our Viewvertices to give Eyevertices with perspective. X and Y are divided proportionally to Z; the bitshift right by 6 reduces the denominator, Z, by a factor of 64. This number is merely based on observations and produces a good perspective effect.
After this function, we have 2D coordinates for our tunnel in perspective.
This function takes the Eyevertices generated above and draws them to the screen, carrying out the required clipping. The clipping routine uses the following algorithm:
Take X of start point
If >=XPIXELS
{
Take X of end point
If<XPIXELS //If this false, both off screen in same direction, so don't draw
{
Calculate t, fixed point fraction along the line at which X==XPIXELS
clipped startpoint Y=startpoint Y + t*(endpoint Y - startpoint Y)
clipped startpoint X=XPIXELS
}
If<0
{
//Similar method for clipping to X==0 boundary as we used for X==XPIXELS
}
//Now compare clipped start point generated above with Y boundary. May be that no clipping
occurred, in which case Clipped startpoint Y = startpoint Y
If clipped startpoint Y >=YPIXELS
{
//Similar as for XPIXELS
}
If clipped startpoint Y <0
{
//Same again
}
// We've now clipped the Startpoint against all four boundaries
If clipped startpoint inside screen:
THEN
Proceed as above with clipping of second point, and Drawline between resulting points
ELSE
Since the point could not be clipped to within the screen, the whole line lies outside
it. There is no point in trying to clip the second point, so we exit. No line is drawn
The functions which draw the objects on the tunnel walls work in a similar way, but operate on the ship structures instead.
This function creates a lookup table of rotations in the Z axis and translations in the X and Y axis necessary to place an object on any Polygon of the tunnel. It is passed a Tunnel type variable (a cross section).
In order to be able to calculate the angle of each of the tunnel Polygons, we use a lookup table for inverse Tan, which allows us to lookup by gradient and find the appropriate angle. Firstly, we look for special cases in which our line is exactly 0, 90, 180 or 270 degrees. If it is none of these angles, we find which quadrant the angle lies in, and assign an appropriate value to Quadrantadd. This will be added to the angle within the quadrant we calculate later.
Because the gradient of the line can vary anywhere between 0 and infinity, we have further split each quadrant in half. We will determine which of these halves our line lies in, and act accordingly, giving us a range of gradients between 0 and 1, suitable for the inverse tan lookup table. The lookup table is left shifted 6 times, giving us 64 steps to represent the range 0-1. This is more than enough to give one-degree accuracy (0-45 degrees).
Depending on which half of a quadrant a line lies in, our angle is either:
itanlook[m]+quadrantadd
Or:
90+quadrantadd-itanlook[m]
(Where m is the gradient shifted left by 6)
In the special case where: |dx|=|dy|, rollangle=45+quadrantadd.
This function is carried out only once at the start of each level and calculates the values for the "keyframe" polygons only. That is, actual polygons that make up the surface of the tunnel, and not the interpolated "polygons" which are a way of representing the intermediate positions of objects as they move between keyframe polygons. Keyframe, or "real" polygons are those that are multiples of three, as we currently have two interpolated positions between polygons. See next function.
Takes keyframe polygons calculated by Precalcspatial, above, and linearly interpolates to create in between positions. These allow objects moving between keyframe polygons to do so without jumping there instantly.
This determines the rotation and 2D offset of the entire scene, tunnel and objects, for each of our ship's positions on the tunnel wall. Essentially, this function determines how our viewpoint changes as our ship's position changes.
Firstly, we find the largest absolute value in the Spatialtransform lookup table for both X and Y. This gives us the extents of our tunnel. After that, we loop through the Spatialtransform array and find what fraction each of these translations is of the maximum. This fraction is then multiplied by MAXPITCH or MAXYAW values to give the view rotation in both axis for each point on the tunnel. Therefore, we have MAX rotation when our ship is occupying polygons in one of the extremes. MAXPITCH and MAXYAW values are based on observation and are designed to give the biggest range of views of the tunnel while not seriously obscuring the view of any part of it.
We also calculate xoffset and yoffset values in the same way. These are used to translate the whole scene depending on how far we are zoomed in. This prevents our ship from ever being off the screen, even when zoomed in all the way. It is the Mainloop function which determines the fraction of the translation necessary, given the current and min and max zoom values, such that no translation occurs when zoomed out to maximum (MAXZOOM), and the full translation calculated in this function occurs when zoomed in all the way (MINZOOM). The calculated offsets are applied in 2D in the function Perstrantunnel.
This function is responsible for the management of ships, both the creation of new ships and the expiry of old ships as they complete their journey through the tunnel.
Firstly, the function deals with expiring old enemies. These are enemies that have reached the near end of the tunnel (Enemies that are destroyed by being fired upon are removed in another function). The function will pass through the Ship array and check the depth of each of those that are Valid. That is, those which actually contain valid data rather than an expired ship. When they have reached the end, its Valid value is set to 0, flagging it as an unused slot and preventing it from being considered for drawing or other actions.
Notice at this point that our Ship array contains all the objects which lie on the surface of the tunnel, whether it is our ship, enemies, or the shots we fire. This is to simplify the drawing of these objects, as the Drawenemies routinecan simply run through the whole array, rendering all the objects it finds. Element 0 always contains our ship, subsequent elements up to MAXSHIPS contain enemies, while the final MAXSHOTS elements contain shots fired. This function, therefore, operates on elements 1 through to MAXSHIPS.
Secondly, the function will randomly create enemies using a simple pseudo-random number generator (This function from the HAM forum at: http://www.ngine.de). Carrying out a modulus on the number limits its value to 0 -> modnumber-1. By only creating an enemy when one specific value occurs, we can increase the number to decrease the probability of this occurring, allowing us to create a harder or easier game.
When the function has randomly decided if an enemy is to be created, it will search for an unused slot in the Ship array. If it finds one, it will setup all appropriate values, such as speed and model reference, and randomly decide on the starting polygon for the enemy. Notice there is the potential in this function to randomly decide the model used, speed etc., rather than setting constant values, which is not currently used.
This function moves Valid enemies towards us along the tunnel, and also makes a random decision on whether they should move to a new polygon or not, and in which direction. If so, their Desiredposition value is changed to reflect their intended destination. This Desiredposition is acted on in Mainloop, which will change the actual position, Polygon, every frame until it equals Desiredposition. The Lastpress variable, also set in Moveenemies, dictates in which direction Mainloop will move the model in order to reach its Desiredposition.
This function removes shots that have reached the far end of the tunnel, and creates new ones when the player presses A on the Gameboy.
The player can only fire if they occupy a keyframe polygon; they cannot fire on interpolated polygon positions. They can also only fire three shots at once, and can't fire if their last shot is less than a certain distance away. This prevents the three shots being fired on consecutive frames in the brief period the user holds down the button. The function keeps track of the last shot fired for comparison. Each time they try and fire, the last shot is checked for validity and distance. A new shot is created only if invalid, or far enough away. There also has to be at least one free slot in the structure.
This function moves shots down the tunnel and also checks for collisions with enemies.
Each Valid shot is compared against each Valid enemy in turn. Firstly, we check if the Polygon value is the same between the shot and the enemy, before moving on to compare the depths. A shot has hit an enemy when it is behind it by a distance not more than the sum of their speeds. So, if each was travelling at a speed of 20, we have hit that enemy if our shot is on the same polygon and less than 40 units behind it. It is necessary to test for the distance, not just the fact the shot is behind, as it is possible a ship can change Polygon to that which the shot occupies, and become some distance behind the shot. In this case, the shot has missed.
Furthermore, it is necessary to check for all enemies that we have hit, not just stop when we encounter the first, as our shot may have passed two or more enemies, but we may have only discovered the one furthest away (because it could occur first in the Ship array). Therefore, we look for the distance of each enemy we collide with, and record the array index of the closest. Once all comparisons are complete, we invalidate the shot and closest enemy we have found, if we found any collisions at all.
This function is taken from the gbadev forum (http://forum.gbadev.org), and is an implementation of the Bresenham line drawing algorithm, adapted for Mode 4. Although not fully optimised, it is about twice as fast as HAM's line drawing algorithm, which calls another function to plot each pixel, and in turn checks the current display mode every call. This function eliminates those overheads by using an inlined pixel plotter specifically for mode 4.
Because we are writing to memory directly, rather than using HAM's own drawing commands, we need to do the switching between which buffer we draw to that HAM normally takes care of. The function Changebuffer finds which buffer we are currently using and sets the Videobuffer variable appropriately, so that BLIne will always draw to the current backbuffer.
Aside: The function could be further sped up by using the Symmetric Double Step Line Algorithm (Wu and Rokne. See "Graphics Gems" by Andrew S. Glassner), in which we draw two pixels at each end for every iteration of the For loop, thereby reducing to a quarter the number of cycles needed to draw a given line. Furthermore, we could take advantage of the fact that in Mode 4 we write two pixels at once to speed up lines where the gradient is between 0.5 and -0.5 (only when this is the case do pixels appear next to each other on the same horizontal line. It can't happen for lines closer to the vertical than the horizontal). For a fuller discussion of this, see here: http://forum.gbadev.org.
Greater speed still could be achieved by rendering into an intermediate array of u8 values before copying this into the backbuffer in the required double pixel format. Only values other than the clear colour need to be written to the screen, and given we draw a relatively small fraction of the pixels on the screen, this proceedure should be very fast. It also allows us to write double pixels regardless of what line they are part of or what colour they are, but does require a large amount of memory to be allocated (37.5 K).
Finally, compiling the function as ARM rather than THUMB would significantly speed up the code at the cost of memory.
This function is responsible for calling the previously documented functions in order to correctly transform and render the scene to the screen. It is called from function Main when the last drawn frame has been swapped to the front buffer by VBlfunc.
It is also responsible for moving all Valid Ships one step closer to their Desiredposition every frame, and calculates the Gamma and Theta rotation variables by which to transform the whole scene, depending on our ship's current position. The X and Yoffset values are also calculated dependent on the current zoom value (Dolly), and these are acted on in function Perstrantunnel in order to keep our ship in view.
When all actions are complete, Frameready is set to 1. This tells the interrupt function Vblfunc (below) to render the frame.
This function is called every time there is a VBL interrupt. It will only perform any action when the current frame has been flagged as being ready by Mainloop, otherwise it will exit immediately. This method was chosen above turning the interrupt off during rendering a frame to allow for the future addition of tasks that need to be carried out every VBL interrupt, such as sound handling. The function will additionally only execute if it has already been called a certain number of times since the last frame was drawn. This limits the framerate to prevent the game speeding up when there are few objects in the scene (Even if the speed of the objects were altered depending on the update speed, frequent changes in framerate would still be visually annoying).
The function will then look for keypresses. Left and right on the D-pad cause an appropriate change in our ship's Desiredposition, and also sets the Lastpress value which will allow the Mainloop function to know in which direction the ship should be moved to achieve its Desiredposition. The L and R shoulder buttons will change the current zoom (Dolly) value. The A button press is handled in the separate function, Expireandcreateshots.
Once button presses have been processed, the buffers will be flipped to show the newly rendered frame and the backbuffer cleared ready for the next one. Finally, Frameready is set to 0, which will cause Main to call Mainloop and render the next frame.
This game is a work in progress, and there are many areas for improvement. These are:
- Start screen and level introduction. Currently, only the main game is implemented, with no introduction or buildup. Therefore, a simple intro screen is needed that displays the title of the game and "Press start". The start of each level could also be made more interesting by spinning the tunnel in from the distance until it reaches its final "play" position, then fading in our ship before the enemies start appearing. The palette based graphics of Mode 4 make fading simply a matter of changing the appropriate pallette value each frame.
- Scoring, destruction and multiple levels. Currently, we have no way of being destroyed, scoring, or moving to a new level. Being destroyed would require the comparison of our position and all enemy positions every frame to determine when they have struck us, and a short graphical sequence to show our destruction. Scoring would require the implementation of a wireframe text algorithm to display the score (the built in HAM text functions are slow and don't work in bitmap modes). The code is already designed with multiple levels in mind, as the Extrudetunnel function can be passed any cross section for extrusion, and this tunnel forms the basis for the calculation of Spatial and Viewtransforms etc. Tunnel cross sections don't even have to contain the same number of points. All that would be needed in addition to the design of additional tunnels would be a criteria for qualification to the next level, possibly based on the number of enemies generated.
- Elimination of the HAM library. While the HAM environment and functions are useful for getting started writing Gamboy Advance games, it mostly deals with sprite handling, and due to speed issues, much of its bitmap mode capabilities have been bypassed in favour of our own. Ultimately, the HAM library may be replaced entirely. The main functions that need replacing are initialisation functions, such as setting up the screen mode and interrupts, and a buffer swap function which will also change the videobuffer pointer and remove the need to do this in a separate function. Many of the defines and macros in its header file are useful and will probably be kept.
- Random number generation. The seed of the random number generator never changes, and therefore the same sequence of numbers occurs every time. Hence, the enemies always occur in the same pattern and make the same movements, which can be useful for testing but is unlikely to be desireable in the finished game. A simple seed generation tactic could be simply to measure how long it takes the player to press the start button on the title screen, or how long they hold it for, etc.
- Optimisation of lookup tables. There is no need for both Sin and Cos lookup tables, as these two tables are merely 90 degrees out of phase with each other. However, using two, as here, avoids confusion and aids readability.
- Back face culling. To allow a more realistic scene to be rendered by hiding faces on the far side of objects. Currently all objects are only defined in terms of vertices and lines. In order to perform this, faces could be defined which refer to a group of lines (a line may belong to more than one face) and a normal vector. When the normal vector is pointing towards us, as determined by the dot product rule, these lines are flagged as visible and are drawn as we currently do. This method has the disadvantage that it is difficult to automatically calculate, using the cross product rule, the normal vector. Therefore, models could be restructured to points and polygons, in which case a polygon would list its vertices in a certain direction, say anti-clockwise, to the direction they will be viewed from, to ensure normal generation will always create vectors in the proper orientation. Certain lines could be flagged as being internal. For example, a square would be made up of two triangles, each of which has one internal line which doesn't need to be drawn.
With this approach, many of the model's lines will be drawn twice. To avoid this, a table of vertex pairs between which lines have already been drawn could be maintained.
- Colourising and depth sorting. In order for objects in the scene to be drawn in different colours, we would need to draw the objects in order from most distant to closest. Many simple algorithms are available for sorting on a small set of objects such as this.
- Flipping the ship to move it between polygons. Currently the ship just slides between polygons, but the visual effect of the movement could be considerably improved by having the ship "flip" over to the next polygon. i.e. One corner stays fixed to the surface while the rest of the ship rotates to position the ship on the next polygon.
- Compensate for changes in framerate. Currently, changes in the game's framerate cause the objects' speed to change. This is partially prevented by limiting the framerate, but we could still potentially get slowdown. The game could prevent this by changing the amount by which it moves each object per frame when it detects frames are taking longer to render.
3D Computer Graphics by Alan Watt.
Graphics Gems by Andrew S. Glassner
Emanuel Schleussinger for writing HAM
Thanks to all the great people on the HAM forum at: http://www.ngine.de
and also the GbaDev forum at: http://forum.gbadev.org
Dave Theurer for writing Tempest in the first place!