A Beginner's Guide to Hacking Video Game Save States (Fire Emblem 7 on the GBA)


As a fun change of pace, I decided to write up a beginner's guide to hacking save states for video games. In this tutorial, we're going to take a look at Fire Emblem 7 for the Game Boy Advance (GBA). Specifically, we're going to hack the health bar of the final boss (spoilers!) to be 1 HP. In doing so, I'll cover the simplest practical reverse engineering technique for game hacking: differential analysis. I'll be using an emulator for Android called John GBA, so we'll also get into a few details about how it works.

Conceptually, this guide is very similar to the start of the tutorial provided by Cheat Engine. The main difference with this writeup is we're going to hack an actual video game by modifying an emulator's save state data rather than using a debugger on a running toybox program. Also, I'm going to perform all the steps using only standard GNU/Linux programs and a little bit of Python scripting. Thus, this guide is intended for readers familiar with basic programming and terminal CLI who may not have considered hacking a video game before and want a simple realistic example to get started.

Enjoy.

A Primer on Emulators & Save States

I'll start with a brief summary of what an emulator is. Put simply, it reads a program written for one computer architecture and executes it on another. In our case, the emulator I'm using can run games written for the GBA on an Android device. The most basic way of accomplishing this is using interpretation. Specifically, the emulator mimics the original architecture by allocating memory to represent RAM, CPU registers, etc. It then loads the program into the memory just as the emulated system would and steps through each instruction, modifying the memory accordingly. The collective data values contained in the emulated hardware at any given point is the emulation state.

Interpretation is the easiest form of emulation to implement, but slow because each hardware instruction on the original system is translated into one or more (often many) instructions on the emulator's hardware. To get around this, real emulators often compile and optimize the emulation logic "on-the-fly" using a technique called just-in-time (JIT) compilation. Optimizing emulators is an interesting topic for discussion, but beyond the scope of this guide.

What we do need to understand is save states. By pausing the emulation and storing all the data associated with the emulation state, the emulator can return to that state at any later time. All it has to do is read the saved state back into the memory representing the emulated hardware. This gives us a clean way to hack video games by intentionally corrupting save states. All we have to do is modify some data in the saved state and when the emulator loads it, it'll propagate our modifications into the emulated system:

Figure 1

But in order to do that, we first have to figure out the formatting of the emulator's saved states.

Reverse Engineering John GBA's Save State Format

Lucky for us, this emulator's save state formatting is fairly standard and we don't need to understand most of the details for our basic memory hack. First, we know the states have to be saved somewhere in storage. Turns out it's easy to find by just browsing: <internal_shared_storage>/Johnemulators/GBA/state. In this directory we find two file formats: .jg# and .js# where # is the slot number for the save (organizing saves as a linear list of slots is typical for video game emulators). It's easy to see which files we care about for Fire Emblem 7:

$ ls -l
total 301
-rw------- 1 carter carter 113059 Jun 22 02:31 Fire Emblem (U).jg0
-rw------- 1 carter carter 108703 Jun 22 04:24 Fire Emblem (U).jg1
-rw------- 1 carter carter  18502 Jun 22 02:31 Fire Emblem (U).js0
-rw------- 1 carter carter  67051 Jun 22 04:24 Fire Emblem (U).js1

A good first step towards figuring out their formats is to run file and see if there are any recognizable patterns:

$ file -i *
Fire Emblem (U).jg0: application/gzip; charset=binary
Fire Emblem (U).jg1: application/gzip; charset=binary
Fire Emblem (U).js0: image/jpeg; charset=binary
Fire Emblem (U).js1: image/jpeg; charset=binary

Much to our luck, despite the custom file extensions, each save state is actually just a JPEG image and a compressed data file. We don't need to worry about the images, so let's focus on the data files. First, copy them off the Android device into a local directory. Also, make sure to save a backup copy somewhere safe so you won't lose your data if you make a mistake! Next, let's decompress one of them:

# gzip expects compressed files to have the .gz extension,
# so we have to rename it first
mv Fire\ Emblem\ \(U\).jg1 1.gz
gzip -d 1.gz
# after decompressing, gzip will automatically strip the extension
# from the filename

And take a look at the beginning of the data:

$ hexdump -C 1 | head
00000000  0a 00 00 00 46 49 52 45  45 4d 42 4c 45 4d 45 00  |....FIREEMBLEME.|
00000010  41 45 37 45 00 00 00 00  00 00 00 00 01 00 00 00  |AE7E............|
00000020  00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 d8 7d 00 03  00 00 00 00 00 00 00 00  |.....}..........|
00000040  00 00 00 00 1f 00 00 00  00 00 00 04 a0 7f 00 03  |................|
00000050  14 02 00 00 1c 00 00 00  12 00 00 60 1f 00 00 60  |...........`...`|
00000060  a0 7f 00 03 14 02 00 00  1f 00 00 60 00 00 00 00  |...........`....|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  c0 7d 00 03 0c 02 00 00  d0 7f 00 03 70 fa 0b 08  |.}..........p...|
00000090  3f 00 00 60 00 00 00 00  00 00 00 00 00 00 00 00  |?..`............|

Another lucky break for us. We can see the canonical name for the game (FIREEMBLEM) at the beginning of the data. In fact, if we search the game's ROM (the program as it's stored when it isn't running), we'll see the same name near it's beginning too:

$ hexdump -C Fire\ Emblem\ \(U\).gba | grep -A 10 "FIREEMBLEM"
000000a0  46 49 52 45 45 4d 42 4c  45 4d 45 00 41 45 37 45  |FIREEMBLEME.AE7E|
000000b0  30 31 96 00 00 00 00 00  00 00 00 00 00 d1 00 00  |01..............|
000000c0  12 00 a0 e3 00 f0 29 e1  28 d0 9f e5 1f 00 a0 e3  |......).(.......|
000000d0  00 f0 29 e1 18 d0 9f e5  3c 11 9f e5 18 00 8f e2  |..).....<.......|
000000e0  00 00 81 e5 34 11 9f e5  0f e0 a0 e1 11 ff 2f e1  |....4........./.|
000000f0  f2 ff ff ea 00 7e 00 03  a0 7f 00 03 01 33 a0 e3  |.....~.......3..|
00000100  02 3c 83 e2 00 20 93 e5  02 18 a0 e1 21 18 a0 e1  |.<... ......!...|
00000110  00 00 4f e1 0b 40 2d e9  22 18 02 e0 02 0a 11 e2  |..O..@-.".......|
00000120  fe ff ff 1a 00 20 a0 e3  01 00 11 e2 26 00 00 1a  |..... ......&...|
00000130  04 20 82 e2 02 00 11 e2  23 00 00 1a 04 20 82 e2  |. ......#.... ..|
00000140  04 00 11 e2 20 00 00 1a  04 20 82 e2 08 00 11 e2  |.... .... ......|

We now know that after decompressing, the data file contains the plain bytes of the game's memory. Next, we have to figure out which bytes to overwrite in this 723 KB file.

Differential Analysis

The simplest reverse engineering technique for finding a data variable of interest is differential analysis. Remember that our goal is to modify the health of the final boss for an easy victory. It's reasonable to assume that somewhere in the program's memory, there's an integer that keeps track of of the current health. Also, since the game is actively running, this variable shouldn't be relocated in memory too often. The question is: how do we find it?

We'll use differential analysis. Specifically, we'll save the game's state, inflict some damage on the boss, save the new state, inflict more damage, save and so forth. Since this game shows the HP value and the damage of our attacks, we should know the exact health of the boss at each state we save. All we need to do then is compare the memory of the states and find a location that contains the correct health value across all the states.

Creating Save States

So let's begin. It's time to confront the big bad dragon at the end of Fire Emblem 7:

Figure 2

This overgrown lizard is quite the challenge if you approach him under leveled and without enough good equipment, but rather than working harder by reloading a past save and redoing half the chapters, we're going to work smarter and show the boss who the real master of this game universe is!

At our first save state, he's at full health. Unfortunately, this game is cheeky and hides his HP at the start of the fight:

Figure 3

This won't be a problem though because the game shows exactly how much damage we inflict:

Figure 4

In this case, Eliwood's sword will inflict 21 damage. He's seriously under powered, but can still survive the attack:

Figure 5

At this point, we make our second save state and note that we've inflicted 21 damage so far. As Athos prepares to cast magic on the foul beast, we can now see the HP of the boss:

Figure 6

He had 99 HP when the second save was made, meaning he started with 120. Now Athos inflicts another 20 damage:

Figure 7

And we make a third save state. The HP of the boss at this save is 79:

Figure 8

Three states should be enough for our analysis, so it's time to transfer the data over to a computer and get hacking.

The Analysis

As I mentioned earlier, this is a differential analysis. We're looking for a memory location storing a variable that's 120 in the first state, 99 in the second and 79 in the third. In base 16, this is 0x78, 0x63 and 0x4f. Luckily, all these values are small enough to be encoded as a single byte, so we can scan the game's memory linearly without having to consider factors like alignment or endianness. So let's get to work.

To speed up this analysis, I wrote a script in Python. This code is compatible with Python 2 and 3 and the comments are self-explanatory:

:::python
#!/usr/bin/env python
from __future__ import print_function
import gzip
from struct import unpack
import sys

def decode_gzip(filepath):
    # Python 2 and 3 decode byte I/O differently, this method ensures consistency
    with gzip.open(filepath, 'rb') as fd:
        if sys.version_info.major <= 2:
            return [unpack('B', byte)[0] for byte in fd.read()]
        else:
            return fd.read()

# the names for my save state data files
files = ['Fire Emblem (U).jg1',
         'Fire Emblem (U).jg2',
         'Fire Emblem (U).jg3']

# the expected HP value in each save state
hps = [120, 99, 79]

# decompress the save states
data = [decode_gzip(file) for file in files]

# for each save state, find all the offsets that contain the expected HP value
# this is not the most efficient way of doing intersection search, but it's easier
# to read and efficient enough for our amount of data
match_sets = list()
for state, hp in zip(data, hps):
    matches = [offset for offset, byte in enumerate(state) if byte == hp]
    match_sets.append(set(matches))

# we want the intersecting offsets across save states
intersection = match_sets[0].intersection(*match_sets[1:])

# sort and print the results
offsets = [hex(offset) for offset in sorted(list(intersection))]
print(' '.join(offsets))

Running the search on my states yields three candidate locations:

$ python search.py
0x29222, 0x354b2, 0x91df3

It's common to get more than one candidate because depending on the game's logic, values like health might be maintained in multiple locations for different purposes (e.g. for rendering it as text on the screen, detecting certain state triggers, etc.). As long as the number of results is few, it's usually safe to just overwrite all of them. If the game crashes, we can always go back and make modifications more conservatively.

Modifying a Save State

Now that we know where to make our modifications, let's set the health of the boss to 1 HP! Here's another Python script to do just that:

:::python
#!/usr/bin/env python
from __future__ import print_function
import gzip
from struct import unpack
import sys

def decode_gzip(filepath):
    # Python 2 and 3 decode byte I/O differently, this method ensures consistency
    with gzip.open(filepath, 'rb') as fd:
        if sys.version_info.major <= 2:
            return bytearray([unpack('B', byte)[0] for byte in fd.read()])
        else:
            return bytearray(fd.read())

if len(sys.argv) < 4:
    print("Usage:", sys.argv[0], "<state_file.jq> <patch_value> [<offsets> ...]")
    sys.exit(1)

# parse the command line arguments
state_file = sys.argv[1]
patch_value = int(sys.argv[2], 16)
offsets = [int(value, 16) for value in sys.argv[3:]]

# read in the save state data
data = decode_gzip(state_file)

# overwrite the designated bytes
for offset in offsets:
    data[offset] = patch_value

# write the new save state
with gzip.open(state_file + '.patched', 'wb') as fd:
    fd.write(bytes(data))

Now we just apply the patch:

$ python patch.py Fire\ Emblem\ \(U\).jg1 0x01 0x29222 0x354b2 0x91df3

Which will create a new save state with the filename of the original plus the suffix .patched. All that's left is to replace the original save state in John GBA with our patched version and then we're ready to slay a dragon with ease.

The Result

Not so tough now, are we dragon?

Figure 9

And down it goes in one hit. How embarrassing:

Figure 10

Figure 11

Conclusion

So that's the most basic approach to hacking a video game save state. I hope you've found this guide entertaining and informative. I'd like to reemphasize that this only scratches the surface of the broad area that is reverse engineering. The beauty of differential analysis is we didn't have to know any low-level details about how the Game Boy Advance hardware behaves or the game's logic. This makes differential analysis one of the most transferable techniques. That said, it also has limited applicability. For example, if the game developers wanted to prevent tampering, even the most basic obfuscation (e.g. doubling every integer before storing it in memory) would thwart our analysis. There are also fun hacks that cannot be implemented via a one-time memory patch (e.g. giving your character invincibility). So if you find this topic interesting, I recommend checking out other tutorials floating around on the internet.

Thanks for reading and happy hacking.