In the last article of this series, I explained how new instructions can be tested. The approach shown there is very basic and requires manual work, so it does not scale to a larger number of tests. During my Master’s thesis, I needed to test all the implemented translation functions for the AVR32 architecture. Hence, I wanted a semi-automated approach that did not require additional changes to the emulator or complex external solutions. I came up with a testing framework that automates test execution and evaluation. In this article, I will explain how tests for QEMU translation functions can be written.

Why testing is necessary

I already gave a small explanation of why testing should be done in my last article. There is also a section in my thesis on that matter.

For a short recap, let’s consider the following translation function:

static bool trans_ADD_f2(DisasContext *ctx, arg_ADD_f2 *a) {
    //...
    tcg_gen_add_i32(cpu_r[a->rd], cpu_r[a->rx], cpu_r[a->rx]);
    //...
}

Do you notice the error?

If not, look at the third argument. It should be cpu_r[a->ry], with a ‘y’ instead of an ‘x’.

The definition of the second format of the add instruction says that the second operand (referred to as ry) should be added to the first operand (rx). The result should be placed in the destination register (called rd).

The code above is valid, syntactically correct C. The error is purely semantic. Therefore, it will only be noticed if an emulated program does not behave as intended or performs illogical operations. Often, this happens deep inside the execution of the emulated program. Debugging such issues is very time-consuming.

The reason for this is that the code above is not directly “executed” when a program is emulated. Instead, it generates QEMU’s intermediate representation, which is later translated into x86 instructions that are then executed. This additional abstraction makes it hard to identify the source of any illogical behavior in an emulated program. Hence, the translation functions should be tested individually, just as you would write unit tests for ordinary functions.
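To illustrate why such a bug can stay hidden for a long time, here is a small Python model of the two variants. It is only a sketch: the names add_correct and add_buggy are made up for this example, the shift amount of the second format is ignored, and none of this is QEMU or framework code.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def add_correct(regs, rd, rx, ry):
    """Intended behaviour: rd = rx + ry, with 32-bit wrap-around."""
    regs = list(regs)
    regs[rd] = (regs[rx] + regs[ry]) & MASK32
    return regs


def add_buggy(regs, rd, rx, ry):
    """Faulty translation: ry is never read, rx is used twice."""
    regs = list(regs)
    regs[rd] = (regs[rx] + regs[rx]) & MASK32
    return regs


# If both source registers hold the same value, the bug is invisible ...
same = [0, 3, 3]
hidden = add_correct(same, 0, 1, 2) == add_buggy(same, 0, 1, 2)

# ... while distinct values expose it immediately.
distinct = [0, 3, 4]
exposed = add_correct(distinct, 0, 1, 2) != add_buggy(distinct, 0, 1, 2)
```

The takeaway for test design is that the source registers of a test should hold different values. Otherwise a swapped or duplicated operand cannot be detected.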

Semi-automatic testing framework

For the rest of the article, I will assume that you use my testing framework. The framework will handle test generation and execution. We will focus on defining test cases.

To use the framework, you need to download an AVR32 assembler from the Atmel website. Place the avr32-as file in the framework directory, and everything should work automatically.

Adding the first test

Inside the framework directory, you should already see a tests folder. It already contains a few tests, but for the purpose of this article, I will pretend it is empty. So, let’s work on tests for the ADD instruction.

The add-instruction is defined as (see AVR32 Architecture Document for details):

Operation:
I.  Rd ← Rd + Rs;
II. Rd ← Rx + (Ry << sa2);

Additionally, the instruction is intended to change the status register of the CPU:

Format I: OP1 = Rd, OP2 = Rs
Format II: OP1 = Rx, OP2 = Ry << sa2
Q: Not affected
V: V ← (OP1[31] ∧ OP2[31] ∧ ¬RES[31]) ∨ (¬OP1[31] ∧ ¬OP2[31] ∧ RES[31])
N: N ← RES[31]
Z: Z ← (RES[31:0] == 0)
C: C ← OP1[31] ∧ OP2[31] ∨ OP1[31] ∧ ¬RES[31] ∨ OP2[31] ∧ ¬RES[31]
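When deriving expected values, it can help to have the flag definitions in executable form. The following is a minimal Python sketch of the conditions listed above. The helper names bit31 and add_flags are my own for this example and are not part of QEMU or the testing framework.

```python
MASK32 = 0xFFFFFFFF


def bit31(value):
    """Return bit 31 (the sign bit) of a 32-bit value as 0 or 1."""
    return (value >> 31) & 1


def add_flags(op1, op2):
    """Return (RES, flags) for a 32-bit addition, following the rules above."""
    res = (op1 + op2) & MASK32
    a, b, r = bit31(op1), bit31(op2), bit31(res)
    not_a, not_b, not_r = a ^ 1, b ^ 1, r ^ 1
    flags = {
        "V": (a & b & not_r) | (not_a & not_b & r),
        "N": r,
        "Z": int(res == 0),
        "C": (a & b) | (a & not_r) | (b & not_r),
    }
    return res, flags


# Example: 0x7FFFFFFF + 1 overflows into the sign bit.
res, flags = add_flags(0x7FFFFFFF, 0x1)
```

In this example, res is 0x80000000, so N and V are 1 while Z and C are 0. The values computed this way can be copied into the expected results of a test case.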

Testing the actual operation is an easy task, at least in this case. Testing the status register changes requires a bit more work. We have to define a test case for every possible outcome of the logical conditions above. This is not much different from regular unit tests: we have to look at number ranges, negative values, edge cases, and so on. I certainly did not define a test for every possible case, but the tests should catch most issues.

First, create a new file ADD_f1_01.py inside the tests folder. We will start with a simple case and test the Z-flag of the status register. This flag should be 1 if the result of an add instruction is 0, and it should be 0 if the result contains at least one bit with a value of 1.

Inside the test file, add the following lines:

TEST = """
    # Z-flag: 1
    add r0, r1     
"""

EXPECTED_RESULTS = {
    "r0": 0,
    "r1": 0,
    "sregZ": 1,
    "sregV": 0,
}

As all general-purpose registers should be 0 when the emulator starts, adding r0 and r1 should produce a result of 0. Therefore, the Z-flag should be set to 1.
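The framework performs the comparison between expected and actual values itself. Purely to illustrate the idea, the following Python sketch shows how such a check could look. The function find_mismatches and the actual_state dictionary are hypothetical and are not taken from the framework's source.

```python
def find_mismatches(expected, actual):
    """Return {name: (expected, actual)} for every entry that differs."""
    mismatches = {}
    for name, want in expected.items():
        got = actual.get(name)  # None if this value was not captured
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


EXPECTED_RESULTS = {"r0": 0, "r1": 0, "sregZ": 1, "sregV": 0}

# Hypothetical CPU state after running "add r0, r1" on a fresh emulator.
actual_state = {"r0": 0, "r1": 0, "r2": 0, "sregZ": 1, "sregV": 0, "sregC": 0}

mismatches = find_mismatches(EXPECTED_RESULTS, actual_state)
status = "FAILED" if mismatches else "PASSED"
```

In this sketch, only the names listed in EXPECTED_RESULTS are compared, so a test only needs to mention the registers and flags it cares about.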

After saving the file, you can execute the avr32test.py file:

python3 avr32test.py -p [path to qemu-avr32] -t ADD_f1 --build -d

The result should look like this:

Starting execution at 2023-12-03 19:15:54
[avr32generate_tests] Loading test ADD_f1_1
[avr32generate_tests] Writing sfile sfiles/ADD_f1_1
[avr32generate_tests] Assembling test ADD_f1_1: ./avr32-as -o elf-files/ADD_f1_1 sfiles/ADD_f1_1.s
[avr32elfdump] Exporting section '.text'
[ADD_f1_1] PASSED

Congratulations! You executed your first successful test!

Adding a second test

Now, we can test the opposite case: a non-zero result, which should leave the Z-flag at 0. Create a new file ADD_f1_02.py and add the following content:

TEST = """
    mov r1, 0x1
    add r0, r1     
"""

EXPECTED_RESULTS = {
    "r0": 1,
    "r1": 1,
    "sregZ": 0,
    "sregV": 0
}

Here, we first move the value 1 into register one and then add it to register zero. After the test, both registers should be 1 and the Z-flag should be 0. Execute the test and see if the results match our expectations.

More complex tests

Testing the V-flag and the C-flag is a bit more complicated. They depend on more complex conditions and require a bit of thinking. For this article, I will look at the V-flag.

The flag should be 1 if both operands are negative (bit 31 is 1, as numbers are represented in two’s complement) AND the result is positive (bit 31 is 0). The flag should also be 1 if both operands are positive AND the result is negative. We already saw that the flag is 0 if both operands are not negative, and the result is also not negative. Hence, one case is already complete. We also should test if the flag stays zero if both operands are negative and the result is also negative. However, I will skip this case for this article. Instead, let’s try to find a test that results in a set flag for the first condition.
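To keep track of the sign combinations described above, I find it useful to write them down with concrete operands. The following Python sketch does that. The function v_flag and the chosen operand values are my own illustration and are not part of the framework.

```python
MASK32 = 0xFFFFFFFF


def v_flag(op1, op2):
    """V-flag of a 32-bit addition, following the definition of the ADD instruction."""
    res = (op1 + op2) & MASK32
    a, b, r = op1 >> 31, op2 >> 31, res >> 31  # sign bits of operands and result
    return int((a and b and not r) or (not a and not b and r))


# (description, OP1, OP2, expected V-flag)
CASES = [
    ("both negative, result non-negative", 0x80000000, 0x80000001, 1),
    ("both non-negative, result negative", 0x7FFFFFFF, 0x00000001, 1),
    ("both non-negative, result non-negative", 0x00000000, 0x00000001, 0),
    ("both negative, result negative", 0xFFFFFFFF, 0xFFFFFFFF, 0),
    ("mixed signs", 0x80000000, 0x00000001, 0),
]

results = [(name, v_flag(op1, op2)) for name, op1, op2, _ in CASES]
```

Each entry in CASES corresponds to one test file that could be written for the V-flag. Operands with mixed signs can never overflow, so a single test is probably enough for that group.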

First, we need to move negative values into register zero and register one. But we need to make sure that the addition of these values produces a non-negative result.

TEST = """
    movh r0, 0x8000
    movh r1, 0x8000
    sub r1, -0x1
    csrf 0
    csrf 3
    add r0, r1     
"""

EXPECTED_RESULTS = {
    "r0": 0x1,
    "r1": 0x80000001,
    "sregC": 1,
    "sregZ": 0,
    "sregV": 1
}

Here, we use the movh instruction to set the upper two bytes of the registers. Moving 0x8000 into the upper half word results in a negative register value, as a 1 in bit 31 indicates a negative number. Then, we use the sub instruction to subtract a negative one from register one (so, we add one to the register). This is done because the AVR32 instruction set does not provide an instruction to add an immediate value to a register.

Because the sub-instruction also changes the status register, we use the Clear Status Register Flag instruction to clear the register flags that are of interest to us. This helps to prevent false test results.

In the end, we expect register zero to be 1, register one to be 0x80000001, and the C-flag and V-flag to be set. If you look at the status register changes further above, you will notice that two negative operands also set the C-flag, because the term OP1[31] ∧ OP2[31] is then true. Hence, we expect the C-flag to hold the value 1 as well.
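The expected values can be double-checked by replaying the test program in plain Python. This sketch assumes that movh writes its immediate to the upper half word and leaves the lower half word at zero. It derives the carry from the 33-bit sum and the overflow from the signed interpretation, instead of using the bit formulas from above. The helper to_signed is my own.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit value as a two's complement number."""
    return value - (1 << 32) if value & 0x80000000 else value


r0 = (0x8000 << 16) & MASK32   # movh r0, 0x8000
r1 = (0x8000 << 16) & MASK32   # movh r1, 0x8000
r1 = (r1 - (-0x1)) & MASK32    # sub r1, -0x1  (adds one)

op1, op2 = r0, r1
wide = op1 + op2               # add r0, r1, before truncation to 32 bits
r0 = wide & MASK32

carry = wide >> 32             # carry out of bit 31
signed_sum = to_signed(op1) + to_signed(op2)
overflow = int(not -(1 << 31) <= signed_sum <= (1 << 31) - 1)
zero = int(r0 == 0)
```

With these inputs, r0 ends up as 0x1, r1 as 0x80000001, carry and overflow are both 1, and zero is 0, which matches the expected results of the test.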

After executing the test, it should be reported as PASSED.

Intensive testing

This article only covers the general approach to writing test cases. If you want to ensure the correct implementation of your translation functions, you need to test every one of them. This is time-consuming, so I advise you to write the tests in parallel with the development of the translation functions. This way, you can catch errors early on.

So far, I have created more than 500 test cases for the AVR32 emulator. They cover the most important pitfalls, but not every possible case. In the future, I will also add them to the public repository.

The next steps

Until now, we have created an example board, implemented instruction translation functions, and finally tested our work. The next article in this series will cover the implementation of complex instructions in helper functions.