Introduction

For my Master’s thesis, Hacking The Stars: A Fuzzing Based Security Assessment Of CubeSat Firmware, I needed to implement an emulator for the AVR32 CPU architecture. At that time, no such emulator was available to the public, either open source or commercial. Because of that, I started to implement my own emulator based on QEMU. QEMU is a well-known emulation software that supports various architectures. It also supports hardware emulation, and some older versions can be used with the fuzzing tool AFL.

Unfortunately, the learning curve for new QEMU developers is steep. There was nearly no documentation on the project page. There are very few resources on where to start or how a new architecture can be implemented, and most of the available resources were outdated. If you encounter a problem because QEMU provides you with an obscure error message, there is a high chance that you cannot find any information about it online. The QEMU project also says that you need to read the code to understand how QEMU works. While this is generally true, such a huge codebase is not easy to understand, especially without any overview documents.

You can probably imagine that the implementation of a whole new architecture in QEMU was a challenging task. Sometimes I needed a week of trial and error to get a single aspect working. The whole process took about 6 months of (part-time) work, and sometimes it was very frustrating. During that time, I gained a lot of experience and learned how to implement a new architecture in QEMU. And how to not implement a new architecture.

With this blog series, I want to share my experiences and learnings. I hope that other new developers and researchers can benefit from this work and have an easier start.

So, let’s get started!

How does QEMU work?

This is an important question. But, at least for now, I will not answer it. Because other people already did, and their answers are still good.

The most helpful resource on QEMU that I found is a blog by airbus-seclab. While some aspects of the blog are not up-to-date anymore, the general description of QEMU seems to be. Therefore, I encourage you to read their posts on the execution loop and the TCG internals if you want to learn more about QEMU’s inner functioning. I will cover some of these aspects in a later post.

What you need to know for the rest of the post is that QEMU works on emulated machines. When you start QEMU, you need to specify which machine you want to run. For QEMU, a machine is the emulated equivalent of an embedded device or a desktop computer. The emulated machine can contain various devices, like flash memories, sensors, or bus systems. A machine also has a specific CPU architecture. Therefore, if we want to add a new architecture, we also need to implement a machine that works on it. This will be the topic of this post. The implementation of the actual CPU architecture emulation is another step. It will be the focus of Part 2 of this series.

What we will do

In this post, I will show you how you can implement an example AVR32 microcontroller. To do so, we will create two emulated hardware devices:

  1. An example board that contains a microcontroller
  2. The microcontroller that contains a flash memory and the CPU

The board will not represent any real hardware, and it also won’t be a very beautiful implementation. My goal is to show you the general process of adding a new device, and for this purpose, it will be good enough. In a later post, I will show you how we can emulate a real hardware device with peripheral hardware.

Preparing the build environment

First we need to install git and some build dependencies:

apt install git libglib2.0-dev libfdt-dev libpixman-1-dev zlib1g-dev ninja-build

Then we can get the code from QEMU’s GitHub repository and check if the build configuration works correctly:

git clone https://github.com/qemu/qemu.git
cd qemu
./configure

If you do not see an error message at the end, everything looks good. If you encounter any errors, there is likely a missing library that needs to be installed. Use apt-cache search to find the corresponding package for your distribution.

Code overview

You will see a long list of folders and files inside the QEMU folder. But do not worry. We will only need a few of them. In general, the repository is organized by ‘topic’. For example, there is the tcg folder that contains the code for the tiny code generator (which translates the instructions of the emulated CPU into code that can run on the host), and the hw folder that contains all the code for emulated hardware. Inside the hw folder are many other folders: one for the specific hardware of every CPU architecture, and one for every type of emulated device, like I2C or USB.

The same organization can be found in the target folder. This folder contains the code that is used to actually interpret binary data for our target CPU architecture. We will mostly work inside the hw and target folders, as we want to add a new architecture to QEMU and do not want to modify QEMU itself.

Adding an example hardware board

As mentioned above, when using QEMU, you need to specify a machine that is used to execute the target program.

Let’s add a new folder hw/avr32. Inside this folder, we will create an example hardware board that will be able to execute a generic AVR32 binary image later. We will call our board avr32example_board and, therefore, create the file avr32example_board.c.

Our new file starts with a few includes, a state description and a class description:

#include "qemu/osdep.h"
#include "qemu/units.h"
#include "qapi/error.h"
#include "avr32exp.h"
#include "boot.h"
#include "qom/object.h"
#include "hw/boards.h"

struct AVR32ExampleBoardMachineState {
    MachineState parent_obj;
    AVR32EXPMcuState mcu;
};
typedef struct AVR32ExampleBoardMachineState AVR32ExampleBoardMachineState;

struct AVR32ExampleBoardMachineClass {
    MachineClass parent_class;
};

As the name suggests, the MachineState represents the state of our board during the emulation. Every connected device needs to be part of the state. For example, we included AVR32EXPMcuState. This is the state of a microcontroller that we will implement in one of our next steps. The MachineClass contains attributes of the board. For example, if we add an emulated memory region that represents RAM to the board, we would include its size in the MachineClass.
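
The state struct starts with the parent state as its first member, and that placement matters. The following standalone sketch is not QEMU code: ToyMachineState, ToyMcuState, ToyBoardState, and toy_board_from_machine are hypothetical names that I made up for illustration. It only shows why a pointer to the generic state can be turned back into a pointer to the specific state when the parent sits at offset 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's generic MachineState. */
typedef struct ToyMachineState {
    const char *firmware;
} ToyMachineState;

/* Hypothetical stand-in for a device state that is embedded in the board. */
typedef struct ToyMcuState {
    unsigned flash_size;
} ToyMcuState;

/* Board state: the parent has to be the first member. */
typedef struct ToyBoardState {
    ToyMachineState parent_obj;
    ToyMcuState mcu;
} ToyBoardState;

/* Downcast from the generic to the specific state.
 * This is only valid because parent_obj sits at offset 0. */
ToyBoardState *toy_board_from_machine(ToyMachineState *machine)
{
    assert(machine != NULL);
    assert(offsetof(ToyBoardState, parent_obj) == 0);
    return (ToyBoardState *)machine;
}
```

QEMU's real cast macros do more than this plain cast, because they also check the type of the object at runtime. The memory layout idea is the same, though.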

Next, we add some definitions that are necessary to identify the board. They follow QEMU’s conventions and are needed to connect different devices to each other. It is important to be very consistent when using these types, as compiling QEMU fails when they are mixed up.

#define TYPE_AVR32EXAMPLE_BOARD_BASE_MACHINE MACHINE_TYPE_NAME("avr32example-board-base")
#define TYPE_AVR32EXAMPLE_BOARD_MACHINE MACHINE_TYPE_NAME("avr32example-board")

DECLARE_OBJ_CHECKERS(AVR32ExampleBoardMachineState, AVR32ExampleBoardMachineClass,
        AVR32EXAMPLE_BOARD_MACHINE, TYPE_AVR32EXAMPLE_BOARD_MACHINE)

When a new instance of our machine is created, QEMU calls an initialization function. Inside that function, we need to set up the MachineState and any memory regions, and we also need to connect other devices to our machine. You will notice that we initialize a child device for the microcontroller. The microcontroller will have a flash memory that is used to load the firmware. We also tell QEMU to do the firmware loading. Let’s add the following function:

//The generic MachineState is passed by QEMU
static void avr32example_board_init(MachineState *machine)
{
    //Make a specific MachineState out of the generic one
    AVR32ExampleBoardMachineState* m_state = AVR32EXAMPLE_BOARD_MACHINE(machine);

    //We initialize the microcontroller that is part of the board
    object_initialize_child(OBJECT(machine), "mcu", &m_state->mcu, TYPE_AVR32EXPS_MCU);
    //And we connect it via QEMU's SYSBUS.
    sysbus_realize(SYS_BUS_DEVICE(&m_state->mcu), &error_abort);

    //Here we load the firmware file with a load function that we will implement in boot.c
    if (machine->firmware) {
        if (!avr32_load_firmware(&m_state->mcu.cpu, machine,
                                 &m_state->mcu.flash, machine->firmware)) {
            exit(1);
        }
    }
}

Please notice that we use &m_state->mcu.flash as the third parameter of the loading function. This is the emulated flash memory of the microcontroller that we will implement later. QEMU will load the firmware image and write the firmware into the flash memory.

We need to tell QEMU that avr32example_board_init should be used to initialize our board. We also have to provide some information on the capabilities of our machine to QEMU. This is done by the class_init function:

//The generic ObjectClass is passed by QEMU
static void avr32example_board_class_init(ObjectClass *oc, void *data)
{
    //The generic machine class from object
    MachineClass *mc = MACHINE_CLASS(oc);
    mc->desc = "AVR32 Example Board";
    mc->alias = "avr32example-board";
    
    //Notice that we tell QEMU what function is used to initialize our board here.
    mc->init = avr32example_board_init;
    mc->default_cpus = 1;
    mc->min_cpus = mc->default_cpus;
    mc->max_cpus = mc->default_cpus;
    // Our board does not have any media drive
    mc->no_floppy = 1;
    mc->no_cdrom = 1;
    //We also will not have a parallel port
    mc->no_parallel = 1;
}
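
The relationship between the class and the init function can be sketched without QEMU. The snippet below is a simplified model with hypothetical names (ToyMachine, ToyMachineClass, toy_board_init, toy_board_class_init, toy_machine_run_init), not the real API. It shows the pattern that the class init function only stores a function pointer, and the generic code calls that pointer later when the machine is created.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyMachine ToyMachine;

/* Hypothetical, heavily reduced stand-in for QEMU's MachineClass. */
typedef struct ToyMachineClass {
    const char *desc;
    void (*init)(ToyMachine *machine); /* board specific init hook */
    int max_cpus;
} ToyMachineClass;

struct ToyMachine {
    const ToyMachineClass *klass;
    int initialized;
};

/* Board specific init function, comparable to avr32example_board_init. */
void toy_board_init(ToyMachine *machine)
{
    machine->initialized = 1;
}

/* The class init function only fills in the class, it does not run the hook. */
void toy_board_class_init(ToyMachineClass *mc)
{
    mc->desc = "Toy Board";
    mc->init = toy_board_init;
    mc->max_cpus = 1;
}

/* Generic code that runs later and does not know the concrete board. */
void toy_machine_run_init(ToyMachine *machine)
{
    assert(machine != NULL && machine->klass != NULL);
    assert(machine->klass->init != NULL);
    machine->klass->init(machine);
}
```

In this model, toy_board_class_init runs once per type, while toy_machine_run_init runs once per created machine. As far as I understand it, this mirrors the split between class_init and mc->init in QEMU.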

We are nearly done. The last missing step is the ‘registration’ of our new machine inside QEMU:

static const TypeInfo avr32example_board_machine_types[] = {
        {
                                //Notice that this is the TYPE that we defined above.
                .name           = TYPE_AVR32EXAMPLE_BOARD_MACHINE,
                                //Our machine is a direct child of QEMU's generic machine
                .parent         = TYPE_MACHINE,
                .instance_size  = sizeof(AVR32ExampleBoardMachineState),
                .class_size     = sizeof(AVR32ExampleBoardMachineClass),
                //We need to register the class init function
                .class_init     = avr32example_board_class_init,
        }
};
DEFINE_TYPES(avr32example_board_machine_types)
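
To make the idea of the type table more tangible, here is a toy registry in plain C. It is an illustration under simplifying assumptions and not how QEMU implements its object model: ToyTypeInfo, toy_types, toy_type_find, and toy_type_is_a are hypothetical names. The sketch keeps only the name and the parent of every type and walks the parent chain to answer the question "is this type a machine?".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for QEMU's TypeInfo. */
typedef struct ToyTypeInfo {
    const char *name;
    const char *parent; /* NULL for the root type */
} ToyTypeInfo;

/* A tiny type table. The last entry models our example board. */
static const ToyTypeInfo toy_types[] = {
    { "object", NULL },
    { "machine", "object" },
    { "avr32example-board-machine", "machine" },
};

/* Look up a type by name; returns NULL if it was never registered. */
const ToyTypeInfo *toy_type_find(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(toy_types) / sizeof(toy_types[0]); i++) {
        if (strcmp(toy_types[i].name, name) == 0) {
            return &toy_types[i];
        }
    }
    return NULL;
}

/* Returns true if `name` is `ancestor` or inherits from it. */
bool toy_type_is_a(const char *name, const char *ancestor)
{
    assert(ancestor != NULL);
    const ToyTypeInfo *t = toy_type_find(name);
    while (t != NULL) {
        if (strcmp(t->name, ancestor) == 0) {
            return true;
        }
        t = t->parent ? toy_type_find(t->parent) : NULL;
    }
    return false;
}
```

The sketch also hints at why consistent type names matter: a parent name that does not match any registered type simply ends the chain, and the lookup fails.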

And that’s it. We added a new machine to QEMU. But that’s only the first step. We need to tell QEMU how it should load firmware files into our emulated device.

The boot process

Depending on the format of the target firmware, we need to specify a loading function for our device. To do so, we create the file boot.h:

#ifndef HW_AVR32_BOOT_H
#define HW_AVR32_BOOT_H

#include "hw/boards.h"
#include "cpu.h"

bool avr32_load_firmware(AVR32ACPU *cpu, MachineState *ms,
                         MemoryRegion *mr, const char *firmware);

#endif // HW_AVR32_BOOT_H

The actual implementation will be in the file boot.c:

#include "qemu/osdep.h"
#include "qemu/datadir.h"
#include "hw/loader.h"
#include "boot.h"
#include "qemu/error-report.h"

bool avr32_load_firmware(AVR32ACPU *cpu, MachineState *ms,
                         MemoryRegion *program_mr, const char *firmware){
    g_autofree char *filename = NULL;
    int bytes_loaded;

    //We get the filename that is specified as 'bios' when QEMU is started
    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, firmware);
    if (filename == NULL) {
        error_report("Cannot find firmware image '%s'", firmware);
        return false;
    }
    //We use a built-in function to load the firmware from the file into the emulated memory.
    bytes_loaded = load_image_mr(filename, program_mr);

    if (bytes_loaded < 0) {
        error_report("Unable to load firmware image %s as raw binary",
                     firmware);
        return false;
    }
    return true;
}

Earlier, we called avr32_load_firmware inside the avr32example_board_init function and used &m_state->mcu.flash as the parameter for the memory region. Now you can see how we use the flash memory and load the firmware into it.

The loading function will only work with raw binary firmware images. In a later post, you will see how we can improve it to also load ELF files.
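
Conceptually, loading a raw binary means copying the file byte by byte into the memory region and failing if the image does not fit. The following standalone sketch shows that idea with a plain buffer in place of the flash region. toy_load_raw_image is a hypothetical helper and not a QEMU function, and I am only assuming that it behaves similarly to load_image_mr with respect to the return value (number of bytes on success, a negative value on failure).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of QEMU: copies a raw binary file into a
 * buffer that stands in for the flash region. Returns the number of bytes
 * loaded, or -1 if the file cannot be opened or does not fit.
 */
long toy_load_raw_image(const char *filename, unsigned char *flash,
                        long flash_size)
{
    assert(filename != NULL && flash != NULL && flash_size >= 0);

    FILE *f = fopen(filename, "rb");
    if (f == NULL) {
        return -1;
    }

    long total = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (total >= flash_size) {
            fclose(f);
            return -1; /* the image is larger than the flash */
        }
        flash[total++] = (unsigned char)c;
    }
    fclose(f);
    return total;
}
```

A caller would treat a negative return value as an error, just like the bytes_loaded < 0 check above.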

An emulated microcontroller

We only need one more piece of hardware. Let’s create our example microcontroller that is part of the example board in the file avr32exp.h:

#ifndef HW_AVR32_AVR32EXPC_H
#define HW_AVR32_AVR32EXPC_H
#include "target/avr32/cpu.h"
#include "qom/object.h"
#include "hw/sysbus.h"
#define TYPE_AVR32EXP_MCU "AVR32EXP" //This will be the abstract base type of the microcontroller
#define TYPE_AVR32EXPS_MCU "AVR32EXPS" //This will be the concrete system on chip (SoC) used by the board

typedef struct AVR32EXPMcuState AVR32EXPMcuState;
DECLARE_INSTANCE_CHECKER(AVR32EXPMcuState, AVR32EXP_MCU, TYPE_AVR32EXP_MCU)

//The Microcontroller needs a state
struct AVR32EXPMcuState {
    /*< private >*/
    SysBusDevice parent_obj;

    /*< public >*/
    //This is a reference to the CPU logic. We implement it in part 2.
    AVR32ACPU cpu;
    //The flash that will contain the firmware.
    MemoryRegion flash;
};
#endif // HW_AVR32_AVR32EXPC_H

The actual implementation will be in avr32exp.c:

#include "qemu/osdep.h"
#include "qemu/module.h"
#include "qemu/units.h"
#include "qapi/error.h"
#include "exec/memory.h"
#include "exec/address-spaces.h"
#include "sysemu/sysemu.h"
#include "hw/qdev-properties.h"
#include "hw/sysbus.h"
#include "qom/object.h"
#include "hw/misc/unimp.h"
#include "avr32exp.h"

struct AVR32EXPMcuClass {
    /*< private >*/
    SysBusDeviceClass parent_class;

    /*< public >*/
    const char *cpu_type;

    size_t flash_size;
};
typedef struct AVR32EXPMcuClass AVR32EXPMcuClass;
DECLARE_CLASS_CHECKERS(AVR32EXPMcuClass, AVR32EXP_MCU,
        TYPE_AVR32EXP_MCU)

This hardware file again starts with the necessary includes, class, and type definitions. We also need to define a function that initializes the microcontroller:

// This function sets up the device
static void avr32exp_realize(DeviceState *dev, Error **errp)
{
    //We get the state of the microcontroller from the generic device state
    AVR32EXPMcuState *s = AVR32EXP_MCU(dev);
    //And we look up the class that belongs to the device
    const AVR32EXPMcuClass *mc = AVR32EXP_MCU_GET_CLASS(dev);

    // The AVR32 CPU was defined in the AVR32EXPMcuState
    object_initialize_child(OBJECT(dev), "cpu", &s->cpu, mc->cpu_type);
    //Set the CPU object to realized
    object_property_set_bool(OBJECT(&s->cpu), "realized", true, &error_abort);

    //Init the flash memory region
    memory_region_init_rom(&s->flash, OBJECT(dev),
                           "flash", mc->flash_size, &error_fatal);
    //Here we set the start address of the memory region 
    memory_region_add_subregion(get_system_memory(),
                                0xd0000000, &s->flash);
}
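
Mapping the flash at a fixed base address means that every guest address inside the region corresponds to an offset into the flash contents. The sketch below uses the two values from this post, the base address 0xd0000000 and the flash size of 1024 KiB that the class init function assigns. The macro and function names (TOY_FLASH_BASE, TOY_FLASH_SIZE, toy_flash_contains, toy_flash_offset) are hypothetical and only serve as an illustration; QEMU's memory API does this lookup for us.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_FLASH_BASE 0xd0000000u      /* start address of the flash region */
#define TOY_FLASH_SIZE (1024u * 1024u)  /* 1024 KiB */

/* True if the guest address falls inside the flash region. */
bool toy_flash_contains(uint32_t addr)
{
    return addr >= TOY_FLASH_BASE && (addr - TOY_FLASH_BASE) < TOY_FLASH_SIZE;
}

/* Translate a guest address into an offset into the flash contents. */
uint32_t toy_flash_offset(uint32_t addr)
{
    assert(toy_flash_contains(addr));
    return addr - TOY_FLASH_BASE;
}
```

With these values, the first byte of the firmware image ends up at guest address 0xd0000000, and the last valid flash address is 0xd00fffff.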

Next, we write the class init functions:

static void avr32exp_class_init(ObjectClass *oc, void *data)
{
    DeviceClass *dc = DEVICE_CLASS(oc);
    //Set the actual setup function
    dc->realize = avr32exp_realize;
    dc->user_creatable = false;
}

static void avr32exps_class_init(ObjectClass *oc, void *data){

    AVR32EXPMcuClass* avr32exp = AVR32EXP_MCU_CLASS(oc);

    avr32exp->cpu_type = AVR32A_CPU_TYPE_NAME("AVR32EXPC");
    avr32exp->flash_size = 1024 * KiB;
}

static const TypeInfo avr32exp_mcu_types[] = {
        {
                .name           = TYPE_AVR32EXPS_MCU,
                .parent         = TYPE_AVR32EXP_MCU,
                .class_init     = avr32exps_class_init,
        }, {
                .name           = TYPE_AVR32EXP_MCU,
                .parent         = TYPE_SYS_BUS_DEVICE,
                .instance_size  = sizeof(AVR32EXPMcuState),
                .class_size     = sizeof(AVR32EXPMcuClass),
                .class_init     = avr32exp_class_init,
                .abstract       = true,
        }
};

DEFINE_TYPES(avr32exp_mcu_types)

Adding our code to the build system

To be able to compile our code later, we need to add it to the build system. To do so, we need to place the following code into the hw/avr32/meson.build and hw/avr32/Kconfig files:

# file: meson.build
avr32_ss = ss.source_set()
avr32_ss.add(files('boot.c'))
avr32_ss.add(files('avr32exp.c'))
avr32_ss.add(files('avr32example_board.c'))
hw_arch += {'avr32': avr32_ss}

#file Kconfig
config AVR32EXP_MCU
    bool

config AVR32EXAMPLE_BOARD
    bool
    select AVR32EXP_MCU

We also need to add subdir('avr32') to hw/meson.build. Place the line anywhere in the file. Then edit hw/Kconfig and add the line source avr32/Kconfig.

QEMU also needs general information that a new architecture is present. This is done in 4 locations:

  1. Add the file configs/devices/avr32-softmmu/default.mak and place this content in it:
    # Default configuration for avr32-softmmu
    # Boards:
    #
    CONFIG_AVR32EXAMPLE_BOARD=y  
    
  2. Add the file configs/targets/avr32-softmmu.mak and place this content in it:
    TARGET_ARCH=avr32
    
  3. Modify include/disas/dis-asm.h and add these lines:
    bfd_arch_avr32,
    #define bfd_mach_avr32expc   50
    
  4. Modify include/sysemu/arch_init.h and add:
    QEMU_ARCH_AVR32 = (1 << 24),
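
The QEMU_ARCH values are bit flags, so the new entry has to use a bit that no other architecture in your QEMU version already occupies. The following sketch illustrates how such flags are typically checked. Only the value (1 << 24) is taken from this post. The enum entries TOY_ARCH_A and TOY_ARCH_B and the functions toy_is_single_bit and toy_arch_allowed are hypothetical and do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, shortened flag list; only the AVR32 value is from the post. */
enum {
    TOY_ARCH_A     = (1 << 0),
    TOY_ARCH_B     = (1 << 1),
    TOY_ARCH_AVR32 = (1 << 24),
};

/* A flag is only usable if exactly one bit is set. */
bool toy_is_single_bit(uint32_t flag)
{
    return flag != 0 && (flag & (flag - 1)) == 0;
}

/* True if the target architecture is part of the allowed mask. */
bool toy_arch_allowed(uint32_t allowed_mask, uint32_t target_arch)
{
    assert(toy_is_single_bit(target_arch));
    return (allowed_mask & target_arch) != 0;
}
```

If two architectures shared the same bit, a mask check like this could no longer tell them apart. That is why it is worth checking which bits are already taken before picking a value.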
    

The next steps

The emulated machine is the first step in creating a new QEMU architecture. In the next article, I will show you how we actually emulate the execution of CPU instructions.