These days, there are many easy-to-use environments for microcontroller programming. You have the popular Arduino platform, but also the IDEs and tools provided by the different chip manufacturers, as well as several open source efforts.
While these are certainly great for getting your project going, they often gloss over the actual "bare metal" details of the chip you work with. So let's explore a bit and peek under the hood.
Hardware
For this post, I'm going to use a little STM32 dev board that is generally advertised as the "Blue Pill" and can be found on eBay, Aliexpress or your other favorite source of cheap gadgets from China.
The board is based around an STM32F103C8T6 chip. It contains a 32-bit ARM Cortex-M3 core and lots of peripherals. You can typically get the boards for less than $2 apiece. While the identifier "STM32F103C8T6" might look scary at first, it's actually quite simple: it's a part from STMicroelectronics, has a 32-bit CPU, and is part of the F1 family, specifically the F103 model. C8T6 encodes things like which chip package is used and how much flash memory is included (in this case 64kB).
Since the chip features a 32-bit ARM CPU, we will need a compiler that can target that architecture. I'll use GCC, so the compiler we want is gcc-arm-none-eabi (targeting ARM, no OS, embedded ABI). If you're using Ubuntu, Debian or Raspbian, you can install the compiler via apt-get install gcc-arm-none-eabi.
How do you program a microcontroller?
Unlike your typical $FAVORITE_OS program, you (generally) do not have any libraries or runtime environment around your code. Instead, the instructions your code is compiled into just run directly, one after another.
Another important aspect is that you do not have virtual memory. On a desktop OS, your program gets a large private address space to use as it likes, and it asks the kernel whenever it wants to do anything with hardware. Here, you access the physical memory (RAM) and the hardware directly.
The physical memory you can use for your program, however, does not start at address 0. It is actually a relatively small chunk among other things that are made to look like they are part of memory. So what else do we have there? Let's have a look at the datasheet. In section 4, we can see how memory is organized. The interesting bits are:
First Byte | Last Byte | Length | Contents |
---|---|---|---|
0x0000 0000 | Depends | 2kB or 64kB | Either flash memory or hardwired system memory, depending on boot pins |
0x0800 0000 | 0x0800 FFFF | 64kB | Flash memory (the C8T6 variant has 64kB; the datasheet shows 128kB) |
0x1FFF F000 | 0x1FFF F7FF | 2kB | System memory with the hardwired bootloader from ST |
0x2000 0000 | 0x2000 4FFF | 20kB | SRAM - our main RAM |
0x4000 0000 | 0x4002 33FF | 141kB | Peripherals (your ADCs, GPIOs, SPIs, UARTs, ...) |
So we can't just write our data / variables to arbitrary memory locations - they might not be backed by RAM. Or even worse, they might be backed by some memory mapped hardware and we could get interesting side effects.
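To make the table a bit more tangible, here is a small host-side sketch that classifies an address by the region it falls into. The names (region_t, classify_address) are made up for this example, and the boot alias is entered with its larger 64kB size, which is an assumption that only holds when booting from flash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    REGION_BOOT_ALIAS,     /* flash or system memory, depending on boot pins */
    REGION_FLASH,
    REGION_SYSTEM_MEMORY,
    REGION_SRAM,
    REGION_PERIPHERALS,
    REGION_UNLISTED        /* not in the table: don't touch */
} region_t;

typedef struct {
    uint32_t start;
    uint32_t length;
    region_t kind;
} region_entry_t;

/* Start addresses and lengths taken from the table above. */
static const region_entry_t regions[] = {
    { 0x00000000u,  64u * 1024u, REGION_BOOT_ALIAS },
    { 0x08000000u,  64u * 1024u, REGION_FLASH },
    { 0x1FFFF000u,   2u * 1024u, REGION_SYSTEM_MEMORY },
    { 0x20000000u,  20u * 1024u, REGION_SRAM },
    { 0x40000000u, 141u * 1024u, REGION_PERIPHERALS },
};

static_assert(sizeof(regions) / sizeof(regions[0]) == 5,
              "one entry per table row");

region_t classify_address(uint32_t addr)
{
    for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); ++i) {
        /* Subtract first so the range check cannot overflow. */
        if (addr >= regions[i].start &&
            addr - regions[i].start < regions[i].length) {
            return regions[i].kind;
        }
    }
    return REGION_UNLISTED;
}
```

With this, classify_address(0x20005000) comes back as REGION_UNLISTED: that is the first byte past our 20kB of SRAM.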
We now know that address 0 actually maps to either the embedded flash memory or the hardwired system memory. Which one we get is determined by the boot pins (BOOT0 and BOOT1). If we select flash, whatever we put in there appears starting at address 0.
The Linker
While the compiler takes care of translating our C code to machine instructions (for ARM in our case), the linker is responsible for resolving references between functions, globals or in more generic terms "symbols". It also lays out the final binary image.
We can control this process with linker scripts. Here's a minimal example (linker.ld):
    MEMORY
    {
        FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 64K
        RAM (rwx)  : ORIGIN = 0x20000000, LENGTH = 20K
    }

    SECTIONS
    {
        .text : {
            /* Start at offset 0 of the section for the code in flash */
            . = 0;
            /* At the very beginning, we need the interrupt vectors.
             * We need to mark this KEEP to make sure it doesn't get garbage collected.
             * The syntax *(.foo) means take the .foo sections from all files. */
            KEEP(*(.interrupt_vectors))
            /* The whole interrupt table is 304 bytes long. Advance to that position in case
             * the supplied table was shorter. */
            . = 304;
            /* And here comes the rest of the code, i.e. all sections whose names start with .text */
            *(.text*)
        } > FLASH = 0xFF /* Put this in the flash memory region, fill gaps with 0xFF */

        _end_of_ram = ORIGIN(RAM) + LENGTH(RAM); /* Define a symbol for the end of RAM */
    }
MEMORY defines the location and size of flash and RAM. We start laying out the .text section at position 0, first putting the .interrupt_vectors section, ensuring it is at least 304 bytes long, and then we collect all .text sections with the code. A special symbol _end_of_ram is defined to point at the byte just after the last usable RAM byte. We will need this to initialize the stack pointer.
The flash memory's default (erased) state is 0xFF and not 0x00, so we'll set this as the fill value.
Also note that there would be two options for the flash ORIGIN address: we could use either 0x0000 0000 or 0x0800 0000. But remember that flash only appears at 0x0000 0000 if we set the boot configuration accordingly. If we chose to boot differently, it would not be there.
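If you want to double check the numbers in the linker script, the arithmetic can be written down as compile-time checks. This is only a sketch: the macro names are invented for this example and simply mirror the values from linker.ld.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_ORIGIN        0x08000000u
#define RAM_ORIGIN          0x20000000u
#define RAM_LENGTH          (20u * 1024u)

#define VECTOR_ENTRY_BYTES  4u    /* each vector is one 32-bit word */
#define VECTOR_TABLE_BYTES  304u  /* the value used in the linker script */

/* Number of word-sized slots that fit into the reserved table area. */
#define VECTOR_TABLE_ENTRIES (VECTOR_TABLE_BYTES / VECTOR_ENTRY_BYTES)

/* What _end_of_ram evaluates to: one past the last usable RAM byte. */
#define END_OF_RAM (RAM_ORIGIN + RAM_LENGTH)

/* First flash address available to the code that follows the table. */
#define FIRST_CODE_ADDR (FLASH_ORIGIN + VECTOR_TABLE_BYTES)

static_assert(VECTOR_TABLE_ENTRIES == 76u, "304 bytes hold 76 vectors");
static_assert(END_OF_RAM == 0x20005000u, "20kB of RAM end at 0x20005000");
static_assert(FIRST_CODE_ADDR == 0x08000130u, "code starts after the table");
```

So the stack pointer will start out at 0x2000 5000 and the earliest possible code address is 0x0800 0130.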
Writing some code
With the linker script prepared, let's write some code. The hello world of microcontrollers is to blink an LED. The Blue Pill has one on board which is connected to pin PC13.
So far, we've only looked at the datasheet. In order to write code that accesses peripherals like the GPIO module to blink the LED, we'll need more information. We can find the necessary details in the document RM0008: The reference manual for the STM32F1xx parts.
Don't get scared by the more than 1100 pages of text. We will reference specific sections to figure out what we're after. A good place to start is section 3.1: System architecture. Here we can see that there is one CPU core and two DMA engines accessing flash, RAM and peripherals through a bus matrix. The peripherals sit behind the AHB on two peripheral buses: APB1 and APB2.
GPIOC, which we need for controlling our LED at PC13, is on APB2.
Clock gating
One thing that might not immediately be obvious is that the clock for these peripherals can be turned off to conserve power. And they do default to off. So we first need to turn the clock for port C on. Where we do that is described in section 7.3.7: APB2 peripheral clock enable register (RCC_APB2ENR).
So we want to set bit 4 of this register to 1 to enable port C, but where is this register in memory?
We're only being told that it is at offset 0x18 within the memory for RCC configuration. So let's go back to section 3.3 to consult the memory map. There, "Reset and clock control RCC" is listed at 0x4002 1000. Adding 0x18, we get 0x4002 1018.
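The base-plus-offset calculation and the "set one bit, leave the rest alone" access can be sketched like this. The macro names and the reg_set_bits helper are my own invention for illustration (IOPCEN is the name the reference manual gives to bit 4); taking the register as a pointer lets us try the helper on a plain variable on a PC.

```c
#include <assert.h>
#include <stdint.h>

#define RCC_BASE            0x40021000u  /* from the memory map */
#define RCC_APB2ENR_OFFSET  0x18u        /* from the register description */
#define RCC_APB2ENR_ADDR    (RCC_BASE + RCC_APB2ENR_OFFSET)
#define RCC_APB2ENR_IOPCEN  (1u << 4)    /* bit 4: port C clock enable */

static_assert(RCC_APB2ENR_ADDR == 0x40021018u,
              "base + offset should give the address derived above");

/* Read-modify-write: set the bits in mask, keep all other bits as they are.
 * volatile tells the compiler that every access really has to happen. */
void reg_set_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;
}
```

On the chip, the call would be reg_set_bits((volatile uint32_t *)RCC_APB2ENR_ADDR, RCC_APB2ENR_IOPCEN). On a PC, you can point it at an ordinary uint32_t instead to see the effect.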
GPIO config
So what else do we need to do? Section 9 covers GPIO. Looks like we have quite a few options. Let's just do a simple push-pull output. Tables 20 and 21 tell us that we need CNF=00 and MODE=11 for a push-pull output with a max. speed of 50 MHz.
Section 9.2.2 documents the register we need (GPIOx_CRH, which configures pins 8 to 15). The bits for pin 13 are 20-23. Note the reset value 0x4444 4444: we'll have to leave the other bits at those values to avoid changing the config of any other pins. The section specifies an address offset of 0x04, but what is the base for this? Our trusty memory map in section 3.3 points us at 0x4001 1000 for GPIO Port C, so we get 0x4001 1004.
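Here is a rough sketch of how the 4-bit field for one pin can be computed. crh_configure_pin is a hypothetical helper, not something from ST's headers, and it assumes the layout described above: four bits per pin, MODE in the lower two and CNF in the upper two, with pin 8 starting at bit 0.

```c
#include <assert.h>
#include <stdint.h>

#define GPIOC_BASE       0x40011000u  /* from the memory map */
#define GPIO_CRH_OFFSET  0x04u        /* config register for pins 8..15 */

static_assert(GPIOC_BASE + GPIO_CRH_OFFSET == 0x40011004u,
              "base + offset should give the address derived above");

/* Return crh with the CNF/MODE field of one pin replaced.
 * Only pins 8..15 live in this register. */
uint32_t crh_configure_pin(uint32_t crh, unsigned pin,
                           uint32_t cnf, uint32_t mode)
{
    assert(pin >= 8u && pin <= 15u);

    unsigned shift = (pin - 8u) * 4u;                   /* pin 13 -> bit 20 */
    uint32_t field = ((cnf & 0x3u) << 2) | (mode & 0x3u);

    /* Clear the old 4-bit field, then put the new one in its place. */
    return (crh & ~(0xFu << shift)) | (field << shift);
}
```

Starting from the reset value, crh_configure_pin(0x44444444, 13, 0, 3) gives 0x44344444: only the nibble for pin 13 has changed.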
Toggling the GPIO pin
Finally, we need to toggle the pin between high and low to blink the LED. We can use the trick described in section 9.1.2 to atomically set/reset only the pin we're interested in. The BSRR register for that purpose is at offset 0x10, as described in section 9.2.5, which puts it at 0x4001 1010 for port C. We need to set bit 13 to make PC13 high and bit 29 to make it low.
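The two bit positions follow a simple pattern: the lower 16 bits set a pin, the upper 16 bits reset it. Below is a small sketch with made-up helper names, plus a host-side model of what a write does to the output bits. The model reflects my reading of the register description (when both bits for a pin are written, set wins) and is only meant for experimenting on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Value to write to BSRR to drive a pin high. */
uint32_t bsrr_set_mask(unsigned pin)
{
    assert(pin < 16u);
    return 1u << pin;
}

/* Value to write to BSRR to drive a pin low. */
uint32_t bsrr_reset_mask(unsigned pin)
{
    assert(pin < 16u);
    return 1u << (pin + 16u);
}

/* Model of the hardware: returns the 16 output bits after writing
 * bsrr_value, given the output bits before the write. */
uint32_t odr_after_bsrr_write(uint32_t odr, uint32_t bsrr_value)
{
    uint32_t set   = bsrr_value & 0xFFFFu;
    uint32_t reset = (bsrr_value >> 16) & 0xFFFFu;

    /* Apply reset first so that a simultaneous set takes priority. */
    return ((odr & ~reset) | set) & 0xFFFFu;
}
```

Because a write only affects the pins whose bits are 1, there is no need to read the register first, which is what makes the update atomic.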
Putting it all together
With this information, we can put together our main.c:
    // This is the symbol defined in the linker script.
    extern unsigned long _end_of_ram;

    // We'll implement main a bit further down, but we need the declaration for
    // the initial program counter / start address.
    void main(void);

    // This is our interrupt vector table that goes in the dedicated section.
    __attribute__ ((section(".interrupt_vectors"), used))
    void (* const interrupt_vectors[])(void) = {
        // The first 32-bit value is actually the initial stack pointer.
        // We have to cast it to a function pointer since the rest of the array
        // would all be function pointers to interrupt handlers.
        (void (*)(void))((unsigned long)&_end_of_ram),
        // The second 32-bit value is the initial program counter, aka the
        // function you want to run first.
        main
    };

    void wait(void) {
        // Do some NOPs for a while to pass some time.
        for (unsigned int i = 0; i < 2000000; ++i) __asm__ volatile ("nop");
    }

    void main(void) {
        // Enable port C clock gate (RCC_APB2ENR, bit 4).
        *((volatile unsigned int *)0x40021018) |= (1U << 4);

        // Configure GPIO C pin 13 as output (GPIOC_CRH).
        *((volatile unsigned int *)0x40011004) = ((0x44444444 // The reset value
            & ~(0xfU << 20))  // Clear out the bits for pin 13
            | (0x3U << 20));  // Set both MODE bits, leave CNF at 0

        while (1) {
            // Set the output bit (GPIOC_BSRR, bit 13).
            *((volatile unsigned int *)0x40011010) = (1U << 13);
            wait();
            // Reset it again (GPIOC_BSRR, bit 29).
            *((volatile unsigned int *)0x40011010) = (1U << 29);
            wait();
        }
    }
This program essentially only consists of two parts:
- The interrupt vector table that specifies the initial stack pointer and our entrypoint
- A super tiny main function that sets up the GPIO pin and toggles it in a loop
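To see what the first part means for the bytes that end up in flash, here is a host-side sketch that pulls the first two words out of an image. The type and function names are hypothetical. It relies on two things I'm fairly sure of: the Cortex-M3 stores these words in little-endian order, and the lowest bit of the start address is set to mark Thumb code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t initial_sp;    /* word 0: initial stack pointer */
    uint32_t reset_vector;  /* word 1: where execution starts */
} boot_words_t;

/* Assemble a 32-bit value from four little-endian bytes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read the first two entries of the vector table from a flash image. */
boot_words_t read_boot_words(const uint8_t *image, size_t length)
{
    assert(image != NULL && length >= 8u);
    boot_words_t words = { read_le32(image), read_le32(image + 4) };
    return words;
}

/* Strip the Thumb marker bit to get the address of the first instruction. */
uint32_t entry_address(uint32_t reset_vector)
{
    return reset_vector & ~1u;
}
```

For example, an image starting with the (made-up) bytes 00 50 00 20 31 01 00 08 would give a stack pointer of 0x2000 5000 and an entry address of 0x0800 0130. Where main really ends up in your build depends on how the compiler orders the functions.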
If you're wondering how I came up with 2000000 loop iterations: by default, the CPU runs off an 8 MHz internal oscillator. Each loop iteration takes a few instructions, so if we count to 2 million, it's going to take roughly a second.
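The back-of-the-envelope estimate looks like this in code. busy_loop_ms is a made-up helper, and the number of cycles per iteration is a guess on my part (it depends on the compiler output), so treat the result as a ballpark figure.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate how long a busy loop takes, in milliseconds.
 * cycles_per_iteration is an assumption, not a measured value. */
uint32_t busy_loop_ms(uint32_t iterations, uint32_t cycles_per_iteration,
                      uint32_t cpu_hz)
{
    assert(cpu_hz != 0u);

    /* 64-bit intermediate so the multiplications cannot overflow. */
    uint64_t cycles = (uint64_t)iterations * cycles_per_iteration;
    return (uint32_t)(cycles * 1000u / cpu_hz);
}
```

Assuming 4 cycles per iteration, busy_loop_ms(2000000, 4, 8000000) comes out at 1000 ms. With 3 cycles it would be 750 ms, which is still close enough for blinking an LED.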
We can compile:
    arm-none-eabi-gcc -ggdb -Wall -Os -mthumb -mcpu=cortex-m3 -nostdlib -o main.o -c main.c
link:
    arm-none-eabi-gcc -ggdb -Os -Wl,--gc-sections -mthumb -mcpu=cortex-m3 -nostdlib -Tlinker.ld -o main.elf main.o
and flash (using the JTAG adapter described in this post):
openocd -f sipeed.cfg -c "program main.elf verify reset exit"
and you should see the onboard LED start blinking.
Check out part 2 to do more with OpenOCD.