In part 1, we came up with a minimal program and a linker script to compile a binary we could load onto the Blue Pill. We used OpenOCD to program the binary into the chip's embedded flash memory. OpenOCD can, however, do much more than what we have done so far.
Programming Flash
To change things up a little bit, we're going to use an ST-LINK V2 adapter this time. Compared to the Sipeed JTAG adapter used in part 1, it has fewer features: there is no embedded serial port and it can't do JTAG.
On the other hand, the ST-LINK provides power for our Blue Pill and uses SWD to talk to the chip - the necessary SWDIO and SWCLK pins are conveniently accessible on the end of the Blue Pill. For a smooth experience, you also want to connect the RST pin of the ST-LINK with the reset pin on the Blue Pill (next to the USB connector, marked "R" or "B2").
Since we're using a different adapter (or "interface" in OpenOCD terminology), we need a new config file. OpenOCD ships with one for the ST-LINK, so our config can be as simple as this (stlink_bluepill.cfg):
source [find interface/stlink-v2.cfg]
transport select hla_swd
source [find target/stm32f1x.cfg]
We're just using the shipped interface config, the SWD transport and the STM32F1x target (remember, the Blue Pill has an STM32F103 chip). We can again program our ELF binary from part 1:
openocd -f stlink_bluepill.cfg -c "program main.elf verify reset exit"
Debugging
OCD stands for "On-Chip Debugger", so programming could be seen more as a side business than the core purpose of OpenOCD. ARM cores contain a debug port (DP) that can be accessed via JTAG or SWD. Through it, we can read and manipulate the CPU's state and memory.
Let's start OpenOCD again, this time without programming anything:
$ openocd -f stlink_bluepill.cfg
Open On-Chip Debugger 0.10.0
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
adapter speed: 1000 kHz
adapter_nsrst_delay: 100
none separate
Info : Unable to match requested speed 1000 kHz, using 950 kHz
Info : Unable to match requested speed 1000 kHz, using 950 kHz
Info : clock speed 950 kHz
Info : STLINK v2 JTAG v29 API v2 SWIM v7 VID 0x0483 PID 0x3748
Info : using stlink api v2
Info : Target voltage: 3.145419
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints
In this mode, OpenOCD will be listening on ports 3333 and 4444. Let's try telnet on port 4444:
$ telnet localhost 4444
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
>
We get a prompt, where we can for example halt the CPU. If we do this, our blinking LED should stop:
> halt
target halted due to debug-request, current mode: Thread
xPSR: 0x21000000 pc: 0x08000132 msp: 0x20004ff0
You can resume the CPU (and the blinking) with resume. To disconnect, simply use exit.
GDB
On port 3333, OpenOCD runs a GDB server. We can run GDB and connect to it with target remote :3333 like this:
$ arm-none-eabi-gdb main.elf
GNU gdb (7.10-1+9) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=arm-linux-gnueabihf --target=arm-none-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main.elf...done.
(gdb) target remote :3333
Remote debugging using :3333
wait () at main.c:22
22 for (unsigned int i = 0; i < 2000000; ++i) __asm__ volatile ("nop");
(gdb)
We compiled the binary with -ggdb, so GDB was able to load debug symbols and after connecting to the target can directly tell us what code the CPU was executing. Since we're waiting most of the time, it's highly likely that we see the for loop in the wait() function doing NOPs.
Let's reset the CPU and single-step through our code (you can just press enter to repeat the previous command):
(gdb) monitor reset halt
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08000140 msp: 0x20005000
(gdb) s
25 void main() {
(gdb)
36 *((volatile unsigned int *)0x40011010) = (1U << 13);
(gdb)
39 *((volatile unsigned int *)0x40011010) = (1U << 29);
(gdb)
27 *((volatile unsigned int *)0x40021018) |= (1 << 4);
(gdb)
30 *((volatile unsigned int *)0x40011004) = ((0x44444444 // The reset value
(gdb)
36 *((volatile unsigned int *)0x40011010) = (1U << 13);
(gdb)
The LED should now be turned on - but what follows is a boring loop of 2 million NOPs. If we want to go straight to where we turn the LED off, we can use a breakpoint. Resetting the LED is on line 39, so:
(gdb) break main.c:39
Breakpoint 1 at 0x8000146: file main.c, line 39.
(gdb) c
Continuing.
Unfortunately, at least with my GCC and GDB versions, this did not work: execution never stopped at the breakpoint.
Interlude: Why is the breakpoint not working?
We can disassemble the code of our main function:
(gdb) disass
Dump of assembler code for function main:
0x08000140 <+0>: movs r1, #16
0x08000142 <+2>: push {r4, r5, r6, lr}
0x08000144 <+4>: movs r6, #128 ; 0x80
0x08000146 <+6>: movs r5, #128 ; 0x80
0x08000148 <+8>: ldr r2, [pc, #32] ; (0x800016c <main+44>)
0x0800014a <+10>: ldr r3, [r2, #0]
0x0800014c <+12>: orrs r3, r1
0x0800014e <+14>: str r3, [r2, #0]
0x08000150 <+16>: ldr r2, [pc, #28] ; (0x8000170 <main+48>)
0x08000152 <+18>: ldr r3, [pc, #32] ; (0x8000174 <main+52>)
0x08000154 <+20>: str r2, [r3, #0]
0x08000156 <+22>: lsls r6, r6, #6
0x08000158 <+24>: lsls r5, r5, #22
0x0800015a <+26>: ldr r4, [pc, #28] ; (0x8000178 <main+56>)
0x0800015c <+28>: str r6, [r4, #0]
0x0800015e <+30>: bl 0x8000130 <wait>
=> 0x08000162 <+34>: str r5, [r4, #0]
0x08000164 <+36>: bl 0x8000130 <wait>
0x08000168 <+40>: b.n 0x800015a <main+26>
0x0800016a <+42>: nop ; (mov r8, r8)
0x0800016c <+44>: asrs r0, r3, #32
0x0800016e <+46>: ands r2, r0
0x08000170 <+48>: add r4, r8
0x08000172 <+50>: add r4, r6
0x08000174 <+52>: asrs r4, r0, #32
0x08000176 <+54>: ands r1, r0
0x08000178 <+56>: asrs r0, r2, #32
0x0800017a <+58>: ands r1, r0
End of assembler dump.
Looking at the code, we wanted to break between the two calls to wait - this would be at address 0x08000162. However, the breakpoint was set at 0x8000146.
If we stare at the code for a while, we can see that GCC optimized it for speed quite a bit. Instead of computing the bitshift in each loop iteration, it's only doing str r5, [r4, #0], so just storing the value from register r5. And if we look at the instruction where GDB set the breakpoint, movs r5, #128, this is actually the first step in computing the shifted (1U << 29) value. The second part is at 0x08000158, where the already set 0x80 is shifted left by 22.
So it looks like GDB set the breakpoint on the first instruction belonging to the line of code we wanted to break on - unfortunately this is outside the loop and we won't hit it.
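To make the effect concrete, here is a rough C sketch of what the optimized machine code does, written so that it can run on a PC. It is only an illustration reconstructed from the disassembly above, not GCC's actual output. The names are assumptions made up for this example: fake_gpio is a hypothetical stand-in for the register at 0x40011010, struct loop_hits just counts how often each step runs, the wait() calls are left out, and the loop takes an iteration count so that it terminates.

```c
#include <stdint.h>

/* Hypothetical stand-in for the GPIO register at 0x40011010 (host-only). */
static volatile uint32_t fake_gpio;

/* Made-up helper type: how often each interesting instruction executed. */
struct loop_hits {
    int setup;  /* "movs r5, #128" at main+6, where GDB placed the breakpoint */
    int store;  /* "str r5, [r4, #0]" at main+34, where we wanted to stop */
};

/* Mirrors the shape of the optimized main(): constants first, loop after. */
struct loop_hits blink_as_compiled(int iterations)
{
    struct loop_hits hits = {0, 0};

    uint32_t r6 = 0x80;   /* main+4 */
    uint32_t r5 = 0x80;   /* main+6: first instruction attributed to line 39 */
    hits.setup++;
    r6 <<= 6;             /* main+22: r6 is now 1U << 13 */
    r5 <<= 22;            /* main+24: r5 is now 1U << 29 */

    for (int i = 0; i < iterations; ++i) {  /* main+26 to main+40 */
        fake_gpio = r6;   /* line 36; wait() would follow here */
        fake_gpio = r5;   /* line 39, the store at main+34 */
        hits.store++;
    }
    return hits;
}
```

Calling blink_as_compiled(5) reports one setup hit and five store hits. On the real chip the loop never ends, so a breakpoint on the setup instruction can only trigger once per reset, and we had already stepped past it.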
We can set a breakpoint on the correct address manually:
(gdb) break *0x08000162
Breakpoint 7 at 0x8000162: file main.c, line 39.
(gdb) c
Continuing.
Breakpoint 2, main () at main.c:39
39 *((volatile unsigned int *)0x40011010) = (1U << 29);
Looking at CPU state
After hitting the (fixed) breakpoint from above, we can take a look at the CPU registers:
(gdb) info registers
r0 0x4a5dead2 1247668946
r1 0x10 16
r2 0x44344444 1144276036
r3 0x0 0
r4 0x40011010 1073811472
r5 0x20000000 536870912
r6 0x2000 8192
r7 0xf23d2b64 -230872220
r8 0xff3ffffe -12582914
r9 0xefbf9ddd -272654883
r10 0x2ad17842 718370882
r11 0x63dac8c9 1675282633
r12 0xf9ffeffb -100667397
sp 0x20004ff0 0x20004ff0
lr 0x8000163 134218083
pc 0x8000162 0x8000162 <main+34>
xPSR 0x61000000 1627389952
Most interesting are r5 and r6, which, as we figured out above, hold the shifted values for turning the LED on and off. pc is the program counter - which unsurprisingly points to the instruction we set the breakpoint at.
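As a quick sanity check, the values in r5 and r6 can be reproduced on a PC. The sketch below mirrors the movs/lsls pairs from the disassembly; the function name movs_then_lsls is made up for this example. The compiler used two instructions per constant here, which fits the fact that the 16-bit movs encoding only holds an 8-bit immediate.

```c
#include <stdint.h>

/* Mirrors "movs rX, #imm" followed by "lsls rX, rX, #shift". */
uint32_t movs_then_lsls(uint32_t imm, unsigned int shift)
{
    uint32_t reg = imm;  /* movs: load the small immediate */
    reg <<= shift;       /* lsls: shift it into place */
    return reg;
}
```

With the operands from the disassembly, movs_then_lsls(0x80, 6) gives 0x2000, which is 1U << 13 and matches r6, and movs_then_lsls(0x80, 22) gives 0x20000000, which is 1U << 29 and matches r5.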
sp is the stack pointer. If we compare it to where we started, we didn't get too far:
(gdb) print &_end_of_ram
$1 = (unsigned long *) 0x20005000
But since our program is not that big, that's to be expected. We could also print memory, but again, due to the simplicity of our minimal program, everything ends up in registers anyway.
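The stack usage can be worked out from the two addresses above. The following sketch does the subtraction on a PC; the macro and function names are made up for this example, and the values are copied from the GDB output.

```c
#include <stdint.h>

#define END_OF_RAM  0x20005000u  /* &_end_of_ram as printed by GDB */
#define SP_AT_BREAK 0x20004ff0u  /* sp from "info registers" */

/* The stack grows downwards, so usage is top minus current sp. */
uint32_t stack_bytes_used(uint32_t top, uint32_t sp)
{
    return top - sp;
}

/* Each pushed register occupies one 32-bit word. */
uint32_t stack_words_used(uint32_t top, uint32_t sp)
{
    return stack_bytes_used(top, sp) / 4u;
}
```

This gives 16 bytes, or 4 words, which lines up with the push {r4, r5, r6, lr} at the start of main in the disassembly.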