Categories
Embedded Systems Engineering

How is the Sparkfun Artemis Configured to Boot?

tl;dr INFO0 is configured by Sparkfun. The secure bootloader clears SRAM, then checks GPIO47: if it is high, ASB runs and accepts data over UART; otherwise the bootloader jumps to 0xC000.

I’ve been working with the Sparkfun Artemis module and the Redboard Artemis ATP board for research, due to the Ambiq Apollo3 microcontroller in use. There is relatively little documentation on the requirements of user programs to successfully take over after the Apollo3 secure bootloader finishes running. There’s also not much out there on what this secure bootloader actually does (and I wish we could shut it off and just jump immediately to user code…). The goal of this post is to summarize a few of my findings.

The secure bootloader can be configured to some extent through parameters stored in a special region of flash that’s treated as OTP during normal operation (completely read-only). This region is called INFO0. I’m assuming it’s Sparkfun that’s providing something reasonable here, or maybe it’s the default from the factory. In any case, the Sparkfun Apollo3s are configured to jump to address 0xC000 [1] or, with a logic high on GPIO47 [2], to use the ASB bootloader for “OTA” updates. The asb.py script (also found in the Ambiq SDK) can be used to upload programs through this interface over UART0 [3] (also configured in INFO0).
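The INFO0 values quoted in the footnotes can be unpacked with a few shifts and masks. The sketch below is only that: the bit layouts are inferred from the example values in the footnotes (0x000000af, 0xffff3031, 0x1c200c0), not copied from the datasheet, and every function and type name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* NOTE: all field layouts here are assumptions inferred from the
 * footnote values, not taken from the Apollo3 datasheet. */

typedef struct {
    uint8_t pin;       /* GPIO number that forces ASB programming */
    bool active_high;  /* true if a logic high selects ASB */
} override_config;

/* Assumed layout: bits 6:0 = pin number, bit 7 = polarity. */
override_config decode_override(uint32_t reg)
{
    override_config cfg;
    cfg.pin = (uint8_t)(reg & 0x7Fu);
    cfg.active_high = (reg & 0x80u) != 0u;
    return cfg;
}

typedef struct {
    uint8_t tx_pin;
    uint8_t rx_pin;
} uart_pins;

/* Assumed layout: byte 1 = TX pin, byte 0 = RX pin. */
uart_pins decode_uart_pins(uint32_t reg)
{
    uart_pins pins;
    pins.tx_pin = (uint8_t)((reg >> 8) & 0xFFu);
    pins.rx_pin = (uint8_t)(reg & 0xFFu);
    return pins;
}

/* Assumed layout: baud rate stored in bits 31:8. */
uint32_t decode_uart_baud(uint32_t reg)
{
    return reg >> 8;
}
```

Fed the footnote values, these helpers give pin 47 with high polarity, TX on pin 48, RX on pin 49, and 115200 baud, which matches what the footnotes describe. The low byte of the UART register (module number, data bits, parity, stop bits) is left undecoded because its exact layout is not clear from the values alone.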

The bootloaders (both ASB and Sparkfun’s own SVL) check for the validity of the application’s stack pointer (the first 32-bit value of the program). On this MCU, SRAM is 384 KB and spans from 0x10000000 to 0x1005FFFF. The stack is full descending, so in theory using the address 0x10060000 as the initial value of the stack should be valid, but it turns out both ASB and SVL reject this value. Valid stack values must be less than or equal to 0x1005FFFF.
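The behavior described above can be modeled as a simple range check on the first word of the image. This is a sketch of the observed behavior, not the actual ASB or SVL source: the function names are hypothetical, the lower-bound test is an assumption, and the SRAM bounds are passed in as parameters rather than hard-coded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Read the initial stack pointer: the first little-endian 32-bit word
 * of a Cortex-M image (the start of the vector table). */
uint32_t image_initial_sp(const uint8_t *image)
{
    return (uint32_t)image[0]
         | ((uint32_t)image[1] << 8)
         | ((uint32_t)image[2] << 16)
         | ((uint32_t)image[3] << 24);
}

/* Hypothetical model of the bootloader check: the stack pointer must lie
 * between the first and last byte of SRAM, inclusive. The upper bound
 * matches the observed behavior; the lower bound is an assumption. */
bool sp_accepted(uint32_t sp, uint32_t sram_first, uint32_t sram_last)
{
    return sp >= sram_first && sp <= sram_last;
}
```

Under this model, an image whose first word is one past the last SRAM byte is rejected even though a full-descending stack would never touch that address. A linker script therefore has to give up the top few bytes, for example by placing the initial stack pointer 8 bytes below the end of SRAM, which also keeps the 8-byte stack alignment the Arm procedure call standard expects.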

After some testing trying to have some SRAM survive reboots, it seems like the secure bootloader completely erases all of SRAM. It seems that the code to do this is copied from bootrom to SRAM, oddly enough (I was able to place a watchpoint on SRAM for when it was cleared, and it paused on some code in SRAM while the bootloader was executing). Apparently register 0x50020050 holds a value indicating “the amount of SRAM to keep reserved for application scratch space” [4]. So it may be possible to prevent some SRAM from being cleared with a modified INFO0. Unfortunately for me, this only protects memory at the high end of SRAM (I would have liked to retain some memory in the first 4KB of SRAM).

I think this is everything I’ve found thus far. I’ve reprogrammed the SVL bootloader, optimizing it and reducing it in size. It also provides compile-time options for either skipping the SVL bootloader entirely or skipping it depending on the value of a GPIO (currently only lightly configurable). I also wrote svl-tools (in Rust) to replace the svl.py program from Sparkfun and to leverage the extra functionality I’ve added to the SVL: reading from memory on request.

If I find anything else noteworthy about the boot process, I’ll edit this post and make note of it.

  1. Register 0x50020C00 is set to 0xC000, which is the address to jump to after the bootloader is done.
  2. Register 0x50020020 configures the pin used to force ASB boot programming. It is set to 0x000000af: pin 47 with polarity set to high.
  3. Register 0x5002002C configures the pins to use for UART. It is set to 0xffff3031, so pins 48 and 49 for TX and RX respectively. Register 0x50020028 configures the UART. It is set to 0x1c200c0, meaning UART module 0, 8 data bits, no parity, 1 stop bit, and a baud rate of 115200.
  4. From the Apollo3 datasheet: https://ambiq.com/wp-content/uploads/2020/10/Apollo3-Blue-SoC-Datasheet.pdf
Categories
Embedded Systems Engineering, Gentoo Linux

Don’t build standard libraries with LTO unless you really want to suffer

I wanted to make this post more as a reminder to myself.

A few years ago I ran into an interesting issue while working with a cross-compiler for ARM on my Gentoo Linux systems. Effectively, I was getting a ton of undefined-symbol errors for libc functions at link time. After a thread on the gcc-help mailing list, this was the summary of the outcome:

Gabriel Marcano:
Can newlib not be built with -flto?

...

Alexander Monakov:
It matters because the compiler recognizes several libc functions by name (including 'malloc'); recognizing such so-called "builtins" is necessary to optimize their uses. Unfortunately, builtins are different from common functions in LTO bytecode, and sometimes behave in unexpected ways. You can find examples searching GCC Bugzilla for stuff like "LTO built-in".

In your case the reference to built-in malloc is not visible to the linker when it invokes the plugin while scanning input files, so it does not unpack malloc.o from libc.a. This is a simple case of a more general problem where a compiler creates new references to built-in functions after optimizations (e.g. transforming 'printf("Hello\n")' to 'puts("Hello")'). Somehow it happens to work when libc is not slim-LTO (I guess it is rescanned), but rescanning for LTO code-generation is not implemented, as far as I know.

So no, apparently the tricks that GCC and ld play to optimize strlen, malloc, etc. make it next to impossible to properly enable LTO for the standard library.

For Gentoo this means remembering to disable LTO for newlib builds for cross-compilers. This is done by making a /etc/portage/env/no-lto.conf file:

CFLAGS="${CFLAGS} -fno-lto"
CXXFLAGS="${CXXFLAGS} -fno-lto"
FFLAGS="${FFLAGS} -fno-lto"

And then somewhere under /etc/portage/package.env/ making a file with:

cross-*/newlib no-lto.conf
Categories
Embedded Systems Engineering

I _WANT_IO_LONG_LONG, or sometimes Newlib doesn’t want long long integer types

I ran into an issue when trying to finish some implementation work for a paper that I have due in less than a week. Effectively, I was just trying to printf some long long types, and they were getting butchered. I reduced the problem on my Cortex-M4 device to the following one-line reproducer:

printf("%llu %ld\n", 1llu, 2l);

What I expected to see:

1 2

What I saw:

0 1

Ok… that’s weird. I did a quick google search, and the summary is that if you’re using newlib nano, well, newlib doesn’t support long long types in printf and its ilk. Well, I’m not using newlib nano for this program. Thinking about it a bit harder, this really looks like printf is only reading 4 bytes for the %llu type. I finally decided to go read the source code in newlib for some clues.

Reading through the printf implementation eventually led me here:

#ifndef _NO_LONGLONG
			if (flags & QUADINT)
				*GET_ARG (N, ap, quad_ptr_t) = ret;

OK, so there’s a check for this _NO_LONGLONG define. Where can it get set? Oh, apparently in the same file:

#define _NO_LONGLONG
#if defined _WANT_IO_LONG_LONG \
	&& (defined __GNUC__ || __STDC_VERSION__ >= 199901L)
# undef _NO_LONGLONG

OK, so it’s looking to see if _WANT_IO_LONG_LONG is defined. Apparently this is a configuration option, specifically --enable-newlib-io-long-long. Rebuilding newlib with that option finally lets printf work as intended.
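One way to see why the broken build prints exactly "0 1" is to model how the arguments are laid out and how printf walks them. The sketch below assumes the usual 32-bit little-endian Arm calling convention, where a 64-bit argument is passed in an even/odd register pair (r2:r3 here, leaving r1 as padding), and it assumes the padding register happens to hold 0. All names are hypothetical and this is a model of the argument list, not newlib's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Slot 0 stands in for r0 (the format string pointer), so the first
 * variadic word is slot 1. */
#define FIRST_VARIADIC_SLOT 1u

/* Assumed layout for printf("%llu %ld\n", 1llu, 2l):
 *   slot 0     r0: format pointer (value irrelevant here)
 *   slot 1     r1: padding, assumed to hold 0
 *   slot 2, 3  r2:r3: low and high words of 1llu
 *   slot 4     first stack word: 2l */
static const uint32_t example_slots[5] = { 0u, 0u, 1u, 0u, 2u };

typedef struct {
    const uint32_t *slots;
    size_t next;
} arg_cursor;

/* Consume one 32-bit argument. */
static uint32_t take_u32(arg_cursor *c)
{
    return c->slots[c->next++];
}

/* Consume one 64-bit argument: round up to an even slot first, which
 * models the 8-byte alignment of 64-bit arguments. */
static uint64_t take_u64(arg_cursor *c)
{
    c->next = (c->next + 1u) & ~(size_t)1u;
    uint64_t lo = c->slots[c->next++];
    uint64_t hi = c->slots[c->next++];
    return lo | (hi << 32);
}

/* printf built with long long support: %llu reads 8 bytes. */
void read_with_long_long(uint64_t *first, uint32_t *second)
{
    arg_cursor c = { example_slots, FIRST_VARIADIC_SLOT };
    *first = take_u64(&c);
    *second = take_u32(&c);
}

/* printf built without it: %llu is read as a 4-byte value. */
void read_without_long_long(uint64_t *first, uint32_t *second)
{
    arg_cursor c = { example_slots, FIRST_VARIADIC_SLOT };
    *first = take_u32(&c);
    *second = take_u32(&c);
}
```

In this model the 8-byte read skips the padding slot and yields 1 and 2, while the 4-byte read picks up the padding slot (0) for %llu and then the low word of the long long (1) for %ld, matching the output seen on the device.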

Reading through the repository now that I had a configuration flag handy, I was able to find it in the README, where it states that this flag is disabled by default, but some hosts enable it in their configure.host files. Just something to keep in mind, I guess, when using newlib.

For Gentoo users like me: to build newlib with this option, add EXTRA_ECONF="--enable-newlib-io-long-long" to a file under /etc/portage/env/ and have your cross-*/newlib of choice use it via an entry under /etc/portage/package.env/.

Categories
Embedded Systems Engineering

Newlib, FreeRTOS, and the curse of __SINGLE_THREAD__

tl;dr If newlib is built using --disable-newlib-multithread, its object files will have been built using __SINGLE_THREAD__, so any application linking to newlib must also define __SINGLE_THREAD__ or some newlib headers will have structs with different fields between newlib and the application. Or just rebuild newlib with --enable-newlib-multithread.

I’ve been working on a personal project of mine. At this point, revision A of the hardware works, and the firmware also works. But getting to this point wasn’t exactly easy. As expected of embedded software development, there are a lot of potholes in the road.

I thought I had finished my basic firmware implementation last night, and then I tried rebuilding it on my desktop instead of my laptop… and was greeted by hardfaults due to null-pointer dereferences on the Raspberry Pi Pico on the board (I have configured the MPU to prevent access to the first 256 bytes specifically to catch null pointer dereferences). That was… bizarre, as the code worked fine if I built it on my laptop.

After some debugging, I noticed the problem: for some reason, stdout, which is one element of an array of three __sFILE structs (defined in newlib/libc/include/sys/reent.h), looked… like it was shifted over by 4 bytes. So as a sanity check, I printed the sizeof() of the struct while debugging… and found something I have honestly never seen before. If I printed the sizeof() from within newlib code, I got a size of 100 bytes. But if I printed it from my own application, I got 104 bytes. Well, that would explain the 4 byte offset, but… why was there a discrepancy?

Taking a look at the reent.h header in newlib, I noticed something in the __sFILE struct: one element is ifndef __SINGLE_THREAD__’d, specifically a lock for multi-threading… and then it clicked. I built my newlib for cross-compilation using Gentoo’s crossdev, and by default it builds newlib for single-threading, which forces newlib’s build system to define the __SINGLE_THREAD__ macro. However, neither my own application nor the many CMake layers used by the pico-sdk define __SINGLE_THREAD__ anywhere, leading my application to interpret this structure, defined in the header, as having that lock field. This led to a struct definition conflict between my application and newlib’s static library.
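The effect of that one conditional field can be reproduced with two stand-in structs. This is a simplified model, not the real __sFILE: the struct and function names are hypothetical, the shared fields are collapsed into one 100-byte array, and the lock is placed at the end and sized as a 4-byte value, as it would be on a 32-bit target.

```c
#include <stddef.h>
#include <stdint.h>

/* What newlib's own object files see when built with __SINGLE_THREAD__:
 * no lock field, 100 bytes. */
struct file_single_thread {
    uint32_t common[25];  /* stand-in for the fields both sides agree on */
};

/* What the application sees when it includes the same header without
 * __SINGLE_THREAD__ defined: one extra field, 104 bytes. */
struct file_multi_thread {
    uint32_t common[25];
    uint32_t lock;        /* stand-in for the multi-threading lock */
};

/* Byte offset of element `index` in an array of FILE structs,
 * as computed by code compiled inside newlib. */
size_t offset_seen_by_newlib(size_t index)
{
    return index * sizeof(struct file_single_thread);
}

/* The same offset as computed by the application. */
size_t offset_seen_by_app(size_t index)
{
    return index * sizeof(struct file_multi_thread);
}
```

For the second element of the array the two sides disagree by 4 bytes, and for the third by 8, so the application reads each stream's fields from the wrong place even though both sides compiled the same header.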

My fix was to simply rebuild newlib with multi-threading enabled, as this is apparently the default used by upstream anyway. Also, the reason the firmware built on my laptop worked was that I just happened to have enabled newlib multi-threading there… for reasons I no longer remember. In theory, one could also just define the __SINGLE_THREAD__ macro when building the application, and that should work as well.