Categories
Embedded Systems Engineering Gentoo Linux

Don’t build standard libraries with LTO unless you really want to suffer

I wanted to make this post more as a reminder to myself.

A few years ago I ran into an interesting issue while working with a cross-compiler for ARM on my Gentoo Linux systems. Effectively, I was getting a ton of libc undefined symbols errors. After a thread on the gcchelp mailing list, this was the summary of the outcome:

Gabriel Marcano:
Can newlib not be built with -flto?

...

Alexander Monakov:
It matters because the compiler recognizes several libc functions by name (including 'malloc'); recognizing such so-called "builtins" is necessary to optimize their uses. Unfortunately, builtins are different from common functions in LTO bytecode, and sometimes behave in unexpected ways. You can find examples searching GCC Bugzilla for stuff like "LTO built-in".

In your case the reference to built-in malloc is not visible to the linker when it invokes the plugin while scanning input files, so it does not unpack malloc.o from libc.a. This is a simple case of a more general problem where a compiler creates new references to built-in functions after optimizations (e.g. transforming 'printf("Hello\n")' to 'puts("Hello")'). Somehow it happens to work when libc is not slim-LTO (I guess it is rescanned), but rescanning for LTO code-generation is not implemented, as far as I know.

So no, apparently the tricks that GCC/LD make to optimize strlen, malloc, etc. make it next to impossible to properly enable LTO for the standard library.

For Gentoo this means remembering to disable LTO for newlib builds for cross-compilers. This is done by making a /etc/portage/env/no-lto.conf file:

CFLAGS="${CFLAGS} -fno-lto"
CXXFLAGS="${CXXFLAGS} -fno-lto"
FFLAGS="${FFLAGS} -fno-lto"

And then somewhere under /etc/portage/package.env/ making a file with:

cross-*/newlib no-lto.conf
Categories
Embedded Systems Engineering

I _WANT_IO_LONG_LONG, or sometimes Newlib doesn’t want long long integer types

I ran into an issue when trying to finish some implementation work for a paper that I have due in less than a week. Effectively, I was just trying to printf some long long types, and they were getting butchered. I reduced a reproducer on my Cortex-M4 device to the following line:

printf("%llu %ld\n", 1llu, 2l);

What I expected to see:

1 2

What I saw:

0 1

Ok… that’s weird. I did a quick google search, and the summary is that if you’re using newlib nano, well, newlib doesn’t support long long types in printf and its ilk. Well, I’m not using newlib nano for this program. Thinking about it a bit harder, this really looks like printf is only reading 4 bytes for the %llu type. I finally decided to go read the source code in newlib for some clues.

Reading through the printf implementation eventually led me here:

#ifndef _NO_LONGLONG
			if (flags & QUADINT)
				*GET_ARG (N, ap, quad_ptr_t) = ret;

OK, so there’s a check for this _NO_LONGLONG define. Where can it get set? Oh, apparently in the same file:

#define _NO_LONGLONG
#if defined _WANT_IO_LONG_LONG \
	&& (defined __GNUC__ || __STDC_VERSION__ >= 199901L)
# undef _NO_LONGLONG

OK, so it’s looking to see if _WANT_IO_LONG_LONG is defined. Apparently this is a configuration option, specifically --enable-newlib-io-long-long. Rebuilding newlib with that option finally lets printf work as intended.

Reading through the repository now that I had a configuration flag handy, I was able to find it in the README, where it states that this flag is disabled by default, but some hosts enable it in their configure.host files. Just something to keep in mind, I guess, when using newlib.

For those Gentoo users like me, to build newlib with this option, add EXTRA_ECONF="--enable-newlib-io-long-long" to an /etc/portage/env/ file and have your cross-*/newlib of choice use it in /etc/portage/packages.env/ .

Categories
Embedded Systems Engineering

Newlib, FreeRTOS, and the curse of __SINGLE_THREAD___

tl;dr If newlib is built using --disable-newlib-multithread, its object files will have been built using __SINGLE_THREAD__, so any application linking to newlib must also define __SINGLE_THREAD__ or some newlib headers will have structs with different fields between newlib and the application. Or just rebuild newlib with --enable-newlib-multithread.

I’ve been working on a personal project of mine. At this point, revision A of the hardware works, and the firmware also works. But getting to this point wasn’t exactly easy. As expected of embedded software development, there are a lot of potholes in the road.

I thought I had finished my basic firmware implementation last night, and then I tried rebuilding it on my desktop instead of my laptop… and was greeted by hardfaults due to null-pointer dereferences on the Raspberry Pi Pico on the board (I have configured to MPU to prevent access to the first 256 bytes specifically to catch null pointer dereferences). That was… bizarre, as the code worked fine if I built it on my laptop.

After some debugging, I noticed the problem– for some reason, the stdout, which is part of an array of three elements of the type __sFILE struct (defined in newlib/libc/include/sys/reent.h) looked… like it was shifted over by 4 bytes. So as a sanity check, I printed the sizeof() of the struct while debugging… and found something I have honestly never seen before. If I printed the sizeof() from within newlib code, I got a size of 100 bytes. But if I printed it from my own application, I got 104 bytes. Well, that would explain the 4 byte offset, but… why was there a discrepancy?

Taking a look at the reent.h header in newlib, I noticed something in the __sFILE struct– one element is ifndef __SINGLE_THREAD__’d, specifically a lock for multi-threading… and then it clicked. I built my newlib for cross-compilation using Gentoo’s crossdev, and by default it builds newlib for single-threading, which forces newlib’s build systems to define the__SINGLE_THREAD__ macro. However, my own application nor the many CMake layers used by the pico-sdk do not define __SINGLE_THREAD__ anywhere, leading to my application to interpret this structure, defined in the header, as having that lock field. This led to a struct definition conflict between my application and newlib’s static library.

My fix was to simply re-build newlib with multi-threading enabled, as this is apparently the default used by upstream anyway. Also, the reason the firmware built on my laptop worked was that I just happened to have enabled newlib multi-threading… for reasons I no longer remember. In theory, one could also just define the __SINGLE_THREAD__ macro themselves when building the application, and should work as well.