Categories
Embedded Systems Engineering

Newlib, FreeRTOS, and the curse of __SINGLE_THREAD__

tl;dr If newlib is built with --disable-newlib-multithread, its object files are compiled with __SINGLE_THREAD__ defined, so any application linking against newlib must also define __SINGLE_THREAD__; otherwise some structs in newlib’s headers will have different fields in newlib and in the application. Or just rebuild newlib with --enable-newlib-multithread.

I’ve been working on a personal project of mine. At this point, revision A of the hardware works, and the firmware also works. But getting to this point wasn’t exactly easy. As expected of embedded software development, there are a lot of potholes in the road.

I thought I had finished my basic firmware implementation last night, and then I tried rebuilding it on my desktop instead of my laptop… and was greeted by hardfaults due to null-pointer dereferences on the Raspberry Pi Pico on the board (I have configured the MPU to prevent access to the first 256 bytes specifically to catch null-pointer dereferences). That was… bizarre, as the code worked fine if I built it on my laptop.

After some debugging, I noticed the problem– for some reason, the stdout, which is part of an array of three elements of the type __sFILE struct (defined in newlib/libc/include/sys/reent.h) looked… like it was shifted over by 4 bytes. So as a sanity check, I printed the sizeof() of the struct while debugging… and found something I have honestly never seen before. If I printed the sizeof() from within newlib code, I got a size of 100 bytes. But if I printed it from my own application, I got 104 bytes. Well, that would explain the 4 byte offset, but… why was there a discrepancy?

Taking a look at the reent.h header in newlib, I noticed something in the __sFILE struct: one element is ifndef __SINGLE_THREAD__’d, specifically a lock for multi-threading… and then it clicked. I built my newlib for cross-compilation using Gentoo’s crossdev, and by default it builds newlib for single-threading, which forces newlib’s build system to define the __SINGLE_THREAD__ macro. However, neither my own application nor the many CMake layers used by the pico-sdk define __SINGLE_THREAD__ anywhere, leading my application to interpret this structure, defined in the header, as having that lock field. This led to a struct definition conflict between my application and newlib’s static library.

My fix was to simply re-build newlib with multi-threading enabled, as this is apparently the default used by upstream anyway. Also, the reason the firmware built on my laptop worked was that I just happened to have enabled newlib multi-threading… for reasons I no longer remember. In theory, one could also just define the __SINGLE_THREAD__ macro themselves when building the application, and that should work as well.
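The mismatch is easy to reproduce on a host machine. The sketch below is a stand-in, not newlib’s actual code: demo_file is a made-up struct with one field guarded by #ifndef __SINGLE_THREAD__, sizeof_demo is a hypothetical helper name, and cc (or whatever $CC points to) is assumed to be a working host C compiler.

```sh
#!/bin/sh
# Sketch: show how one header yields two struct sizes depending on
# whether __SINGLE_THREAD__ is defined. demo_file is a hypothetical
# stand-in for newlib's __sFILE, not the real definition.
sizeof_demo() {
    tmp=$(mktemp -d) || return 1
    cat > "$tmp/demo.c" <<'EOF'
#include <stdio.h>

struct demo_file {
    int flags;
    int fd;
#ifndef __SINGLE_THREAD__
    int lock; /* only present in multi-threaded builds */
#endif
};

int main(void) {
    printf("%u\n", (unsigned)sizeof(struct demo_file));
    return 0;
}
EOF
    # "Library" view: built the way a single-threaded newlib would be.
    "${CC:-cc}" -D__SINGLE_THREAD__ -o "$tmp/lib_view" "$tmp/demo.c" || return 1
    # "Application" view: same source, macro not defined.
    "${CC:-cc}" -o "$tmp/app_view" "$tmp/demo.c" || return 1
    echo "library view:     $("$tmp/lib_view") bytes"
    echo "application view: $("$tmp/app_view") bytes"
    rm -rf "$tmp"
}
```

Calling sizeof_demo should print two sizes that differ by the size of the guarded field, which is the same kind of disagreement I saw between newlib and my firmware.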

Categories
Gentoo Linux

Gentoo’s sets.conf DateSet

I wanted to write this blog entry because this isn’t the first time I’ve rediscovered this bit of Gentoo functionality, and I suspect it will not be the last.

Recently I upgraded GCC from sys-devel/gcc:12 to sys-devel/gcc:13, and because I build most of my system with LTO, “bad things” started happening (mostly LTO bytecode mismatches when building newer packages). I found myself wanting to rebuild all of the packages built prior to gcc:13 being installed… and remembered that at some point I had figured out this exact problem on an older system of mine through the use of Gentoo portage sets.

Gentoo’s portage’s set support is quite extensive, including a bunch of set classes that can control how packages are selected (like AgeSet and CategorySet). My goal was to make a set, like @older_gcc, that contained all of the packages built before the current gcc. I briefly considered using AgeSet, which lets you specify a number of days ago, and then you specify whether to select packages newer than that day, or older. Something like:

[older_gcc]
class = portage.sets.dbapi.AgeSet
age = 1
mode = older

But that configuration doesn’t cover any packages that might have been installed today before gcc:13. Also, I’ll probably forget to run this next time gcc does a major update, and I’ll have to update the age parameter.

It was here that I remembered doing something else many years ago, but I could not find anything relevant in the portage documentation. After digging through the portage source code, I found DateSet. This class allows one to specify a package as a point of reference! This is exactly what I wanted. As such, I’ve now added the following to my /etc/portage/sets.conf :

[older_gcc]
class = portage.sets.dbapi.DateSet
package = sys-devel/gcc

As far as I can tell from checking with emerge -pv @older_gcc, everything looks to be selected properly.
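To sanity-check what the set selects, the same comparison can be approximated by hand. This is a rough sketch, not how Portage implements DateSet: it assumes the installed-package database lives under /var/db/pkg and that each package directory there contains a BUILD_TIME file holding a Unix timestamp. The older_than function name is made up for illustration.

```sh
#!/bin/sh
# Sketch: list installed packages whose BUILD_TIME is older than a
# reference timestamp. older_than is a hypothetical helper, not part of
# Portage.
# usage: older_than <vdb-root> <reference-unix-timestamp>
older_than() {
    vdb=$1
    ref=$2
    for f in "$vdb"/*/*/BUILD_TIME; do
        [ -f "$f" ] || continue
        # BUILD_TIME is assumed to hold a single Unix timestamp.
        built=$(cat "$f")
        if [ "$built" -lt "$ref" ]; then
            pkgdir=${f%/BUILD_TIME}
            # Print category/package-version relative to the vdb root.
            echo "${pkgdir#"$vdb"/}"
        fi
    done
}
```

For example, older_than /var/db/pkg "$(cat /var/db/pkg/sys-devel/gcc-[version]/BUILD_TIME)" (with the bracketed part replaced by the installed gcc:13 version) should print roughly the same packages that emerge -pv @older_gcc shows.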

Hopefully this is the last time I rediscover this bit of functionality…

Categories
Engineering Linux

Porting postmarketOS to the Pixel 6A

I’m not sure yet how this series is going to turn out, but I’ll likely have to make multiple posts to outline all of the insanity that I’ve had to wade through to get postmarketOS working with the Pixel 6A.

Goal: Repurposing smartphones

This is work I’ve been doing in collaboration with Jen Switzer in the PATLab at UC San Diego. I was tapped to help with the project due to my background in beating my head against Linux until either I or it submit.

The goal of this particular effort is to repurpose smartphones as a way to reduce the overall carbon footprint of computing. For a more thorough explanation, I’ll refer anyone interested to Jen’s open-access paper from ASPLOS 2023 (which won a Distinguished Paper Award!).

Prerequisites

  • Google Pixel 6A that is bootloader unlocked
  • Basic understanding of Linux CLI and Android
  • A computer (or VM) running Linux, and only Linux: no Mac OS X (and maybe WSL???)

Setting up pmOS development environment

I recommend following the pmOS wiki for installing pmbootstrap. If you’re using Gentoo, pmbootstrap may be available from the GURU repository, but note that the version must be 1.52 or greater (at the time of this writing, GURU only had 1.51).

For Gentoo Linux, installation (if GURU has the right version…) looks like:

# pmbootstrap ebuild is in GURU overlay, but it's not the latest.
# It must be at least version 1.52
sudo eselect repository enable guru
sudo emerge -av pmbootstrap

Once pmbootstrap is installed, we need to initialize the pmOS working directory, which we can do with:

pmbootstrap init

One of the first questions pmbootstrap will ask you is where to set up a working directory. You can pick any directory, but it is critical that the working directory for pmbootstrap is on a Linux filesystem (e.g. ext4 or BTRFS); NTFS, FAT32, or exFAT won’t work. The reason for this is that the pmOS build system uses chroot and other Linux-isms that non-native Linux filesystems do not handle well.
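A quick way to check a candidate directory before running pmbootstrap init is sketched below. The helper names fs_is_unsuitable and check_workdir are made up, the list of filesystem type names is an assumption and not exhaustive, and findmnt from util-linux is assumed to be installed.

```sh
#!/bin/sh
# Sketch: warn if a candidate pmbootstrap work directory sits on a
# filesystem that is not a native Linux one. Both functions are
# hypothetical helpers; the type names below are assumptions.
fs_is_unsuitable() {
    case "$1" in
        # fuseblk is what NTFS mounted through a FUSE driver usually
        # reports, but other FUSE block filesystems report it too.
        vfat|msdos|exfat|ntfs|ntfs3|fuseblk) return 0 ;;
        *) return 1 ;;
    esac
}

# usage: check_workdir <directory>
check_workdir() {
    # findmnt prints the filesystem type of the mount that contains the
    # given path.
    fstype=$(findmnt -n -o FSTYPE -T "$1") || return 1
    if fs_is_unsuitable "$fstype"; then
        echo "warning: $1 is on $fstype; pick a native Linux filesystem" >&2
        return 1
    fi
    echo "$1 is on $fstype"
}
```

Running check_workdir ~/.local/var on the directory you intend to use should report the filesystem type, and fail with a warning if it is one of the types listed above.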

If you’ve done anything with pmOS in the past, you may have a file in ~/.config/pmbootstrap.cfg that may need to be deleted, if you want to start completely from scratch/move the postmarketOS folder.

pmbootstrap init will ask a lot of questions. Here are most of the questions the installer will ask (extra spaces added to help the individual prompts stand out):

Location of the 'work' path. Multiple chroots (native, device arch, device rootfs) will be created in there.
Work path [/home/gabriel/.local/var/pmbootstrap]:

Choose the postmarketOS release channel.
Available (7):
* edge: Rolling release / Most devices / Occasional breakage: https://postmarketos.org/edge
* v22.12: Latest release / Recommended for best stability
* v22.06: Old release (unsupported)
* v21.12: Old release (unsupported)
* v21.06: Old release (unsupported)
* v21.03: Old release (unsupported)
* v20.05: Old release (unsupported)
Channel [edge]:

Choose your target device vendor (either an existing one, or a new one for porting).
Available vendors (76): acer, alcatel, amazon, amediatech, apple, ark, arrow, asus, beelink, bq, cubietech, cutiepi, dongshanpi, essential, fairphone, finepower, fly, goclever, google, gp, hisense, htc, huawei, infocus, jolla, klipad, kobo, lark, leeco, lenovo, lg, mangopi, medion, meizu, microsoft, mobvoi, motorola, nextbit, nobby, nokia, nvidia, odroid, oneplus, oppo, ouya, pegatron, pine64, planet, purism, qemu, radxa, raspberry, samsung, semc, sharp, shift, sipeed, sony, sourceparts, surftab, t2m, tablet, tokio, tolino, trekstor, vernee, videostrong, volla, wexler, wiko, wileyfox, xiaomi, xunlong, yu, zte, zuk
Vendor [qemu]:

Available codenames (39): bob, burnet, cozmo, crosshatch, damu, dru, druwl, dumo, elm, fennel, fennel14, glass, hana, juniper, kakadu, kappa, katsu, kenzo, kevin, kodama, krane, makomo, nyan-big, nyan-blaze, peach-pi, peach-pit, sargo, snow, spring, veyron-fievel, veyron-jaq, veyron-jerry, veyron-mickey, veyron-mighty, veyron-minnie, veyron-speedy, veyron-tiger, willow, x64cros
Device codename:

[The following appears if you add a new codename-- in this case Vendor google, codename bluejay]
You are about to do a new device port for 'google-bluejay'.
Continue? (y/n) [y]:

Device architecture (armv7/aarch64/x86_64/x86/riscv64) [armv7]:

Who produced the device (e.g. LG)?
Manufacturer:

What is the official name (e.g. Google Nexus 5)?
Name:

In what year was the device released (e.g. 2012)?
Year:

What type of device is it?
Valid types are: desktop, laptop, convertible, server, tablet, handset, watch, embedded, vm
Chassis:

Does the device have a hardware keyboard? (y/n) [n]:

Does the device have a sdcard or other external storage medium? (y/n) [n]:

Which flash method does the device support?
Flash method (0xffff/fastboot/heimdall/none/rkdeveloptool/uuu) [0xffff]:

You can analyze a known working boot.img file to automatically fill out the flasher information for your deviceinfo file. Either specify the path to an image or press return to skip this step (you can do it later with 'pmbootstrap bootimg_analyze').
Path:

Username [user]:

Available user interfaces (14):
* none: Bare minimum OS image for testing and manual customization. The "console" UI should be selected if a graphical UI is not desired.
* asteroid: (Wayland) Smartwatch UI from AsteroidOS
* console: Console environment, with no graphical/touch UI
* fbkeyboard: Plain framebuffer console with touchscreen keyboard support
* framebufferphone: Minimalist framebuffer menu/keyboard UI accessible via touch/volume keys & compatible scripts
* gnome: (Wayland) Gnome Shell
* gnome-mobile: (Wayland) Gnome Shell patched to adapt better to phones (Experimental)
* i3wm: (X11) Tiling WM (keyboard required)
* lxqt: (X11) Lightweight Qt Desktop Environment (stylus recommended)
* mate: (X11) MATE Desktop Environment, fork of GNOME2 (stylus recommended)
* plasma-desktop: (X11/Wayland) KDE Desktop Environment (works well with tablets)
* shelli: Plain console with touchscreen gesture support
* sxmo-de-dwm: Simple Mobile: Mobile environment based on SXMO and running on dwm
* sxmo-de-sway: Simple Mobile: Mobile environment based on SXMO and running on sway
* xfce4: (X11) Lightweight desktop (stylus recommended)
NOTE: 6 user interfaces are not available. If device supports GPU acceleration, set "deviceinfo_gpu_accelerated" to make UIs available. See: <https://wiki.postmarketos.org/wiki/Deviceinfo_reference>
User interface [weston]:

Additional options: extra free space: 0 MB, boot partition size: 256 MB, parallel jobs: 17, ccache per arch: 5G, sudo timer: False, mirror: http://mirror.postmarketos.org/postmarketos/
Change them? (y/n) [n]:

Additional packages that will be installed to rootfs. Specify them in a comma separated list (e.g.: vim,file) or "none"
Extra packages [none]:

Your host timezone:

Use this timezone instead of GMT? (y/n) [y]:

Available locales (14): C.UTF-8, ch_DE.UTF-8, de_CH.UTF-8, de_DE.UTF-8, en_GB.UTF-8, en_US.UTF-8, es_ES.UTF-8, fr_FR.UTF-8, it_IT.UTF-8, nb_NO.UTF-8, nl_NL.UTF-8, pt_BR.UTF-8, ru_RU.UTF-8, sv_SE.UTF-8
Choose default locale for installation [C.UTF-8]:

Device hostname (short form, e.g. 'foo') [google-bluejay]:

Would you like to copy your SSH public keys to the device? (y/n) [n]:

Build outdated packages during 'pmbootstrap install'? (y/n) [y]:

Here is an example of me filling out the responses as I’m setting up my build for the Pixel 6A:

$ pmbootstrap init
[14:47:14] Location of the 'work' path. Multiple chroots (native, device arch, device rootfs) will be created in there.
[14:47:14] Work path [/home/gabriel/.local/var/pmbootstrap]: ~/.local/var/bluejay
[14:47:20] Setting up the native chroot and cloning the package build recipes (pmaports)...
[14:47:20] Clone git repository: https://gitlab.com/postmarketOS/pmaports.git
Cloning into '/home/gabriel/.local/var/bluejay/cache_git/pmaports'...
[14:47:25] NOTE: pmaports path: /home/gabriel/.local/var/bluejay/cache_git/pmaports
[14:47:25] Choose the postmarketOS release channel.
[14:47:25] Available (7):
[14:47:25] * edge: Rolling release / Most devices / Occasional breakage: https://postmarketos.org/edge
[14:47:25] * v22.12: Latest release / Recommended for best stability
[14:47:25] * v22.06: Old release (unsupported)
[14:47:25] * v21.12: Old release (unsupported)
[14:47:25] * v21.06: Old release (unsupported)
[14:47:25] * v21.03: Old release (unsupported)
[14:47:25] * v20.05: Old release (unsupported)
[14:47:25] Channel [edge]: 
[14:47:27] Choose your target device vendor (either an existing one, or a new one for porting).
[14:47:27] Available vendors (76): acer, alcatel, amazon, amediatech, apple, ark, arrow, asus, beelink, bq, cubietech, cutiepi, dongshanpi, essential, fairphone, finepower, fly, goclever, google, gp, hisense, htc, huawei, infocus, jolla, klipad, kobo, lark, leeco, lenovo, lg, mangopi, medion, meizu, microsoft, mobvoi, motorola, nextbit, nobby, nokia, nvidia, odroid, oneplus, oppo, ouya, pegatron, pine64, planet, purism, qemu, radxa, raspberry, samsung, semc, sharp, shift, sipeed, sony, sourceparts, surftab, t2m, tablet, tokio, tolino, trekstor, vernee, videostrong, volla, wexler, wiko, wileyfox, xiaomi, xunlong, yu, zte, zuk
[14:47:27] Vendor [qemu]: google
[14:47:29] Available codenames (39): bob, burnet, cozmo, crosshatch, damu, dru, druwl, dumo, elm, fennel, fennel14, glass, hana, juniper, kakadu, kappa, katsu, kenzo, kevin, kodama, krane, makomo, nyan-big, nyan-blaze, peach-pi, peach-pit, sargo, snow, spring, veyron-fievel, veyron-jaq, veyron-jerry, veyron-mickey, veyron-mighty, veyron-minnie, veyron-speedy, veyron-tiger, willow, x64cros
[14:47:29] Device codename: bluejay
[14:47:31] You are about to do a new device port for 'google-bluejay'.
[14:47:31] Continue? (y/n) [y]: y
[14:47:33] Generating new aports for: google-bluejay...
[14:47:33] Device architecture (armv7/aarch64/x86_64/x86/riscv64) [armv7]: aarch64
[14:47:36] Who produced the device (e.g. LG)?
[14:47:36] Manufacturer: Google
[14:47:38] What is the official name (e.g. Google Nexus 5)?
[14:47:38] Name: Pixel 6A
[14:47:40] In what year was the device released (e.g. 2012)?
[14:47:40] Year: 2022
[14:47:41] What type of device is it?
[14:47:41] Valid types are: desktop, laptop, convertible, server, tablet, handset, watch, embedded, vm
[14:47:41] Chassis: handset
[14:47:44] Does the device have a hardware keyboard? (y/n) [n]: n
[14:47:46] Does the device have a sdcard or other external storage medium? (y/n) [n]: n
[14:47:47] Which flash method does the device support?
[14:47:47] Flash method (0xffff/fastboot/heimdall/none/rkdeveloptool/uuu) [0xffff]: fastboot
[14:47:49] You can analyze a known working boot.img file to automatically fill out the flasher information for your deviceinfo file. Either specify the path to an image or press return to skip this step (you can do it later with 'pmbootstrap bootimg_analyze').
[14:47:49] Path: 
[14:47:51] *** pmaport generated: /home/gabriel/.local/var/bluejay/cache_git/pmaports/device/testing/device-google-bluejay
[14:47:51] *** pmaport generated: /home/gabriel/.local/var/bluejay/cache_git/pmaports/device/testing/linux-google-bluejay
[14:47:51] Username [user]: 
[14:47:55] Available user interfaces (14): 
[14:47:55] * none: Bare minimum OS image for testing and manual customization. The "console" UI should be selected if a graphical UI is not desired.
[14:47:55] * asteroid: (Wayland) Smartwatch UI from AsteroidOS
[14:47:55] * console: Console environment, with no graphical/touch UI
[14:47:55] * fbkeyboard: Plain framebuffer console with touchscreen keyboard support
[14:47:55] * framebufferphone: Minimalist framebuffer menu/keyboard UI accessible via touch/volume keys & compatible scripts
[14:47:55] * gnome: (Wayland) Gnome Shell
[14:47:55] * gnome-mobile: (Wayland) Gnome Shell patched to adapt better to phones (Experimental)
[14:47:55] * i3wm: (X11) Tiling WM (keyboard required)
[14:47:55] * lxqt: (X11) Lightweight Qt Desktop Environment (stylus recommended)
[14:47:55] * mate: (X11) MATE Desktop Environment, fork of GNOME2 (stylus recommended)
[14:47:55] * plasma-desktop: (X11/Wayland) KDE Desktop Environment (works well with tablets)
[14:47:55] * shelli: Plain console with touchscreen gesture support
[14:47:55] * sxmo-de-dwm: Simple Mobile: Mobile environment based on SXMO and running on dwm
[14:47:55] * sxmo-de-sway: Simple Mobile: Mobile environment based on SXMO and running on sway
[14:47:55] * xfce4: (X11) Lightweight desktop (stylus recommended)
[14:47:55] NOTE: 6 user interfaces are not available. If device supports GPU acceleration, set "deviceinfo_gpu_accelerated" to make UIs available. See: <https://wiki.postmarketos.org/wiki/Deviceinfo_reference>
[14:47:55] User interface [weston]: gnome
[14:47:58] Additional options: extra free space: 0 MB, boot partition size: 256 MB, parallel jobs: 17, ccache per arch: 5G, sudo timer: False, mirror: http://mirror.postmarketos.org/postmarketos/
[14:47:58] Change them? (y/n) [n]: 
[14:48:00] Additional packages that will be installed to rootfs. Specify them in a comma separated list (e.g.: vim,file) or "none"
[14:48:00] Extra packages [none]: 
[14:48:01] Your host timezone: America/Los_Angeles
[14:48:01] Use this timezone instead of GMT? (y/n) [y]: 
[14:48:01] Available locales (14): C.UTF-8, ch_DE.UTF-8, de_CH.UTF-8, de_DE.UTF-8, en_GB.UTF-8, en_US.UTF-8, es_ES.UTF-8, fr_FR.UTF-8, it_IT.UTF-8, nb_NO.UTF-8, nl_NL.UTF-8, pt_BR.UTF-8, ru_RU.UTF-8, sv_SE.UTF-8
[14:48:01] Choose default locale for installation [C.UTF-8]: 
[14:48:02] Device hostname (short form, e.g. 'foo') [google-bluejay]: 
[14:48:03] Would you like to copy your SSH public keys to the device? (y/n) [n]: 
[14:48:05] After pmaports are changed, the binary packages may be outdated. If you want to install postmarketOS without changes, reply 'n' for a faster installation.
[14:48:05] Build outdated packages during 'pmbootstrap install'? (y/n) [y]: 
[14:48:06] WARNING: The chroots and git repositories in the work dir do not get updated automatically.
[14:48:06] Run 'pmbootstrap status' once a day before working with pmbootstrap to make sure that everything is up-to-date.
[14:48:06] DONE!

At this point, the build environment is mostly set up. You can check the status by calling pmbootstrap status:

$ pmbootstrap status
[14:48:42] *** CONFIG ***
[14:48:42] Device: google-bluejay (aarch64, "Google Pixel 6A")
[14:48:42] User Interface: gnome
[14:48:42] 
[14:48:42] *** GIT REPOS ***
[14:48:42] Path: /home/gabriel/.local/var/bluejay/cache_git
[14:48:42] - pmaports (master)
[14:48:42] 
[14:48:42] *** CHECKS ***
[14:48:42] [NOK] pmaports: workdir is not clean
[14:48:42] 
[14:48:42] *** CHECKLIST ***
[14:48:42] - pmaports: consider cleaning your workdir
[14:48:42] - Run 'pmbootstrap status' to verify that all is resolved

The message about the workdir not being clean is due to the creation of two folders in the pmaports folder for our new device— this is expected.

I’ve already done some work in getting the right parameters for the deviceinfo of the Pixel 6A port, and some patching of the initramfs that we’ll need to boot the stock Android GKI. We need to change the pmaports repository remote to point to my repository and branch.

cd ~/.local/var/bluejay/cache_git/pmaports
git clean -f device/testing/linux-google-bluejay/ device/testing/device-google-bluejay/
git remote set-url origin https://gitlab.com/gemarcano/pmaports.git
git remote add upstream https://gitlab.com/postmarketOS/pmaports.git
git fetch
git fetch upstream
git checkout google-bluejay

I have not upstreamed my changes to pmOS yet, as they are too experimental. Perhaps some day this step of using my fork won’t be necessary.

Finally, at this point you should be able to build the device package, and prepare the rootfs:

pmbootstrap build device-google-bluejay
pmbootstrap install

If you’ve already gone through these instructions, and you want to update the repository, cd to the pmaports directory and perform a git pull. Afterwards, run pmbootstrap build --force device-google-bluejay to force a rebuild of that package.

If the build fails, check the reason by using pmbootstrap log.

The following commands should only be necessary if you modify the APKBUILD files for the device or kernel packages:

pmbootstrap checksum device-google-bluejay
pmbootstrap checksum linux-google-bluejay

mkbootimg and avbtool

Google uses a custom image format for their Android flashable images, which are generated using a toolkit called mkbootimg. This toolkit includes mkbootimg.py itself, which is used to make boot.img and vendor_boot.img files, and unpack_bootimg.py, which is used to unpack existing boot and vendor_boot images.

Additionally, my experimentation on the 6A has identified that the bootloader, even when unlocked, does not like booting boot partitions that do not have an AVB signature. These signatures are generated using a tool called avbtool. The source code for the tools can be downloaded using the following commands:

git clone https://android.googlesource.com/platform/system/tools/mkbootimg
git clone https://android.googlesource.com/platform/external/avb

Unpacking boot.img, vendor_boot.img, and initramfs’s

We now need to unpack the stock boot and vendor_boot images. The following commands can be used to unpack boot.img and/or vendor_boot.img files. Replace anything inside square brackets with the paths specified by the contents of the brackets:

BOOT_IMG_DEST=[directory-to-unpack-boot.img]
./mkbootimg/unpack_bootimg.py --boot_img [path-to-boot.img] --out "$BOOT_IMG_DEST"

When unpacking a boot.img, there should be two files in the output directory: a kernel, and an initramfs. For the Pixel 6A, this kernel should be the GKI, and the initramfs contains a first stage initramfs described in Google documentation (but no kernel modules whatsoever). This is the initramfs that we’ll be replacing.

We need to extract the vendor modules from the vendor_boot partition:

VENDOR_BOOT_IMG_DEST=[directory-to-unpack-vendor_boot.img]
./mkbootimg/unpack_bootimg.py --boot_img [path-to-vendor_boot.img] --out "$VENDOR_BOOT_IMG_DEST"
cd "$VENDOR_BOOT_IMG_DEST"
# The stock vendor_boot has two initramfs, unpack them both
mkdir 00 01
lz4 -cd vendor_ramdisk00 | cpio -iv0 -D ./00
lz4 -cd vendor_ramdisk01 | cpio -iv0 -D ./01
cd ../

There should be two initramfs’s in the vendor_boot image, and we’re just unpacking them into different folders. The initramfs we’re really interested in is the second one, now in the 01 folder, as this one contains all of the modules we need to load to boot.

Preparing our custom initramfs

Unfortunately, the Android GKI is too barebones for the normal bootflow of pmOS. I have added patches to the init script in the initramfs that pmbootstrap creates to load the vendor modules in the right order, but we still need to copy over the modules we unpacked into the initramfs we’ll be using.

To do this, we need to extract the postmarketOS initramfs. First, check to see what the compression of the initramfs is:

$ file ~/.local/var/bluejay/chroot_rootfs_google-bluejay/boot/initramfs
/home/gabriel/.local/var/bluejay/chroot_rootfs_google-bluejay/boot/initramfs: gzip compressed data, original size modulo 2^32 3045376

The deviceinfo for the Pixel 6A that I’ve created asks for LZ4 legacy, but the current version of postmarketos-mkinitfs that actually generates the image has not been updated since my pull request was merged, so for now we always need to check what compression format it is. Thankfully, gzip and lz4 use the same command-line arguments when decompressing, so we can decompress and unpack the initramfs as follows:

mkdir pmos-initramfs
gzip -cd ~/.local/var/bluejay/chroot_rootfs_google-bluejay/boot/initramfs | cpio -iv -D ./pmos-initramfs

At this point we have the contents of the initramfs unpacked.
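Since the compression format has to be checked every time for now, the check and the unpacking can be folded into one helper. This is only a sketch: pick_decompressor and unpack_initramfs are made-up names, and the strings it matches are assumptions about how file words its output (the gzip wording is taken from the output above, the LZ4 wording is a guess at the common prefix).

```sh
#!/bin/sh
# Sketch: choose a decompressor from the description printed by file(1).
# pick_decompressor is a hypothetical helper; the matched strings are
# assumptions about file's wording.
pick_decompressor() {
    case "$1" in
        *"gzip compressed data"*) echo gzip ;;
        *"LZ4 compressed data"*)  echo lz4 ;;
        *) echo "unknown initramfs compression: $1" >&2; return 1 ;;
    esac
}

# usage: unpack_initramfs <initramfs> <destination-directory>
unpack_initramfs() {
    tool=$(pick_decompressor "$(file -b "$1")") || return 1
    mkdir -p "$2" || return 1
    # gzip and lz4 both accept -cd (decompress to stdout).
    "$tool" -cd "$1" | cpio -iv -D "$2"
}
```

With this, unpack_initramfs ~/.local/var/bluejay/chroot_rootfs_google-bluejay/boot/initramfs pmos-initramfs would do the same as the two commands above, whichever of the two formats the image happens to use.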

Loading kernel modules in the right order

For the Pixel 6A, Google uses its new Android GKI system for managing the Linux kernel. In summary, the core kernel is an extremely generic image with absolutely minimal hardware support, just enough to boot into an initramfs, and no more. This core kernel relies on vendor modules to bring up the rest of the hardware. We’ve already unpacked the modules, so it’s just a matter of copying them over to our unpacked pmOS initramfs, and packing it again using lz4 -l.

For some reason, the stock modules are in the $VENDOR_BOOT_IMG_DEST/01/lib/modules directory (as opposed to $VENDOR_BOOT_IMG_DEST/01/lib/modules/[kernel-version]). In order for modprobe to work properly in load-modules.sh, we need to move the modules to the right kernel version folder in our initramfs. To find out what the name of the folder is, we can grep our stock kernel for its version. We know that the stock kernel version for the Pixel 6A is the 5.10 LTS Linux kernel, so we can use that to help us find the full version string:

KERNEL_VERSION=$(strings "$BOOT_IMG_DEST"/kernel | grep -m1 -Po "5\.10.*? ")

On my system, this looks like:

$ echo $KERNEL_VERSION
5.10.149-android13-4-693021-g2f0d2b7f95c6-ab9667365
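Note that the grep pattern above ends in a space, so the captured value carries a trailing space that only disappears because $KERNEL_VERSION is expanded unquoted. A variant that avoids this is sketched below; extract_kernel_version is a made-up helper name, and it assumes the version string contains no spaces and that grep supports the -m and -o options.

```sh
#!/bin/sh
# Sketch: read strings(1) output on stdin and print the first token that
# starts with the given version prefix (default "5.10"), without any
# trailing space. extract_kernel_version is a hypothetical helper.
extract_kernel_version() {
    prefix=${1:-5.10}
    # Escape the dots so they match literally in the regular expression.
    pattern=$(printf '%s' "$prefix" | sed 's/\./\\./g')
    # -m1 stops at the first matching line, -o prints only the match;
    # head keeps the first match if that line contains several.
    grep -m1 -oE "${pattern}[^ ]*" | head -n 1
}
```

It would be used as KERNEL_VERSION=$(strings "$BOOT_IMG_DEST"/kernel | extract_kernel_version), after which "$KERNEL_VERSION" can be quoted safely in later commands.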

With this information, we can now copy the modules to our unpacked initramfs:

mkdir -p pmos-initramfs/lib/modules/$KERNEL_VERSION
cp -ra $VENDOR_BOOT_IMG_DEST/01/lib/modules/* pmos-initramfs/lib/modules/$KERNEL_VERSION

For reasons I have yet to understand, these modules on the 6A must be loaded in a very specific order, otherwise they hang the kernel. This is the reason I list all of the modules in a very specific order inside the load_modules.sh script. I derived the load order from grepping the dmesg of a working Android boot. This script should already be in the root directory of the initramfs.

Packing boot.img

Finally, we have everything in place to re-pack boot.img with the stock kernel and our modified initramfs:

cd pmos-initramfs
find . -print0 | cpio --null -o --format=newc | lz4 -l --best > ../postmarket_initramfs.cpio.lz
cd ../
./mkbootimg/mkbootimg.py --kernel "$BOOT_IMG_DEST"/kernel --ramdisk ./postmarket_initramfs.cpio.lz --header_version 4 -o pmos_bootimg.bin --pagesize 4096 --os_version 13.0.0 --os_patch_level 2022-03

I’m not sure if it’s necessary, but I’ve added the --os_patch_level and --os_version flags to the packing command.

Signing the boot.img

So, apparently the Pixel 6A bootloader, even when unlocked, will refuse to boot a boot.img that does not have a valid AVB signature. To work around this, we’re going to sign it with a random key. To generate the key:

openssl genrsa -out foobar.key 4096
openssl rsa -in foobar.key -pubout -out foobar.key.pub

And to sign the boot.img:

./avb/avbtool.py add_hash_footer --dynamic_partition_size --image pmos_bootimg.bin --key foobar.key --prop com.android.build.boot.fingerprint:'google/bluejay/bluejay:12/SD2A.220601.004.B2/8852801:user/release-keys' --prop com.android.build.boot.os_version:'13' --prop com.android.build.boot.security_patch:'2022-06-01' --algorithm 'SHA256_RSA4096' --rollback_index 1654041600 --partition_name boot

There are a lot of flags here that may not be necessary, but I copied from an existing boot.img, just in case.

Flashing pmOS

WARNING: From here on out, we are replacing partitions on the phone, and I haven’t tried undoing the changes to the super partition. I strongly recommend backing up the existing boot and super partitions somehow (I’ve backed these up in the past using adb pull with a rooted Android system).

At this point, we’re finally ready to flash the boot.img to the Pixel 6A. This is done with fastboot after booting into the bootloader (power off the phone, then turn it back on while holding down the volume-down button):

fastboot flash boot pmos_bootimg.bin

And lastly, we need to flash the root filesystem. I’ve configured the root filesystem to be installed in the super partition on the Pixel 6A instead of userdata:

pmbootstrap flasher flash_rootfs --partition super

At this point, pmOS should be installed on the phone and should work.

Configuring UART for debugging

One of the major perks of working with Google Pixel phones is that they all, so far, appear to expose the SoC UART pins through the USB port. Specifically, they co-opt the Alternate Mode pins on the USB port for UART TX and RX. The following command enables this functionality:

fastboot oem uart enable

You need:

  • A full-featured USB C cable (it must support Thunderbolt or DisplayPort Alt Mode, which guarantees the existence of the SBU/Alternate Mode pins)
  • Something like the Treedix USB 3.1 Type C Female to Female Pass Through Adapter Breakout Board 24 Pin for Date Line Wire Cable Transfer Extension
  • To be safe, a USB 2.0 USB C cable; these do not populate the Alternate Mode pins, so there’s no chance that the computer you’re connecting to will try to use them and do weird things while the UART is connected
  • 1.8V UART adapter, connected to the SBU pins for TX and RX.

Google open-sourced a hardware device to interface with the UART pins from Pixel phones, USB-Cereal. This cable could also work.

Which SBU pin is TX and which one is RX depends on how the USB C cable is plugged in. For my testing, I used an oscilloscope to identify which pin had TX data on it.

After the cables are all plugged in, I recommend using a program like minicom to connect to the plugged-in USB UART device. The Pixel 6A’s UART interface is configured for 115200 baud, 8-N-1 (8 data bits, no parity, 1 stop bit).

To disable UART:

fastboot oem uart disable

Finally, loaded completely into rootfs!

postmarketOS, by default, does not seem capable of driving the display for the Pixel 6A. To confirm that the phone booted, connect over USB to the phone. At least on Linux, doing this causes a new ethernet device to appear and to autoconfigure an IP address in the 172.16.42.0/24 range, with the phone itself at 172.16.42.1. To ssh into the phone, do:

ssh user@172.16.42.1

And type in the password provided during pmbootstrap install.

At this point, you should be able to SSH into the phone, but it may complain about missing a PTY or TTY. This is normal! The GKI kernel is missing support for virtual TTYs, which is required by most “normal” Linux systems. However, even without a fully functional session, you should still be able to invoke commands. Type a command, e.g. uname -a and press Enter to confirm that you have connected.

A workaround for this issue is to create the /dev/ptmx device (note that because we don’t have a real PTY, we have to use sudo -S, which shows the password being typed in, so make sure that no one is looking over your shoulder!):

sudo -S mknod /dev/ptmx c 5 2

After this, disconnect from SSH and reconnect, and things should work more normally. Sadly, this needs to be done on every reboot of the phone (one workaround is to make a local script that gets called on boot to set this file up).
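One way such a boot-time script could look is sketched below. It assumes the image runs OpenRC with the local service enabled, so that an executable file such as /etc/local.d/ptmx.start (a made-up name) is run as root at boot; this is untested on the Pixel 6A. ensure_ptmx is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch: create the pseudo-terminal multiplexer node if it is missing.
# ensure_ptmx is a hypothetical helper; 5 and 2 are the major and minor
# numbers from the mknod command above.
ensure_ptmx() {
    node=${1:-/dev/ptmx}
    # Nothing to do if a character device already exists at this path.
    [ -c "$node" ] && return 0
    mknod "$node" c 5 2
}
```

The script would end with a bare ensure_ptmx call. Checking for an existing node first keeps it harmless to run more than once, or on a kernel that already provides /dev/ptmx.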

The real solution for this mess is to make sure that the kernel has built-in support for a lot of stuff. This may be the topic of a future blog post, because compiling the kernel for the Pixel 6A is… interesting 🫠.

Configuring additional options

To turn on KVM:

fastboot oem pkvm enable

Troubleshooting

Resetting pmbootstrap chroots

Sometimes pmbootstrap became confused on my laptop (failures while trying to build, stating that it failed to find sh). A workaround is to restart the chroots, which can be done by doing:

pmbootstrap zap

Usually I didn’t need to wipe the chroots, but if you want to start from scratch, using zap and answering yes to its questions is how to do that.

telnet-ing into pmOS’s initramfs environment

If booting is working, but something is going wrong in the second half of loading from the initramfs, pmOS has a debug hook that sets up telnet access while in the initramfs. To set this up:

pmbootstrap initfs hook_add debug-shell

Note that you will need to unpack the generated pmOS initramfs and go through the process of copying the modules into the unpacked initramfs, re-packing boot.img, and re-signing boot.img. After doing that and flashing the boot.img, you should be able to telnet in:

telnet 172.16.42.1 23

Categories
Uncategorized

Well, it’s been a while

I can’t believe it’s been almost 3 years since I started working on my PhD. It’s been a very busy time, and research has taken up a massive amount of my time (as expected). My research journal idea required far more time than I anticipated, and between classes early on in my PhD and now research, I really don’t have the time.

However, I’ve been reflecting back on my embedded systems software engineering, and after enumerating all the random little things I’ve run into over the years, I think it might be worthwhile documenting some of those in a public blog for others to learn from my mistakes and really weird, corner case bugs I’ve run into.

I might also write about some of my (mis)adventures with Linux over the past 15 years (man, has it really been that long since I started using Linux?). Of course, it depends on how much time I want to spend writing vs doing what I’m supposed to be doing 🙂

Categories
Research Journal

The Toastboard

This is a summary and thoughts about the following paper:

The Toastboard: Ubiquitous Instrumentation and Automated Checking of Breadboarded Circuits
by Daniel Drew, Julie L. Newcomb, William McGrath, Filip Maksimovic, David Mellis, Bjoern Hartmann

Summary

The Toastboard is a breadboard that has diagnostic circuitry taped (yes, taped using electrical transfer tape) to its backplane to allow for real-time introspection of circuit performance. It was designed to be used to improve debugging while prototyping on a breadboard by providing real-time feedback on the hardware itself (via LEDs) and through a web interface (through an IDE).

The core of the implementation is based around a CC3200 Launchpad microcontroller and sets of cascading multiplexers. The basic idea is that the microcontroller has a limited number of ADC channels available, and there are a lot of rows, but it is possible to get a good estimate of the readings of each row if each row can be sampled quickly and many times a second. This is the approach taken for the Toastboard, where the MCU cycles through each row via the cascading multiplexers, at approximately 100 Hz for all rows, or around 1 kHz for a single row continuously.
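As a rough back-of-the-envelope sketch of what those two rates imply, the snippet below derives the time between readings of a given row and the Nyquist limit for each mode. The row count of 48 is my own assumption for illustration and is not taken from the paper; only the 100 Hz and 1 kHz figures come from the summary above.

```python
def scan_stats(per_row_rate_hz: float, rows: int = 1) -> dict:
    """Derive basic timing figures for a round-robin scan of `rows` rows."""
    return {
        # Total ADC conversions needed per second across all scanned rows.
        "conversions_per_s": per_row_rate_hz * rows,
        # Time between two consecutive readings of the same row.
        "row_period_ms": 1000.0 / per_row_rate_hz,
        # Highest signal frequency that can be sampled without aliasing.
        "nyquist_hz": per_row_rate_hz / 2.0,
    }


ASSUMED_ROWS = 48  # assumption for illustration, not a figure from the paper

all_rows = scan_stats(100.0, rows=ASSUMED_ROWS)  # scanning every row
single_row = scan_stats(1000.0, rows=1)          # one row continuously

print(all_rows)
print(single_row)
```

At 100 Hz per row, anything changing faster than 50 Hz cannot be captured faithfully, which fits the paper's framing of the system as better suited to steady-state measurements than to analog waveforms.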

The Toastboard was also designed to detect whether a given row is floating, which normally is something that would not be possible to tell by just sampling the row and checking its value. This was done by leveraging the pulldown resistor network in place at the sampling circuit, and connecting the row to be tested (via the multiplexers) to a float testing circuit. This new combined circuit forms a voltage divider with the row being tested, so that if it is floating, there will be a known voltage value at the MCU pin, but if it is grounded, the row will still sample a voltage of zero.
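To convince myself of how the divider behaves, here is a minimal sketch that models the ADC pin as a node fed by a pull-down resistor, an optional float-test pull-up, and the row itself. The resistor values (100 kΩ each), the 3.3 V test voltage, and the idea of comparing a reading with the test circuit on against one with it off are all my own assumptions, not details from the paper.

```python
import math


def node_voltage(v_row, r_src, v_test=3.3, r_test=100e3, r_pd=100e3,
                 test_on=True):
    """Voltage at the ADC pin, computed with Millman's theorem.

    v_row / r_src model whatever drives the row; r_src = math.inf means
    nothing is driving it (a floating row). All component values are
    assumed for illustration.
    """
    numerator = 0.0
    denominator = 1.0 / r_pd  # pull-down to ground adds conductance only
    if test_on:
        numerator += v_test / r_test
        denominator += 1.0 / r_test
    if not math.isinf(r_src):
        numerator += v_row / r_src
        denominator += 1.0 / r_src
    return numerator / denominator


def looks_floating(v_row, r_src, tol=0.1):
    """A floating row follows the test circuit; a driven row ignores it."""
    off = node_voltage(v_row, r_src, test_on=False)
    on = node_voltage(v_row, r_src, test_on=True)
    return abs(on - off) > tol


print(node_voltage(0.0, math.inf))  # floating row: the divider voltage
print(node_voltage(0.0, 1.0))       # row tied to ground through 1 ohm
print(looks_floating(1.65, 1.0))    # row driven at the divider voltage
```

Comparing the two readings is what keeps a row that is genuinely driven at the divider voltage from being misreported as floating in this model.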

An IDE was developed as a companion to the physical hardware. Its primary purpose is to help users identify problems with their circuits when the circuits do not behave as the IDE’s model expects. The IDE models hardware through the use of modules, where each module describes to the IDE what the component is and the kinds of errors it should check for.

There are known limitations with the Toastboard as it was implemented. Hardware-wise, there are a limited number of ground and voltage rails, the maximum voltage on the voltage rail is fixed at 3.3V, there is a small amount of current leakage to ground while sampling each row, and the use of electrically conductive tape to connect the breadboard to the supporting circuitry means that the system is sensitive to being picked up. On the software/firmware side, the sampling resolution is limited by the MCU’s sampling hardware and sampling rate (a trade-off between resolution and rate), and the IDE’s module-based checker can be confused by more complex circuits. The small current leakage likely means that this board will not be usable for extremely sensitive circuits, and the sampling resolution limits its ability to detect subtle changes in voltage.

Personal thoughts

Having just left industry to study in academia, the very first thing that stood out to me was the use of electrically conductive tape to hold the board to the sampling hardware. I mean, they were able to get it to work, but as they identified, it also means it can be unstable when being picked up and moved around. That said, if this were to be built for actual deployment, there are other ways to affix the boards together (it might be possible to solder everything together, or worst case scenario, use some kind of pressure harness to physically push the connecting parts together by force).

I really liked the cleverness of the design used to detect floating rows. That said, as far as I can tell, the more rows that could possibly be floating, the more time the MCU has to spend probing them, leading to a reduction in the overall sampling rate. I wonder if there is some way to do this comparison in parallel, or when not being sampled? This would require two parallel sets of multiplexers if using the current approach, though. Or would it? I’d have to think about it some more (with cascading multiplexers, it might be possible to reuse one of them so long as the other parallel process isn’t using one of the two in the cascade).

There was not a lot of information about the IDE in the paper, nor was I able to find a link to it or anything resembling it, so I was not able to see exactly what went into its function. From what I can tell, it allows users to drag and drop components on a virtual breadboard, and it then checks each module against known error cases. For example, if a resistor has the same voltage on both ends, then it is being shorted, it died, or one of its ends is floating, and the IDE will report that specific error along with a suggested fix. The biggest problem I can see with the approach of attempting to enumerate what can go wrong is that it is difficult if not impossible to list ALL cases in which a component can be mis-wired or fail. As the paper also mentions in passing, false positives are also possible when someone builds a circuit exploiting a more obscure property of a component (the paper states using an LED as a Zener diode would likely cause the IDE to throw a warning).
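Since I could not find the IDE itself, here is a purely hypothetical sketch of what a module-style check for the resistor example might look like. The names RowReading and check_resistor, the tolerance, and the message wording are all invented for illustration and do not reflect the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowReading:
    """One sampled breadboard row (hypothetical representation)."""
    volts: Optional[float]  # None means the row was detected as floating


def check_resistor(a: RowReading, b: RowReading, tol: float = 0.01) -> List[str]:
    """Return warnings for a resistor placed between rows a and b."""
    if a.volts is None or b.volts is None:
        return ["one end of the resistor is floating"]
    if abs(a.volts - b.volts) <= tol:
        # No measurable drop across the part: flag it for the user.
        return ["no voltage drop: resistor may be shorted or damaged"]
    return []


print(check_resistor(RowReading(3.3), RowReading(1.2)))   # healthy drop
print(check_resistor(RowReading(3.3), RowReading(3.3)))   # no drop
print(check_resistor(RowReading(3.3), RowReading(None)))  # floating end
```

Even this tiny rule shows the false-positive problem: a resistor that legitimately carries no current at the moment it is sampled would trigger the second warning.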

As far as I can tell, a lot of the hardware limitations are mostly engineering related. It should be possible to develop a power control system that can transform whatever input voltage is fed into the Vcc rail into whatever is necessary to run the actual MCU. I’m not terribly familiar with the available parts, but I’d be surprised if there aren’t ICs available that can do the sampling for the MCU and handle a much wider range of voltages than what the MCU pins can. I am curious as to why they ended up cascading multiplexers– maybe they could not find or could not source multiplexers with enough channels? The paper also mentions in passing that the sampling system was meant to support more steady-state testing than analog… I wonder what kind of analog signal degradation or distortion happens as it goes through the multiplexer cascade? I’ve had to deal with this on personal projects for multiplexers designed for digital applications, although taking a look at their multiplexers (TI SN74HC4066), it looks like this isn’t a problem for frequencies below 30 MHz (if I’m interpreting the spec sheet properly).

As an addition, I wonder what it would take to also be able to measure current? This would be more difficult as it would require MITM’ing the signal, and depending on the underlying circuitry, this could introduce extra capacitance and resistance that might change the behavior of a sensitive circuit implementation.

This is more difficult, but I wonder if they considered integrating the IDE with SPICE or some other circuit simulator to predict actual circuit behavior, instead of trying to detect predetermined errors.

Now that I’m thinking about it, it’s not clear how the Toastboard is communicating with the client computer, and who is hosting the web server that the client connects to. Is the MCU powerful enough to host a basic server while it is polling? Or is it sending information off to some other system which is actually serving the data? How is it connecting to other systems? Based on Figure 3 in the paper, I’m going to guess it’s connected to the network via the MCU devkit (looks like it has WiFi support) and it is hosting its own server that a client computer then connects to. No information is provided on what this server software is, what OS is running on the MCU, or anything related to the latencies or realtime requirements of such an OS or the applications running on the MCU.

I wish I had something like this when I was learning about electronics on my own; if nothing else, being able to see the voltages would have been a godsend. I’m personally less interested in the IDE’s automatic fix suggestions, partially due to what seems like it could be a non-negligible rate of false positives for more complex circuits. For simple circuits like those you’d find in educational settings, I think it’d be a great aid (I can even see it being used in classes with full warnings and error fixes during labs, and then only voltages being shown in a practicum).

Bibliography

  • Daniel Drew, Julie L. Newcomb, William McGrath, Filip Maksimovic, David Mellis, and Björn Hartmann. 2016. The Toastboard: Ubiquitous Instrumentation and Automated Checking of Breadboarded Circuits. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (UIST ’16). Association for Computing Machinery, New York, NY, USA, 677–686. DOI: https://doi.org/10.1145/2984511.2984566

Categories
Uncategorized

Hello world!!!

This blog will contain a journal of papers I’m reading as a student at UC San Diego to catch up and understand the bleeding edge of research in embedded systems and hardware acceleration.