Categories
Embedded Systems Engineering

How is the Sparkfun Artemis Configured to Boot?

tl;dr INFO0 is configured by Sparkfun. SRAM is cleared, GPIO47 is used to select whether ASB runs and accepts data over UART, and if not the bootloader jumps to 0xC000.

I’ve been working with the Sparkfun Artemis module and the Redboard Artemis ATP board for research, due to the Ambiq Apollo3 microcontroller in use. There is relatively little documentation on the requirements of user programs to successfully take over after the Apollo3 secure bootloader finishes running. There’s also not much out there on what this secure bootloader actually does (and I wish we could shut it off and just jump immediately to user code…). The goal of this post is to summarize a few of my findings.

The secure bootloader can be configured to some extent through parameters stored in a special region of flash that’s treated as OTP during normal operation (completely read-only). This region is called INFO0. I’m assuming it’s Sparkfun that’s providing something reasonable here, or maybe it’s the default from the factory. In any case, the Sparkfun Apollo3s are configured to jump to address 0xC000 [1], or, with a logic high on GPIO47 [2], to use the ASB bootloader for “OTA” updates. The asb.py script (also found in the Ambiq SDK) can be used to upload programs through this interface over UART0 [3] (also configured in INFO0).
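To make the INFO0 values quoted in the footnotes a little less opaque, here is a minimal sketch that splits them into fields. The bit layouts are my own inference from matching the quoted values to the quoted meanings (pin number in the low 7 bits with polarity in bit 7, one pin per byte, baud rate in the upper 24 bits); they are assumptions and should be checked against the Apollo3 datasheet before relying on them. The function names are made up for this example.

```shell
#!/bin/sh
# Field layouts below are ASSUMED from the values quoted in this post,
# not taken from the datasheet.

# Override GPIO word: assumed bits 0-6 = pin number, bit 7 = polarity.
decode_asb_override() {
    printf 'pin=%d polarity=%d\n' $(( $1 & 0x7f )) $(( ($1 >> 7) & 1 ))
}

# UART pin word: assumed byte 1 = TX pin, byte 0 = RX pin.
decode_uart_pins() {
    printf 'tx=%d rx=%d\n' $(( ($1 >> 8) & 0xff )) $(( $1 & 0xff ))
}

# UART config word: assumed bits 8-31 = baud rate (low byte left undecoded).
decode_uart_baud() {
    printf 'baud=%d\n' $(( $1 >> 8 ))
}

decode_asb_override 0x000000af
decode_uart_pins 0xffff3031
decode_uart_baud 0x1c200c0
```

With the values from the footnotes this yields pin 47 with polarity 1, TX on 48 and RX on 49, and a baud rate of 115200, which at least agrees with the description above.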

The bootloaders (both ASB and Sparkfun’s own SVL) check the validity of the application’s initial stack pointer (the first 32-bit value of the program). On this MCU, SRAM spans from 0x10000000 to 0x1005FFFF (384 KB). The stack is full descending, so in theory using the address 0x10060000 as the initial value of the stack should be valid, but it turns out both ASB and SVL reject this value. Valid stack values must be less than or equal to 0x1005FFFF.
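A quick way to catch this before flashing is to check the first word of the image. The sketch below assumes 384 KiB of SRAM starting at 0x10000000 (the Apollo3 Blue's advertised SRAM size), a raw little-endian binary, and a little-endian host; the lower bound check and the helper names are my own additions, not something taken from the bootloaders' source.

```shell
#!/bin/sh
# Assumed memory map: 384 KiB of SRAM starting at 0x10000000.
SRAM_BASE=$(( 0x10000000 ))
SRAM_SIZE=$(( 384 * 1024 ))

# Print the first 32-bit word of a raw binary as an unsigned decimal.
# od uses the host byte order, so this assumes a little-endian host.
read_initial_sp() {
    od -An -tu4 -N4 "$1" | tr -d ' \n'
}

# Succeeds only if the value lies inside SRAM, i.e. it is at most the last
# SRAM address. The one-past-the-end address is rejected, as described above.
is_valid_initial_sp() {
    [ $(( $1 )) -ge "$SRAM_BASE" ] && [ $(( $1 )) -lt $(( SRAM_BASE + SRAM_SIZE )) ]
}
```

For example, `is_valid_initial_sp "$(read_initial_sp app.bin)" || echo "stack pointer would be rejected"`, where app.bin is a placeholder for the image about to be uploaded.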

After some testing trying to have some SRAM survive reboots, it seems like the secure bootloader completely erases all of SRAM. It seems that the code to do this is copied from bootrom to SRAM, oddly enough (I was able to place a watchpoint on SRAM for when it was cleared, and it paused on some code in SRAM while the bootloader was executing). Apparently register 0x50020050 holds a value indicating “the amount of SRAM to keep reserved for application scratch space” [4]. So it may be possible to prevent some SRAM from being cleared with a modified INFO0. Unfortunately for me, this only protects memory at the high end of SRAM (I would have liked to retain some memory in the first 4KB of SRAM).

I think this is everything I’ve found thus far. I’ve reprogrammed the SVL bootloader, optimizing it and reducing it in size. It also provides some compile-time options for skipping the SVL bootloader completely or not based on the value of some GPIO (currently only lightly configurable). I also wrote svl-tools (in Rust) to replace the svl.py program from Sparkfun and to leverage the extra functionality of reading from memory on request that I’ve added to the SVL.

If I find anything else noteworthy about the boot process, I’ll edit this post and make note of it.

  1. Register 0x50020C00 is set to 0xC000, which is the address to jump to after the bootloader is done.
  2. Register 0x50020020 configures the pin used to force ASB boot programming. It is set to 0x000000af: pin 47, with polarity set to high.
  3. Register 0x5002002C configures the pins to use for UART. It is set to 0xffff3031, so pins 48 and 49 for TX and RX respectively. Register 0x50020028 configures the UART. It is set to 0x1c200c0, meaning UART module 0, 8 data bits, no parity, 1 stop bit, and a baud rate of 115200.
  4. From the Apollo3 datasheet: https://ambiq.com/wp-content/uploads/2020/10/Apollo3-Blue-SoC-Datasheet.pdf
Categories
Engineering

LDO Voltage Regulators are More Confusing than I Expected

A few months ago a couple of friends and I bought some Japanese Super Famicoms (SFCs) from eBay to refurbish. For reference, the SFC and the American Super Nintendo (SNES) used the exact same motherboards internally. The input power for the SFC/SNES is about 10 V DC, and the SNES uses an old linear voltage regulator to step it down to 5 V, which is the voltage required for most of the digital chips inside. Thing is, this regulator gets hot and has been known to sometimes fail. So if we’re taking everything apart, why not just replace it with something newer?

I actually found the L78S05CV to be a pretty decent “old” style linear voltage regulator that I used in my own SNES at home. But when I went to purchase it for this restoration project…

Great. And Digikey had run out of them. So I looked at other substitutes and I figured I might as well try using LDOs. “Time to move to the future,” I thought to myself. But then I started reading the datasheets…

Many of the through-hole LDOs I encountered with similar current capacity (2 A and higher) rely on the ESR of the bulk capacitor on the output for stability. If the ESR is too low they can oscillate out of control! The exact reasoning is a little beyond my understanding of electronics, but this application report from TI goes into the details. In short, LDOs will oscillate if the phase margin of the regulator is too close to or below zero (phase margin is the difference between the phase and -180 degrees at the frequency where the gain is unity; this is visible in Bode plots). Without enough compensation in place, this can occur with LDOs not designed to be used with MLCCs (MLCCs have very low ESR compared to typical electrolytic capacitors).

That all said, there are LDOs (usually not through-hole, or not rated for as much as 2 A of output current) that are designed to use MLCCs and are made to be stable with low ESR at their output.

One critical observation from the TI document is that sometimes even placing two capacitors in parallel, one with adequate ESR and one with much smaller ESR, can backfire. To quote the document:

The incorrect assumption typically made is that when a small [ceramic] capacitor is in parallel with a larger capacitor, the smaller one’s effect will be “swamped out” by the larger one. However, the smaller value of capacitance made up by the “bypass capacitors” will form it’s own pole. If that pole is near or below the unity-gain crossover frequency of the loop, it can add enough phase lag to create an oscillator.

This also applies to bypass capacitors placed farther away, near ICs, although the trace inductance can help decouple those bypass capacitors from the LDO. Apparently, the only “reliable” way to confirm that the LDO will be stable is this:

The reliable way to determine if board capacitance is reducing phase margin is to perform load step testing on the actual board with all capacitors in place. The IC’s that the regulator powers should be removed (or not installed) and a resistor should be used at the output of the regulator that provides the same load current. The load should be stepped from no load to rated load while the output is watched for ringing or overshoot during the load step transient: excessive ringing indicates low phase margin.

That’s not exactly comforting. The document also doesn’t really specify what a long enough trace is as “board layouts vary, a ‘safe distance’ boundary for all applications can not be given”.

So, what can we do? Ideally we don’t bother and find a more “classical” linear voltage regulator. Non-LDO voltage regulators have a significant voltage drop requirement (around 2 V if I remember correctly), but due to their construction they are significantly more stable and tolerate just about any ESR on their output. So… this is what we did: we decided not to replace the regulators on the SFCs.

Another alternative is to build a buck converter voltage regulator that can reduce the voltage to 5V and sustain up to 2 A of current output. This will probably be a post for another day, as it’s not super trivial to reduce the ripple and keep it out of relevant frequency bands (e.g. NTSC/PAL).

Categories
Engineering

MLCC Capacitors and DC Bias

I’ve been working with electronics and PCB design for my research for some time now, and the DC bias effect on (most?) multi-layer ceramic capacitors (MLCCs) is one of the newest details I’ve come across that warrants a post so I don’t forget about it.

Effectively, due to the nature of some of the dielectric material used in MLCCs, the actual capacitance of an MLCC exposed to some DC bias can be much lower than the rated capacitance [1]. The strength of the effect is correlated to the size of the MLCC package, and is independent of the rated voltage of the MLCC [2].

Not all dielectric materials are affected, however. MLCCs with C0G dielectric are unaffected, but these are fairly large for their capacitance (and expensive). X5R and X7R dielectrics are affected, and from what I understand it has to do with some ferroelectric effect due to the titanium in their compound.

To drive home the point, I found this really good 12-year-old article from Maxim Integrated (now part of Analog Devices). It includes this nifty figure:

Figure 1. Capacitance variation vs. DC voltage for select 4.7µF capacitors.

Of particular note, notice how the 0603 X5R loses more than 50% of its rated capacitance with a DC bias voltage of just under 5 volts!

From what I can gather, some manufacturers have simulations on their product pages showing the DC bias effect on their capacitors, others may include it in their datasheets, and others don’t bother. This is just something that needs to be kept in mind while designing with MLCCs.

  1. Figure 1 in https://community.infineon.com/t5/Knowledge-Base-Articles/DC-Bias-characteristics-of-Multilayer-Ceramic-Capacitor-MLCC/ta-p/250035
  2. Figure 2 in the same Infineon article as the previous footnote.

Categories
Engineering Linux

Managing Your Own CA, Elliptic Curve Edition

Historically, I’ve used my main desktop to store system backups and provide other network services, but this is annoying because I dual-boot it to play video games, so while I’m playing most services are offline. Additionally, electricity in San Diego is absurdly expensive (thanks SDGE), so the 180+ W it draws while idling adds up quite a bit. To fix both problems, I bought an HP Prodesk 600 G3 MT from Human-I-T for a very reasonable price, hoping that the idle power draw stays under 20 W. While setting up that server, I decided that I might as well figure out how to set up my own CA for all of the internal services that I’m setting up, since I was getting tired of getting warnings from Firefox about “unsecured” connections due to self-signed certificates.

Most of the information I used to set up my own CA came from the OpenSSL CA documentation, which can be found here: https://openssl-ca.readthedocs.io/en/latest/index.html. However, it is missing the details on how to use Elliptic Curve keys. This post is an attempt to fill in that gap.

The process can be summarized as follows:

  1. Make a root CA key and certificate
  2. Make an intermediate CA key and certificate by requesting signing by root CA, and create an intermediate certificate chain
  3. Make individual server certificates by requesting signing by intermediate CA, and extend the certificate chain

Safeguarding Keys

It is important to protect all private keys, as if they leak anyone with them can sign certificates that pass as valid when checked against your CA certificates. This matters a little less when all of your certs are meant for your local network, but meh, might as well try to do something about it.

What I decided to do was to use LUKS to encrypt some filesystems on disk. I used one to store the root CA and intermediate CA, and a second one to store server keys.

# Create the Root CA/Intermediate CA file, and the file for server keys
dd if=/dev/urandom of=ca bs=1M count=100 status=progress
dd if=/dev/urandom of=server bs=1M count=100 status=progress

# it will ask for a password-- remember it or store it somewhere secure/safe/encrypted
cryptsetup luksFormat ca
cryptsetup luksFormat server

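# now open the volumes. Opening them will ask for the volume password
cryptsetup open ca ca-keys
cryptsetup open server server-keys
# first time only: the freshly formatted volumes are empty and need a filesystem
# before they can be mounted (ext4 here is just an example choice)
mkfs.ext4 /dev/mapper/ca-keys
mkfs.ext4 /dev/mapper/server-keys
# and mount them for use
mkdir ca-keys
mkdir server-keys
mount /dev/mapper/ca-keys ca-keys
mount /dev/mapper/server-keys server-keys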

It’s probably a good idea to back up the LUKS headers, in case something happens to them:

cryptsetup luksHeaderBackup ca --header-backup-file ca-keys.header.luks
cryptsetup luksHeaderBackup server --header-backup-file server-keys.header.luks

Make Root CA and its Certificate

The general outline of the steps can be found here: https://openssl-ca.readthedocs.io/en/latest/create-the-root-pair.html

cd ca-keys/
mkdir root/
cd root/
mkdir certs crl newcerts private
chmod 700 private
touch index.txt
echo 1000 > serial

At this point an openssl.cnf file needs to be created that will include default settings. It can be copied from https://openssl-ca.readthedocs.io/en/latest/root-configuration-file.html#root-configuration-file and modified to meet needs. For example, on my local install, I tweaked the dir and private_key paths, and changed the defaults for many of the *_default fields (e.g. countryName_default to US). I also tweaked the req default_bits to 4096.

With that in place, now it’s time to create the key in the ca-keys/root/ directory:

openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 -pkeyopt ec_param_enc:named_curve -aes-256-cbc -out private/root-ca.key.pem
# This will now ask for a password, set one and keep it safe!
chmod 400 private/root-ca.key.pem

Wait, why is that so messy? Turns out that only some elliptic curves are really used widely online (e.g. Chrome doesn’t support secp521r1). The two that seem to be the most supported are secp384r1 and secp256r1. So I just arbitrarily-ish picked secp384r1. Also, we are using the -aes-256-cbc option to enable password protection on the key. There are other options that may work (see the output of openssl enc -list for tentative options).

OK, now we have the key. Now we need a root certificate (we’re setting the expiry date far into the future, as recommended):

openssl req -config openssl.cnf -new -x509 -sha256 -extensions v3_ca -days 7300 -key private/root-ca.key.pem -out certs/ca.cert.pem
# This will request the password for the key, and then will ask for information about the certificate such as its common name, organization, etc.
chmod 444 certs/ca.cert.pem

To verify that the certificate looks ok:

openssl x509 -noout -text -in certs/ca.cert.pem

Make Intermediate CA and its Certificate

OK, now we can work on making the intermediate CA, which is the one that will actually be used to sign all of the certificates for the websites, etc. This should look familiar as it’s similar to what we did with the root certificate:

cd ../
mkdir intermediate
cd intermediate
mkdir certs crl csr newcerts private
chmod 700 private
touch index.txt
echo 1000 > serial
echo 1000 > crlnumber

Now, we can copy the base openssl.cnf for the intermediate certificate from https://openssl-ca.readthedocs.io/en/latest/intermediate-configuration-file.html#intermediate-configuration-file. Once saved, just as with the root openssl.cnf, variables can be tweaked (such as dir, *_default, default_bits).

Now it is time to create the intermediate CA key:

openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 -pkeyopt ec_param_enc:named_curve -aes-256-cbc -out private/intermediate.key.pem
# This will now ask for a password, set one and keep it safe!
chmod 400 private/intermediate.key.pem

And now to create and sign the intermediate certificate:

openssl req -config openssl.cnf -new -sha256 -key private/intermediate.key.pem -out csr/intermediate.csr.pem
# This will ask for the intermediate key password, and other information to fill into the certificate

cd ../root
openssl ca -config openssl.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in ../intermediate/csr/intermediate.csr.pem -out ../intermediate/certs/intermediate.cert.pem
# This will ask for the root CA password
chmod 444 ../intermediate/certs/intermediate.cert.pem

And to verify the newly created certificate:

openssl x509 -noout -text -in ../intermediate/certs/intermediate.cert.pem
# and check it against the root CA
openssl verify -CAfile ./certs/ca.cert.pem ../intermediate/certs/intermediate.cert.pem

Create the Certificate Chain

This is straight-ish forward (the order matters!). We need to concatenate the certificates of the root and the intermediate together to form the start of the certificate chain (the intermediate needs to go first, then the root):

# Changing directory to the root of the certs encrypted partition
cd ../
cat ./intermediate/certs/intermediate.cert.pem root/certs/ca.cert.pem > ./intermediate/certs/ca-chain.cert.pem

Great! At this point all of the initial setup is completed. From here on it is a matter of creating site-specific certificates, and installing the root certificate on all computers that need it.

Creating a Certificate for an Internal Site

Now we need to go into the server-keys encrypted folder we created earlier, as this is where we will store the certificates and keys for the individual sites as we create them (arguably, the better approach is to have each server create its own private keys and just provide the CSR to the signing server):

# We are still in ca-keys/ from the previous step; server-keys is mounted next to it
cd ../server-keys
mkdir site1.my.example
# Now create the key, no password
# If password is desired, add `-aes256 -pass stdin`
openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 -pkeyopt ec_param_enc:named_curve -out site1.my.example/site1.key.pem
chmod 400 site1.my.example/site1.key.pem

With the private key in place (which should probably be securely copied to the server for this domain), we now need to prepare the certificate signing request (CSR). We need the openssl.cnf file of the intermediate CA again. subjectAltName entries are now effectively required by modern browsers if the certificate is to authenticate a website. Multiple names can be provided, separated by commas (e.g. DNS:site1.my.example,DNS:site1.local).

openssl req -config ../ca-keys/intermediate/openssl.cnf -new -sha256 -key site1.my.example/site1.key.pem -out site1.my.example/site1.csr.pem -addext "subjectAltName = DNS:site1.my.example"

With the CSR in hand, we now need to go back to the intermediate CA to create the certificate:

cd ../ca-keys/intermediate/
openssl ca -config openssl.cnf -extensions server_cert -days 365 -notext -md sha256 -in ../../server-keys/site1.my.example/site1.csr.pem -out ../../server-keys/site1.my.example/site1.cert.pem

And with that we should now have a signed server certificate! The last thing to do is to create the certificate chain with this new certificate:

cd ../../server-keys/site1.my.example/
cat site1.cert.pem ../../ca-keys/intermediate/certs/ca-chain.cert.pem > site1.chain.pem

And finally, with that we should have all of the certificates and chains needed to set up properly signed certificates at home or other locations. The only thing really left to do is to install the root CA certificate on any machines and browsers that should recognize any certificates signed by the intermediate CA as authentic.
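Before deploying anything, it is worth double-checking the result. The sketch below wraps two stock openssl invocations in small helper functions (the function names are made up for this example): one verifies the site certificate against the CA chain file created earlier, the other prints the subjectAltName extension. The second check matters because, as far as I know, openssl ca only carries extensions over from the CSR when the CA section of openssl.cnf sets copy_extensions, so the SAN requested with -addext may not have made it into the signed certificate. The -ext option assumes OpenSSL 1.1.1 or newer.

```shell
#!/bin/sh
# verify_site_cert CHAIN CERT
# Checks CERT against the CA chain file (intermediate followed by root).
verify_site_cert() {
    openssl verify -CAfile "$1" "$2"
}

# print_site_sans CERT
# Prints the subjectAltName extension of CERT, if it has one.
print_site_sans() {
    openssl x509 -noout -ext subjectAltName -in "$1"
}
```

From the site1.my.example directory this would be used as `verify_site_cert ../../ca-keys/intermediate/certs/ca-chain.cert.pem site1.cert.pem` followed by `print_site_sans site1.cert.pem`.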

Safeguarding Keys, Again: Unmount and Lock

Once done signing and copying certificates and chains, remember to unmount and lock the encrypted volumes!

# cd out of server-keys and ca-keys
umount server-keys
umount ca-keys
# close takes the mapping names given to `cryptsetup open`
cryptsetup close ca-keys
cryptsetup close server-keys

Categories
Engineering Linux

Wait, is that an STM32 in my RGB controller?

One of the first things I tried to do with my Dell G5 5505 SE was trying to see if I could figure out what the RGB controller is doing. Looking through the files belonging to Dell in the Program Files directory on Windows, the directory Firmware stood out immediately. Looking into it, I found two Intel Hex formatted files:

$ ls Firmware/ELC/
elc-dfu-gcc-v0.0.2.hex*  elc-iar-v1.0.12.hex*

OK, that’s interesting. Unpacking them using objcopy:

$ objcopy -I ihex -O binary Firmware/ELC/elc-dfu-gcc-v0.0.2.hex ~/dfu.bin
$ strings ~/dfu.bin | tail
YI      }I
]JILHLOMMNJ
UfUcTKK
""K^}
K[XCP
       DFU Config
DFU Interface
STMicroelectronics
DFU in FS Mode
@Internal Flash   /0x08000000/6*02Ka,58*02Kg

Oh! So it’s not encrypted, and this firmware belongs to some sort of STM32 chip! What does the other firmware have in store for us?

$ objcopy -I ihex -O binary Firmware/ELC/elc-iar-v1.0.12.hex ~/iar.bin
gabriel@diamante /mnt/Gaia/Program Files/Alienware/Alienware Command Center  
$ strings ~/iar.bin | grep stm32
       c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_cortex.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_gpio.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_i2c_ex.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_rcc_ex.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_uart.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_dma.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_i2c.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_pcd.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_pwr.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_rcc.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_spi.c

Oh. That’s nice and convenient. This is some sort of STM32F0 chip, with USB support. Additionally, we know Dell is using STM32’s HAL, or at least some version of it.

So, from all of this it really looks like Dell is using the DFU support from the STM32 to flash it. From our ACPI experimentation, we can force the STM32 to boot into DFU mode by forcing BOOT0 high and resetting the chip.

By using dfu-util, and trial and error, I was able to determine that this chip has 128KB of flash. And after some poking around in the binary firmware, and comparing it against the implementation of the SDK being used, I was able to narrow down the HAL version and determine that the chip must be some variant of the STM32F070xB.
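As a cross-check on the flash size, the last string pulled out of dfu.bin earlier (`/0x08000000/6*02Ka,58*02Kg`) looks like an ST DfuSe-style memory layout descriptor, where each comma-separated segment reads as a sector count, a sector size with a unit, and an access-flag letter. Assuming it really follows that format, the sizes can be summed with a small sketch like this one (the function name is made up, and only the K unit is handled):

```shell
#!/bin/sh
# dfuse_total_kib SEGMENTS
# Sums a DfuSe-style segment list such as "6*02Ka,58*02Kg" and prints the
# total in KiB. ASSUMES every segment uses the K unit followed by one
# access-flag letter.
dfuse_total_kib() {
    rest=$1
    total=0
    while [ -n "$rest" ]; do
        seg=${rest%%,*}                 # first segment, e.g. "6*02Ka"
        case $rest in
            *,*) rest=${rest#*,} ;;     # drop the segment we just took
            *)   rest= ;;
        esac
        count=${seg%%\**}               # text before the '*'
        size=${seg#*\*}                 # text after the '*', e.g. "02Ka"
        size=${size%K?}                 # drop the unit and flag letter
        # Strip leading zeros so values like "08" are not read as octal.
        count=$(printf '%s' "$count" | sed 's/^0*//')
        size=$(printf '%s' "$size" | sed 's/^0*//')
        total=$(( total + ${count:-0} * ${size:-0} ))
    done
    echo "$total"
}

dfuse_total_kib '6*02Ka,58*02Kg'
```

That is 6 + 58 sectors of 2 KiB each, which adds up to 128 KiB and agrees with what dfu-util showed.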

I’ve studied the DFU part of the firmware in some depth, but I have not had the time to dive into the part that actually manages the USB hardware. There are bugs there that I’d like to fix (the controller crashes if there are too many incoming commands).

Back when I was working more actively on this I wrote a section on the AW-ELC RGB controller in the OpenRGB wiki, based on online sources and my own experimentation. This information is enough to write some code to talk to the controller over USB HID, program animations, and change the keyboard brightness.

I’ve made some sample applications on my GitLab showing how to talk to the controller on Linux. The project builds a couple of binaries, of which status, reset, and toggle are the more interesting ones. status prints all of the status registers of the RGB controller, reset resets the contents of the controller to the Dell G5 5505 SE default settings, and toggle cycles through the intensity values (0 being bright to 100 being dim, inclusive) from a little file at ~/.config/aw_elc/dim to control the keyboard brightness. I’ve been using toggle for some years now to toggle my keyboard brightness on and off, so hopefully this can be useful to someone else too (or if nothing else, as an example of how to talk to this RGB controller).

Categories
Embedded Systems Engineering Gentoo Linux

Don’t build standard libraries with LTO unless you really want to suffer

I wanted to make this post more as a reminder to myself.

A few years ago I ran into an interesting issue while working with a cross-compiler for ARM on my Gentoo Linux systems. Effectively, I was getting a ton of undefined symbol errors for libc functions. After a thread on the gcc-help mailing list, this was the summary of the outcome:

Gabriel Marcano:
Can newlib not be built with -flto?

...

Alexander Monakov:
It matters because the compiler recognizes several libc functions by name (including 'malloc'); recognizing such so-called "builtins" is necessary to optimize their uses. Unfortunately, builtins are different from common functions in LTO bytecode, and sometimes behave in unexpected ways. You can find examples searching GCC Bugzilla for stuff like "LTO built-in".

In your case the reference to built-in malloc is not visible to the linker when it invokes the plugin while scanning input files, so it does not unpack malloc.o from libc.a. This is a simple case of a more general problem where a compiler creates new references to built-in functions after optimizations (e.g. transforming 'printf("Hello\n")' to 'puts("Hello")'). Somehow it happens to work when libc is not slim-LTO (I guess it is rescanned), but rescanning for LTO code-generation is not implemented, as far as I know.

So no, apparently the tricks that GCC/LD make to optimize strlen, malloc, etc. make it next to impossible to properly enable LTO for the standard library.

For Gentoo this means remembering to disable LTO for newlib builds for cross-compilers. This is done by making a /etc/portage/env/no-lto.conf file:

CFLAGS="${CFLAGS} -fno-lto"
CXXFLAGS="${CXXFLAGS} -fno-lto"
FFLAGS="${FFLAGS} -fno-lto"

And then somewhere under /etc/portage/package.env/ making a file with:

cross-*/newlib no-lto.conf
Categories
Engineering Linux

Dell G5 5505 SE ACPI, or figuring out how to reset the RGB controller

This is going to be a long one, and it has taken me a while to write up (I started writing this in 2022, and now we’re almost in 2025!). While this isn’t as polished as I’d have liked it to be, it’s been sitting as a draft for too long, and I almost want to get rid of this laptop, heh.

Four years ago my old 2013 MSI GE40 laptop’s battery finally kicked the bucket, and after swearing off Nvidia and wanting to try out a CPU from team Red, I acquired the only all-AMD laptop I could find at the time against my better judgement: a Dell G5 5505 Special Edition. Honestly, this laptop has been mostly OK (other than having to replace one of its cooling fans under warranty not even two weeks after getting the laptop, great QC there Dell). One of the first things that caught my attention was the full RGB backlit keyboard on the laptop, and I immediately wanted to see if I could get it working on Linux. This will probably be the subject of another blog post, later, though. However, in the process of reversing what the RGB controller was doing, I figured out that there was a way to reboot the RGB controller into DFU flashing mode, and that on Windows Dell’s software did this via ACPI calls.

For almost all commands that follow, one needs the acpica or iasl package, ideally at least the 20240927 or R09_27_24 release.

Dumping and disassembling ACPI tables

On Linux, acpidump and acpixtract make quick work of this:

# Extracts all ACPI tables into the tables.acpi file
sudo sh -c "acpidump > tables.acpi"
# Extracts all tables individually into their own .dat binary file
acpixtract -a tables.acpi

At this point, there should be a ton of *.dat files in the working directory. For some reason this laptop has over 20 ssdt tables.

To then disassemble the files, I do the following (my shell is fish, so some modification may be required to work with bash):

#!/usr/bin/fish
for I in *.dat
        echo -----------$I
        iasl -e $(ls *.dat | grep -v $I) -d $I
end

In essence we want to pass as many tables to the disassembler as possible so that it can find methods and other objects that the table being disassembled may use. We can’t pass the same table as a reference otherwise iasl complains about duplicate objects.
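For anyone not using fish, the same loop can be sketched in plain POSIX sh (which also runs under bash). It is wrapped in a function, with a name I made up, so it can be pasted into a shell and run from the directory holding the extracted .dat files; it relies on iasl accepting several file names after -e, just like the fish version does.

```shell
#!/bin/sh
# Disassemble every extracted table in the current directory, passing all of
# the *other* tables to iasl via -e for external symbol resolution.
disassemble_tables() {
    for table in *.dat; do
        [ -e "$table" ] || continue          # no .dat files at all
        echo "-----------$table"
        set --                               # reset the list of other tables
        for other in *.dat; do
            [ "$other" = "$table" ] || set -- "$@" "$other"
        done
        if [ "$#" -gt 0 ]; then
            iasl -e "$@" -d "$table"
        else
            iasl -d "$table"                 # only one table, nothing to reference
        fi
    done
}
```

Calling `disassemble_tables` should leave a .dsl file next to each .dat file, same as the fish loop.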

Getting WMI information

Great, now we can inspect the tables. From some… poking around Dell files on Windows, I determined that it updates the RGB firmware through some WMI ACPI functions. The Linux kernel has some pretty decent documentation on WMI, including stuff relating to Dell WMI functionality.

We need the bmf2mof utility to be able to parse some of the binary blobs exposed by the Linux kernel related to WMI methods. On my system, the right BMOF data for the WMAX function can be extracted as follows:

# As root or with root privileges
bmf2mof < /sys/bus/wmi/devices/05901221-D566-11D1-B2F0-00A0C9062910-1/bmof
[WMI, Dynamic, Provider("WmiProv"), Locale("MS\\0x409"), Description("WMI Function"), guid("{A70591CE-A997-11DA-B012-B622A1EF5492}")]
class AWCCWmiMethodFunction {
  [key, read] string InstanceName;
  [read] boolean Active;

  [WmiMethodId(19), Implemented, read, write, Description("Get Fan Sensors.")] void GetFanSensors([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(20), Implemented, read, write, Description("Thermal Information.")] void Thermal_Information([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(21), Implemented, read, write, Description("Thermal Control.")] void Thermal_Control([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(23), Implemented, read, write, Description("MemoryOCControl.")] void MemoryOCControl([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(26), Implemented, read, write, Description("System Information.")] void SystemInformation([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(32), Implemented, read, write, Description("FW Update GPIO toggle.")] void FWUpdateGPIOtoggle([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(33), Implemented, read, write, Description("Read Total of GPIOs.")] void ReadTotalofGPIOs([out] uint32 argr);
  [WmiMethodId(34), Implemented, read, write, Description("Read GPIO pin Status.")] void ReadGPIOpPinStatus([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(36), Implemented, read, write, Description("Read Platform Properties.")] void ReadPlatformProperties([out] uint32 argr);
  [WmiMethodId(37), Implemented, read, write, Description("Game Shift Status.")] void GameShiftStatus([in] uint32 arg2, [out] uint32 argr);
};

Oh, that’s so nice. From here we know the GUID of the function A70591CE-A997-11DA-B012-B622A1EF5492 and all of the WMI methods and their IDs. With this information we can go find them in the ACPI tables.

For BIOS version 1.24.0, this is the WMAX function in ssdt20.dsl. In there is a large switch-case statement with IDs in hexadecimal that match up with the IDs in the BMOF.
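Since the BMOF lists the method IDs in decimal and the disassembly uses hexadecimal, a tiny helper (the name is made up for this example) saves some mental arithmetic when hunting for the right case. It prints the Case line in the style iasl uses in its output:

```shell
#!/bin/sh
# wmi_id_to_case ID
# Prints the ASL Case line for a decimal WmiMethodId, e.g. 32 -> Case (0x20).
wmi_id_to_case() {
    printf 'Case (0x%02X)\n' "$1"
}

wmi_id_to_case 32   # FWUpdateGPIOtoggle
wmi_id_to_case 34   # ReadGPIOpPinStatus
```

Something like `grep -nF "$(wmi_id_to_case 32)" ssdt20.dsl` then jumps straight to the candidates, though other switch statements in the same file may use the same value.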

Reimplementing ACPI GPIO methods to avoid screwing up their output

The ReadGPIOpPinStatus function (ID 34 or 0x22) uses the WISC method (defined in dsdt.dsl), which changes the direction of the GPIO pin to read it. This… is not useful if we just want to read the pin without affecting its function (which we can do), as changing its direction makes the GPIO act as though it were set to output low.

I implemented two new ACPI functions, WISI and WISO, to refactor WISC and to stop screwing up the GPIO output on a GPIO read call. One observation I made is that the “magic” address 0xFED81500 is the address of the gpio-amd-fch GPIO registers (found in the Linux kernel in drivers/gpio/gpio-amd-fch.c).

    Method (WISO, 2, Serialized)
    {   
        Local0 = (Arg0 << 0x02)
        Local0 += 0xFED81500
        OperationRegion (GREG, SystemMemory, Local0, 0x04)
        Field (GREG, ByteAcc, NoLock, Preserve)
        {
            Offset (0x02),
                ,   6,
            OPVL,   1,
            OPEN,   1
        } 
        OPEN = One
        OPVL = Arg1
    }   
        
    Method (WISI, 1, Serialized)
    {       
        Local0 = (Arg0 << 0x02)
        Local0 += 0xFED81500
        OperationRegion (GREG, SystemMemory, Local0, 0x04)
        Field (GREG, ByteAcc, NoLock, Preserve)
        {   
            Offset (0x02),
            PSTS,   1
        }   
            
        Local2 = PSTS /* \WISI.PSTS */
        Return (Local2)
    }

I then re-implemented method IDs 0x20 (32) and 0x22 (34) using these new methods. This should let us query the GPIO state without messing with the GPIO direction.

                    // Write NRST or BOOT0
                    // Name=FWUpdateGPIOtoggle
                    Case (0x20)
                    {
                        AXBF = Arg2
                        // BFB0 is pin select (1 == NRST, 0 == BOOT0)
                        If ((BFB0 == Zero))
                        {
                            If ((BFB1 == Zero))
                            {
                                WISO(0x05, Zero)
                            }
                            Else
                            {
                                WISO(0x05, One)
                            }
                        }
                        ElseIf ((BFB0 == One))
                        {
                            If ((BFB1 == Zero))
                            {
                                WISO(0x0A, Zero)
                            }
                            Else
                            {
                                WISO(0x0A, One)
                            }
                        }

                        Return (Zero)
                    }
                    // Name=ReadTotalofGPIOs
                    Case (0x21)
                    {
                        Return (0x02)
                    }
                    // Name=ReadGPIOpPinStatus
                    Case (0x22)
                    {
                        AXBF = Arg2
                        Local0 = 0x02
                        // BFB0 is pin select (1 == NRST, 0 == BOOT0)
                        // WISI does not switch the pin to input
                        If ((BFB0 == Zero))
                        {
                            Local0 = WISI (0x05)
                        }
                        ElseIf ((BFB0 == One))
                        {
                            Local0 = WISI (0x0A)
                        }

                        Return (Local0)
                    }

Fixing the other ACPI tables (is pain)

If one tries to just recompile the tables, it won’t work. Most of the tables have a ton of issues. At a bare minimum, however, dsdt.dsl and ssdt20.dsl (or whichever ssdt has the WMAX function) need to build. Unfortunately the process for fixing these tables is non-trivial and rather tedious. My best recommendation is to scour online for any help in trying to understand the iasl error messages.

One critical change that needs to happen is that the version number near the top of the source files needs to be increased, else the Linux ACPI table override system won’t bother trying to override the table.

Compiling the new table is as simple as doing:

# Example rebuilding of the DSDT table
iasl dsdt.dsl
# If successful, it emits dsdt.aml

If there are errors, iasl will report them. On success a .aml file is generated, and it contains the compiled table.

Hacking Linux: Overriding the right tables!

Alright, we have the tables re-compiled. Now what? The Linux kernel can override ACPI tables, but the mechanism it uses is by table type… and we’re overriding at least one SSDT out of a bunch of SSDT ones, so the kernel can’t tell them apart.

I’ve come up with what is arguably the worst hack I’ve implemented by far to work around this issue. My hack compares the length of each existing SSDT in the kernel against the original sizes of the tables I want to replace; if one matches, the kernel prepares to replace that table. It is nasty, but it works, as none of the SSDT tables I’ve been overriding have the same original size.

Here’s the patch, for the 6 SSDT tables I’m overriding for BIOS version 1.24.0:

--- a/drivers/acpi/tables.c 2022-03-20 13:14:17.000000000 -0700
+++ b/drivers/acpi/tables.c 2022-04-13 16:37:55.389618306 -0700
@@ -688,6 +688,10 @@ acpi_table_initrd_override(struct acpi_t
    struct acpi_table_header *table;
    u32 table_length;
 
+   // FIXME Hacks by Gabriel M. to load a specific SSDT on a Dell G5505 SE
+   bool is_ssdt;
+   bool hack;
+
    *length = 0;
    *address = 0;
    if (!acpi_tables_addr)
@@ -705,7 +709,9 @@ acpi_table_initrd_override(struct acpi_t
        table_length = table->length;
 
        /* Only override tables matched */
-       if (memcmp(existing_table->signature, table->signature, 4) ||
+       is_ssdt = !memcmp(existing_table->signature, "SSDT", 4);
+       hack = is_ssdt && (existing_table->length == 0x517 || existing_table->length == 0x53B || existing_table->length == 0x723C || existing_table->length == 0x28D || existing_table->length == 0x30C8 || existing_table->length == 0xC6C);
+       if ((is_ssdt && !hack) || memcmp(existing_table->signature, table->signature, 4) ||
            memcmp(table->oem_id, existing_table->oem_id,
               ACPI_OEM_ID_SIZE) ||
            memcmp(table->oem_table_id, existing_table->oem_table_id,
@@ -713,6 +719,13 @@ acpi_table_initrd_override(struct acpi_t
            acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
            goto next_table;
        }
+
+       // FIXME Hack, skip matching all SSDT tables except specific one
+       if (is_ssdt && !hack) {
+           acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+           goto next_table;
+       }
+
        /*
         * Mark the table to avoid being used in
         * acpi_table_initrd_scan() and check the revision.

This hack really sucks because it needs to be updated every BIOS update. I wish there were some way to differentiate SSDT tables already available, but I haven’t found one yet.

To actually have the Linux kernel override the ACPI tables, I’m using dracut to generate an initramfs image, and its acpi_override option to actually install the *.aml files as part of the initramfs. See the man pages for dracut for more details.

Implementing a debugfs for these newly found ACPI methods

OK, finally, we’re at a point where we can do something interesting. There’s probably a better way to do this, but I’ve implemented some code to allow calling these WMI functions through the Linux kernel, and I’ve exposed them through the debugfs interface.

I keep the patches that implement this in the kernel in a fork of the kernel on gitlab. Once the module is loaded it exposes the following files:

/sys/kernel/debug/dell_awcc/memory_volt
/sys/kernel/debug/dell_awcc/memory_freq
/sys/kernel/debug/dell_awcc/gameshift
/sys/kernel/debug/dell_awcc/boot0
/sys/kernel/debug/dell_awcc/nrst

I can’t vouch for the complete validity of the memory_* stuff, but the boot0 and nrst files control the GPIOs connected to the STM32 managing the RGB controller. echo-ing 0 drives the GPIO low, and echo-ing 1 drives it high. So, to change the RGB controller to its boot DFU mode:

echo 1 > /sys/kernel/debug/dell_awcc/boot0
echo 0 > /sys/kernel/debug/dell_awcc/nrst
echo 1 > /sys/kernel/debug/dell_awcc/nrst

And this is the dmesg output of doing the above:

[17039.648454] usb 3-3.2: USB disconnect, device number 5
[17043.154318] usb 3-3.2: new full-speed USB device number 6 using xhci_hcd
[17043.242228] usb 3-3.2: New USB device found, idVendor=0483, idProduct=df11, bcdDevice=22.00
[17043.242235] usb 3-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[17043.242237] usb 3-3.2: Product: STM32  BOOTLOADER
[17043.242239] usb 3-3.2: Manufacturer: STMicroelectronics
[17043.242241] usb 3-3.2: SerialNumber: FFFFFFFEFFFF

Success! We’ve rebooted the RGB controller into its DFU programming mode, and its firmware can now be dumped using dfu-util!

To turn the RGB chip back to its normal self:

echo 0 > /sys/kernel/debug/dell_awcc/boot0
echo 0 > /sys/kernel/debug/dell_awcc/nrst
echo 1 > /sys/kernel/debug/dell_awcc/nrst

And this is my kernel output:

[17144.938660] usb 3-3.2: reset full-speed USB device number 6 using xhci_hcd
[17145.014269] usb 3-3.2: device firmware changed
[17145.014859] usb 3-3.2: USB disconnect, device number 6
[17145.194627] usb 3-3.2: new full-speed USB device number 7 using xhci_hcd
[17145.285273] usb 3-3.2: New USB device found, idVendor=0483, idProduct=df11, bcdDevice= 2.00
[17145.285283] usb 3-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[17145.285288] usb 3-3.2: Product: DFU in FS Mode
[17145.285292] usb 3-3.2: Manufacturer: STMicroelectronics
[17145.285295] usb 3-3.2: SerialNumber: 206D335B5353
[17146.912500] usb 3-3.2: USB disconnect, device number 7
[17147.080577] usb 3-3.2: new full-speed USB device number 8 using xhci_hcd
[17147.176269] usb 3-3.2: New USB device found, idVendor=187c, idProduct=0550, bcdDevice= 2.00
[17147.176283] usb 3-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[17147.176289] usb 3-3.2: Product: AW-ELC
[17147.176295] usb 3-3.2: Manufacturer: Alienware
[17147.176300] usb 3-3.2: SerialNumber: 00.01
[17147.269992] hid-generic 0003:187C:0550.0005: hiddev96,hidraw0: USB HID v1.11 Device [Alienware AW-ELC] on usb-0000:07:00.4-3.2/input0

I do wonder, now that I know what GPIO pins are connected to the RGB controller, if we should just… bypass the ACPI table and talk to the controller directly. Maybe that’s an experiment for another time.

Categories
Embedded Systems Engineering

I _WANT_IO_LONG_LONG, or sometimes Newlib doesn’t want long long integer types

I ran into an issue when trying to finish some implementation work for a paper that I have due in less than a week. Effectively, I was just trying to printf some long long types, and they were getting butchered. I reduced a reproducer on my Cortex-M4 device to the following line:

printf("%llu %ld\n", 1llu, 2l);

What I expected to see:

1 2

What I saw:

0 1

Ok… that’s weird. I did a quick google search, and the summary is that if you’re using newlib nano, well, newlib doesn’t support long long types in printf and its ilk. Well, I’m not using newlib nano for this program. Thinking about it a bit harder, this really looks like printf is only reading 4 bytes for the %llu type. I finally decided to go read the source code in newlib for some clues.

Reading through the printf implementation eventually led me here:

#ifndef _NO_LONGLONG
			if (flags & QUADINT)
				*GET_ARG (N, ap, quad_ptr_t) = ret;

OK, so there’s a check for this _NO_LONGLONG define. Where can it get set? Oh, apparently in the same file:

#define _NO_LONGLONG
#if defined _WANT_IO_LONG_LONG \
	&& (defined __GNUC__ || __STDC_VERSION__ >= 199901L)
# undef _NO_LONGLONG

OK, so it’s looking to see if _WANT_IO_LONG_LONG is defined. Apparently this is a configuration option, specifically --enable-newlib-io-long-long. Rebuilding newlib with that option finally lets printf work as intended.

Reading through the repository now that I had a configuration flag handy, I was able to find it in the README, where it states that this flag is disabled by default, but some hosts enable it in their configure.host files. Just something to keep in mind, I guess, when using newlib.

For those Gentoo users like me, to build newlib with this option, add EXTRA_ECONF="--enable-newlib-io-long-long" to a file in /etc/portage/env/ and have your cross-*/newlib of choice use it via /etc/portage/package.env.

Categories
Embedded Systems Engineering

Newlib, FreeRTOS, and the curse of __SINGLE_THREAD__

tl;dr If newlib is built using --disable-newlib-multithread, its object files will have been built using __SINGLE_THREAD__, so any application linking to newlib must also define __SINGLE_THREAD__ or some newlib headers will have structs with different fields between newlib and the application. Or just rebuild newlib with --enable-newlib-multithread.

I’ve been working on a personal project of mine. At this point, revision A of the hardware works, and the firmware also works. But getting to this point wasn’t exactly easy. As expected of embedded software development, there are a lot of potholes in the road.

I thought I had finished my basic firmware implementation last night, and then I tried rebuilding it on my desktop instead of my laptop… and was greeted by hardfaults due to null-pointer dereferences on the Raspberry Pi Pico on the board (I have configured the MPU to prevent access to the first 256 bytes specifically to catch null pointer dereferences). That was… bizarre, as the code worked fine if I built it on my laptop.

After some debugging, I noticed the problem– for some reason, the stdout, which is part of an array of three elements of the type __sFILE struct (defined in newlib/libc/include/sys/reent.h) looked… like it was shifted over by 4 bytes. So as a sanity check, I printed the sizeof() of the struct while debugging… and found something I have honestly never seen before. If I printed the sizeof() from within newlib code, I got a size of 100 bytes. But if I printed it from my own application, I got 104 bytes. Well, that would explain the 4 byte offset, but… why was there a discrepancy?

Taking a look at the reent.h header in newlib, I noticed something in the __sFILE struct: one element is ifndef __SINGLE_THREAD__’d, specifically a lock for multi-threading… and then it clicked. I built my newlib for cross-compilation using Gentoo’s crossdev, and by default it builds newlib for single-threading, which forces newlib’s build system to define the __SINGLE_THREAD__ macro. However, neither my own application nor the many CMake layers used by the pico-sdk define __SINGLE_THREAD__ anywhere, leading my application to interpret this structure, defined in the header, as having that lock field. This led to a struct definition conflict between my application and newlib’s static library.

My fix was to simply re-build newlib with multi-threading enabled, as this is apparently the default used by upstream anyway. Also, the reason the firmware built on my laptop worked was that I just happened to have enabled newlib multi-threading… for reasons I no longer remember. In theory, one could also just define the __SINGLE_THREAD__ macro themselves when building the application, and that should work as well.

Categories
Engineering Linux

Porting postmarketOS to the Pixel 6A

I’m not sure yet how this series is going to turn out, but I’ll likely have to make multiple posts to outline all of the insanity that I’ve had to wade through to get postmarketOS working with the Pixel 6A.

Goal: Repurposing smartphones

This is work I’ve been doing in collaboration with Jen Switzer in the PATLab at UC San Diego. I was tapped to help with the project due to my background in beating my head against Linux until either I or it submit.

The goal of this particular effort is to repurpose smartphones as a way to reduce the overall carbon footprint of computing. For a more thorough explanation, I’ll refer anyone interested to Jen’s open-access paper from ASPLOS 2023 (which won a Distinguished Paper Award!).

Prerequisites

  • Google Pixel 6A that is bootloader unlocked
  • Basic understanding of Linux CLI and Android
  • A computer (or VM) running Linux, and only Linux: no Mac OS X (and maybe WSL???)

Setting up pmOS development environment

I recommend following the pmOS wiki for installing pmbootstrap. If you’re using Gentoo, pmbootstrap may be available from the GURU repository, but note that the version must be 1.52 or greater (at the time of this writing, GURU only had 1.51).

For Gentoo Linux, installation (if GURU has the right version…) looks like:

# pmbootstrap ebuild is in GURU overlay, but it's not the latest.
# It must be at least version 1.52
sudo eselect repository enable guru
sudo emerge -av pmbootstrap

Once pmbootstrap is installed, we need to initialize the pmOS working directory, which we can do with:

pmbootstrap init

One of the first questions pmbootstrap will ask you is where to set up a working directory. You can pick any directory, but note that it is critical that the working directory for pmbootstrap is in a Linux filesystem (e.g. ext4 or BTRFS) — NTFS or FAT32 or exFAT won’t work. The reason for this is that pmOS build system uses chroot and other Linux-isms that non-native Linux filesystems do not handle well.

If you’ve done anything with pmOS in the past, you may have a file in ~/.config/pmbootstrap.cfg that may need to be deleted, if you want to start completely from scratch/move the postmarketOS folder.

pmbootstrap init will ask a lot of questions. Here are most of the questions the installer will ask (extra spaces added to help the individual prompts stand out):

Location of the 'work' path. Multiple chroots (native, device arch, device rootfs) will be created in there.
Work path [/home/gabriel/.local/var/pmbootstrap]:

Choose the postmarketOS release channel.
Available (7):
* edge: Rolling release / Most devices / Occasional breakage: https://postmarketos.org/edge
* v22.12: Latest release / Recommended for best stability
* v22.06: Old release (unsupported)
* v21.12: Old release (unsupported)
* v21.06: Old release (unsupported)
* v21.03: Old release (unsupported)
* v20.05: Old release (unsupported)
Channel [edge]:

Choose your target device vendor (either an existing one, or a new one for porting).
Available vendors (76): acer, alcatel, amazon, amediatech, apple, ark, arrow, asus, beelink, bq, cubietech, cutiepi, dongshanpi, essential, fairphone, finepower, fly, goclever, google, gp, hisense, htc, huawei, infocus, jolla, klipad, kobo, lark, leeco, lenovo, lg, mangopi, medion, meizu, microsoft, mobvoi, motorola, nextbit, nobby, nokia, nvidia, odroid, oneplus, oppo, ouya, pegatron, pine64, planet, purism, qemu, radxa, raspberry, samsung, semc, sharp, shift, sipeed, sony, sourceparts, surftab, t2m, tablet, tokio, tolino, trekstor, vernee, videostrong, volla, wexler, wiko, wileyfox, xiaomi, xunlong, yu, zte, zuk
Vendor [qemu]:

Available codenames (39): bob, burnet, cozmo, crosshatch, damu, dru, druwl, dumo, elm, fennel, fennel14, glass, hana, juniper, kakadu, kappa, katsu, kenzo, kevin, kodama, krane, makomo, nyan-big, nyan-blaze, peach-pi, peach-pit, sargo, snow, spring, veyron-fievel, veyron-jaq, veyron-jerry, veyron-mickey, veyron-mighty, veyron-minnie, veyron-speedy, veyron-tiger, willow, x64cros
Device codename:

[The following appears if you add a new codename-- in this case Vendor google, codename bluejay]
You are about to do a new device port for 'google-bluejay'.
Continue? (y/n) [y]:

Device architecture (armv7/aarch64/x86_64/x86/riscv64) [armv7]:

Who produced the device (e.g. LG)?
Manufacturer:

What is the official name (e.g. Google Nexus 5)?
Name:

In what year was the device released (e.g. 2012)?
Year:

What type of device is it?
Valid types are: desktop, laptop, convertible, server, tablet, handset, watch, embedded, vm
Chassis:

Does the device have a hardware keyboard? (y/n) [n]:

Does the device have a sdcard or other external storage medium? (y/n) [n]:

Which flash method does the device support?
Flash method (0xffff/fastboot/heimdall/none/rkdeveloptool/uuu) [0xffff]:

You can analyze a known working boot.img file to automatically fill out the flasher information for your deviceinfo file. Either specify the path to an image or press return to skip this step (you can do it later with 'pmbootstrap bootimg_analyze').
Path:

Username [user]:

Available user interfaces (14):
* none: Bare minimum OS image for testing and manual customization. The "console" UI should be selected if a graphical UI is not desired.
* asteroid: (Wayland) Smartwatch UI from AsteroidOS
* console: Console environment, with no graphical/touch UI
* fbkeyboard: Plain framebuffer console with touchscreen keyboard support
* framebufferphone: Minimalist framebuffer menu/keyboard UI accessible via touch/volume keys & compatible scripts
* gnome: (Wayland) Gnome Shell
* gnome-mobile: (Wayland) Gnome Shell patched to adapt better to phones (Experimental)
* i3wm: (X11) Tiling WM (keyboard required)
* lxqt: (X11) Lightweight Qt Desktop Environment (stylus recommended)
* mate: (X11) MATE Desktop Environment, fork of GNOME2 (stylus recommended)
* plasma-desktop: (X11/Wayland) KDE Desktop Environment (works well with tablets)
* shelli: Plain console with touchscreen gesture support
* sxmo-de-dwm: Simple Mobile: Mobile environment based on SXMO and running on dwm
* sxmo-de-sway: Simple Mobile: Mobile environment based on SXMO and running on sway
* xfce4: (X11) Lightweight desktop (stylus recommended)
NOTE: 6 user interfaces are not available. If device supports GPU acceleration, set "deviceinfo_gpu_accelerated" to make UIs available. See: <https://wiki.postmarketos.org/wiki/Deviceinfo_reference
User interface [weston]:

Additional options: extra free space: 0 MB, boot partition size: 256 MB, parallel jobs: 17, ccache per arch: 5G, sudo timer: False, mirror: http://mirror.postmarketos.org/postmarketos/
Change them? (y/n) [n]:

Additional packages that will be installed to rootfs. Specify them in a comma separated list (e.g.: vim,file) or "none"
Extra packages [none]:

Your host timezone:

Use this timezone instead of GMT? (y/n) [y]:

Available locales (14): C.UTF-8, ch_DE.UTF-8, de_CH.UTF-8, de_DE.UTF-8, en_GB.UTF-8, en_US.UTF-8, es_ES.UTF-8, fr_FR.UTF-8, it_IT.UTF-8, nb_NO.UTF-8, nl_NL.UTF-8, pt_BR.UTF-8, ru_RU.UTF-8, sv_SE.UTF-8
Choose default locale for installation [C.UTF-8]:

Device hostname (short form, e.g. 'foo') [google-bluejay]:

Would you like to copy your SSH public keys to the device? (y/n) [n]:

Build outdated packages during 'pmbootstrap install'? (y/n) [y]:

Here is an example of me filling out the responses as I’m setting up my build for the Pixel 6A:

$ pmbootstrap init
[14:47:14] Location of the 'work' path. Multiple chroots (native, device arch, device rootfs) will be created in there.
[14:47:14] Work path [/home/gabriel/.local/var/pmbootstrap]: ~/.local/var/bluejay
[14:47:20] Setting up the native chroot and cloning the package build recipes (pmaports)...
[14:47:20] Clone git repository: https://gitlab.com/postmarketOS/pmaports.git
Cloning into '/home/gabriel/.local/var/bluejay/cache_git/pmaports'...
[14:47:25] NOTE: pmaports path: /home/gabriel/.local/var/bluejay/cache_git/pmaports
[14:47:25] Choose the postmarketOS release channel.
[14:47:25] Available (7):
[14:47:25] * edge: Rolling release / Most devices / Occasional breakage: https://postmarketos.org/edge
[14:47:25] * v22.12: Latest release / Recommended for best stability
[14:47:25] * v22.06: Old release (unsupported)
[14:47:25] * v21.12: Old release (unsupported)
[14:47:25] * v21.06: Old release (unsupported)
[14:47:25] * v21.03: Old release (unsupported)
[14:47:25] * v20.05: Old release (unsupported)
[14:47:25] Channel [edge]: 
[14:47:27] Choose your target device vendor (either an existing one, or a new one for porting).
[14:47:27] Available vendors (76): acer, alcatel, amazon, amediatech, apple, ark, arrow, asus, beelink, bq, cubietech, cutiepi, dongshanpi, essential, fairphone, finepower, fly, goclever, google, gp, hisense, htc, huawei, infocus, jolla, klipad, kobo, lark, leeco, lenovo, lg, mangopi, medion, meizu, microsoft, mobvoi, motorola, nextbit, nobby, nokia, nvidia, odroid, oneplus, oppo, ouya, pegatron, pine64, planet, purism, qemu, radxa, raspberry, samsung, semc, sharp, shift, sipeed, sony, sourceparts, surftab, t2m, tablet, tokio, tolino, trekstor, vernee, videostrong, volla, wexler, wiko, wileyfox, xiaomi, xunlong, yu, zte, zuk
[14:47:27] Vendor [qemu]: google
[14:47:29] Available codenames (39): bob, burnet, cozmo, crosshatch, damu, dru, druwl, dumo, elm, fennel, fennel14, glass, hana, juniper, kakadu, kappa, katsu, kenzo, kevin, kodama, krane, makomo, nyan-big, nyan-blaze, peach-pi, peach-pit, sargo, snow, spring, veyron-fievel, veyron-jaq, veyron-jerry, veyron-mickey, veyron-mighty, veyron-minnie, veyron-speedy, veyron-tiger, willow, x64cros
[14:47:29] Device codename: bluejay
[14:47:31] You are about to do a new device port for 'google-bluejay'.
[14:47:31] Continue? (y/n) [y]: y
[14:47:33] Generating new aports for: google-bluejay...
[14:47:33] Device architecture (armv7/aarch64/x86_64/x86/riscv64) [armv7]: aarch64
[14:47:36] Who produced the device (e.g. LG)?
[14:47:36] Manufacturer: Google
[14:47:38] What is the official name (e.g. Google Nexus 5)?
[14:47:38] Name: Pixel 6A
[14:47:40] In what year was the device released (e.g. 2012)?
[14:47:40] Year: 2022
[14:47:41] What type of device is it?
[14:47:41] Valid types are: desktop, laptop, convertible, server, tablet, handset, watch, embedded, vm
[14:47:41] Chassis: handset
[14:47:44] Does the device have a hardware keyboard? (y/n) [n]: n
[14:47:46] Does the device have a sdcard or other external storage medium? (y/n) [n]: n
[14:47:47] Which flash method does the device support?
[14:47:47] Flash method (0xffff/fastboot/heimdall/none/rkdeveloptool/uuu) [0xffff]: fastboot
[14:47:49] You can analyze a known working boot.img file to automatically fill out the flasher information for your deviceinfo file. Either specify the path to an image or press return to skip this step (you can do it later with 'pmbootstrap bootimg_analyze').
[14:47:49] Path: 
[14:47:51] *** pmaport generated: /home/gabriel/.local/var/bluejay/cache_git/pmaports/device/testing/device-google-bluejay
[14:47:51] *** pmaport generated: /home/gabriel/.local/var/bluejay/cache_git/pmaports/device/testing/linux-google-bluejay
[14:47:51] Username [user]: 
[14:47:55] Available user interfaces (14): 
[14:47:55] * none: Bare minimum OS image for testing and manual customization. The "console" UI should be selected if a graphical UI is not desired.
[14:47:55] * asteroid: (Wayland) Smartwatch UI from AsteroidOS
[14:47:55] * console: Console environment, with no graphical/touch UI
[14:47:55] * fbkeyboard: Plain framebuffer console with touchscreen keyboard support
[14:47:55] * framebufferphone: Minimalist framebuffer menu/keyboard UI accessible via touch/volume keys & compatible scripts
[14:47:55] * gnome: (Wayland) Gnome Shell
[14:47:55] * gnome-mobile: (Wayland) Gnome Shell patched to adapt better to phones (Experimental)
[14:47:55] * i3wm: (X11) Tiling WM (keyboard required)
[14:47:55] * lxqt: (X11) Lightweight Qt Desktop Environment (stylus recommended)
[14:47:55] * mate: (X11) MATE Desktop Environment, fork of GNOME2 (stylus recommended)
[14:47:55] * plasma-desktop: (X11/Wayland) KDE Desktop Environment (works well with tablets)
[14:47:55] * shelli: Plain console with touchscreen gesture support
[14:47:55] * sxmo-de-dwm: Simple Mobile: Mobile environment based on SXMO and running on dwm
[14:47:55] * sxmo-de-sway: Simple Mobile: Mobile environment based on SXMO and running on sway
[14:47:55] * xfce4: (X11) Lightweight desktop (stylus recommended)
[14:47:55] NOTE: 6 user interfaces are not available. If device supports GPU acceleration, set "deviceinfo_gpu_accelerated" to make UIs available. See: <https://wiki.postmarketos.org/wiki/Deviceinfo_reference
[14:47:55] User interface [weston]: gnome
[14:47:58] Additional options: extra free space: 0 MB, boot partition size: 256 MB, parallel jobs: 17, ccache per arch: 5G, sudo timer: False, mirror: http://mirror.postmarketos.org/postmarketos/
[14:47:58] Change them? (y/n) [n]: 
[14:48:00] Additional packages that will be installed to rootfs. Specify them in a comma separated list (e.g.: vim,file) or "none"
[14:48:00] Extra packages [none]: 
[14:48:01] Your host timezone: America/Los_Angeles
[14:48:01] Use this timezone instead of GMT? (y/n) [y]: 
[14:48:01] Available locales (14): C.UTF-8, ch_DE.UTF-8, de_CH.UTF-8, de_DE.UTF-8, en_GB.UTF-8, en_US.UTF-8, es_ES.UTF-8, fr_FR.UTF-8, it_IT.UTF-8, nb_NO.UTF-8, nl_NL.UTF-8, pt_BR.UTF-8, ru_RU.UTF-8, sv_SE.UTF-8
[14:48:01] Choose default locale for installation [C.UTF-8]: 
[14:48:02] Device hostname (short form, e.g. 'foo') [google-bluejay]: 
[14:48:03] Would you like to copy your SSH public keys to the device? (y/n) [n]: 
[14:48:05] After pmaports are changed, the binary packages may be outdated. If you want to install postmarketOS without changes, reply 'n' for a faster installation.
[14:48:05] Build outdated packages during 'pmbootstrap install'? (y/n) [y]: 
[14:48:06] WARNING: The chroots and git repositories in the work dir do not get updated automatically.
[14:48:06] Run 'pmbootstrap status' once a day before working with pmbootstrap to make sure that everything is up-to-date.
[14:48:06] DONE!

At this point, the build environment is mostly set up. You can check the status by calling pmbootstrap status:

$ pmbootstrap status
[14:48:42] *** CONFIG ***
[14:48:42] Device: google-bluejay (aarch64, "Google Pixel 6A")
[14:48:42] User Interface: gnome
[14:48:42] 
[14:48:42] *** GIT REPOS ***
[14:48:42] Path: /home/gabriel/.local/var/bluejay/cache_git
[14:48:42] - pmaports (master)
[14:48:42] 
[14:48:42] *** CHECKS ***
[14:48:42] [NOK] pmaports: workdir is not clean
[14:48:42] 
[14:48:42] *** CHECKLIST ***
[14:48:42] - pmaports: consider cleaning your workdir
[14:48:42] - Run 'pmbootstrap status' to verify that all is resolved

The message about the workdir not being clean is due to the creation of two folders in the pmaports folder for our new device— this is expected.

I’ve already done some work in getting the right parameters for the deviceinfo of the Pixel 6A port, and some patching of the initramfs that we’ll need to boot the stock Android GKI. We need to change the pmaports repository remote to point to my repository and branch.

cd ~/.local/var/bluejay/cache_git/pmaports
git clean -f device/testing/linux-google-bluejay/ device/testing/device-google-bluejay/
git remote set-url origin https://gitlab.com/gemarcano/pmaports.git
git remote add upstream https://gitlab.com/postmarketOS/pmaports.git
git fetch
git fetch upstream
git checkout google-bluejay

I have not upstreamed my changes to pmOS yet, as they are too experimental. Perhaps some day this step of using my fork won’t be necessary.

Finally, at this point you should be able to build the device package, and prepare the rootfs:

pmbootstrap build device-google-bluejay
pmbootstrap install

If you’ve already gone through these instructions, and you want to update the repository, cd to the pmaports directory and perform a git pull. Afterwards, run pmbootstrap build --force device-google-bluejay to force a rebuild of that package.

If the build fails, check the reason by using pmbootstrap log.

The following commands should only be necessary if you modify the APKBUILD files for the device or kernel packages:

pmbootstrap checksum device-google-bluejay
pmbootstrap checksum linux-google-bluejay

mkbootimg and avbtool

Google uses a custom image format for their Android flashable images, which are generated using a toolkit called mkbootimg. This toolkit includes mkbootimg.py itself, which is used to make boot.img and vendor_boot.img files, and unpack_bootimg.py, which is used to unpack existing boot and vendor_boot images.

Additionally, my experimentation on the 6A has shown that the bootloader, even when unlocked, does not like booting boot partitions that lack an AVB signature. These signatures are generated using a tool called avbtool. The source code for both tools can be downloaded using the following commands:

git clone https://android.googlesource.com/platform/system/tools/mkbootimg
git clone https://android.googlesource.com/platform/external/avb

Unpacking boot.img, vendor_boot.img, and initramfs’s

We now need to unpack the stock boot and vendor_boot images. The following commands can be used to unpack boot.img and/or vendor_boot.img files. Replace anything inside square brackets with the paths specified by the contents of the brackets:

BOOT_IMG_DEST=[directory-to-unpack-boot.img]
./mkbootimg/unpack_bootimg.py --boot_img [path-to-boot.img] --out  "$BOOT_IMG_DEST"

When unpacking a boot.img, there should be two files in the output directory: a kernel, and an initramfs. For the Pixel 6A, this kernel should be the GKI, and the initramfs contains a first stage initramfs described in Google documentation (but no kernel modules whatsoever). This is the initramfs that we’ll be replacing.

We need to extract the vendor modules from the vendor_boot partition:

VENDOR_BOOT_IMG_DEST=[directory-to-unpack-vendor_boot.img]
./mkbootimg/unpack_bootimg.py --boot_img [path-to-vendor_boot.img] --out "$VENDOR_BOOT_IMG_DEST"
cd "$VENDOR_BOOT_IMG_DEST"
# The stock vendor_boot has two initramfs, unpack them both
mkdir 00 01
lz4 -cd vendor_ramdisk00 | cpio -iv0 -D ./00
lz4 -cd vendor_ramdisk01 | cpio -iv0 -D ./01
cd ../

There should be two initramfs’s in the vendor_boot image, and we’re just unpacking them into different folders. The initramfs we’re really interested in is the second one, now in the 01 folder, as this one contains all of the modules we need to load to boot.

Preparing our custom initramfs

Unfortunately, the Android GKI is too barebones for the normal bootflow of pmOS. I have added patches to the init script in the initramfs that pmbootstrap creates to load the vendor modules in the right order, but we still need to copy over the modules we unpacked into the initramfs we’ll be using.

To do this, we need to extract the postmarketOS initramfs. First, check to see what the compression of the initramfs is:

$ file ~/.local/var/bluejay/chroot_rootfs_google-bluejay/boot/initramfs
/home/gabriel/.local/var/bluejay/chroot_rootfs_google-bluejay/boot/initramfs: gzip compressed data, original size modulo 2^32 3045376

The deviceinfo for the Pixel 6A that I’ve created asks for LZ4 legacy, but the current version of postmarketos-mkinitfs that actually generates the image has not been updated since my pull request was merged, so for now we always need to check what compression format it is. Thankfully, gzip and lz4 use the same command-line arguments when decompressing, so we can decompress and unpack the initramfs as follows:

mkdir pmos-initramfs
gzip -cd ~/.local/var/bluejay/chroot_rootfs_google-bluejay/boot/initramfs | cpio -iv -D ./pmos-initramfs

At this point we have the contents of the initramfs unpacked.
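Since the compression format may change once postmarketos-mkinitfs picks up the deviceinfo setting, the check can also be scripted. The following is a minimal sketch that looks at the magic bytes at the start of the file (gzip starts with 1f 8b, LZ4 legacy with 02 21 4c 18, and the standard LZ4 frame with 04 22 4d 18). The function names initramfs_compression and decompress_initramfs are made up for this example and are not part of pmbootstrap.

```shell
# Print the compression format of an initramfs based on its magic bytes.
# Usage: initramfs_compression <file>
initramfs_compression() {
    # Read the first four bytes as lowercase hex, e.g. "1f8b0800".
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        1f8b*)    echo gzip ;;
        02214c18) echo lz4-legacy ;;  # format written by `lz4 -l`
        04224d18) echo lz4 ;;         # standard LZ4 frame format
        *)        echo unknown ;;
    esac
}

# Decompress to stdout with whichever tool matches; both accept -cd.
# Usage: decompress_initramfs <file> | cpio -iv -D ./pmos-initramfs
decompress_initramfs() {
    case "$(initramfs_compression "$1")" in
        gzip)           gzip -cd "$1" ;;
        lz4|lz4-legacy) lz4 -cd "$1" ;;
        *) echo "unrecognized compression: $1" >&2; return 1 ;;
    esac
}
```

With these defined, the unpack step works the same way regardless of which format the initramfs ends up using.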

Loading kernel modules in the right order

For the Pixel 6A, Google uses its new Android GKI system for managing the Linux kernel. In summary, the core kernel is an extremely generic image with absolutely minimal hardware support, just enough to boot into an initramfs, and no more. This core kernel relies on vendor modules to bring up the rest of the hardware. We’ve already unpacked the modules, so it’s just a matter of copying them over to our unpacked pmOS initramfs, and packing it again using lz4 -l.

For some reason, the stock modules are in the $VENDOR_BOOT_IMG_DEST/01/lib/modules directory (as opposed to $VENDOR_BOOT_IMG_DEST/01/lib/modules/[kernel-version]). In order for modprobe to work properly in load-modules.sh, we need to move the modules to the right kernel version folder in our initramfs. To find out what the name of the folder is, we can grep our stock kernel for its version. We know that the stock kernel version for the Pixel 6A is the 5.10 LTS Linux kernel, so we can use that to help us find the full version string:

KERNEL_VERSION=$(strings "$BOOT_IMG_DEST"/kernel | grep -m1 -Po "5\.10.*? ")

On my system, this looks like:

$ echo $KERNEL_VERSION
5.10.149-android13-4-693021-g2f0d2b7f95c6-ab9667365
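Note that the grep pattern above also captures the trailing space, which is harmless as long as $KERNEL_VERSION is used unquoted. As an alternative sketch, the release can be taken from the "Linux version ..." banner string instead, which does not depend on knowing the 5.10 prefix in advance. This assumes the kernel image is uncompressed enough for strings to find the banner; kernel_release_from_banner is a hypothetical helper name.

```shell
# Read `strings` output on stdin and print the release from the first
# "Linux version <release> ..." banner line. The release must start with a
# digit so that format strings such as "Linux version %s" are skipped.
kernel_release_from_banner() {
    sed -n 's/^.*Linux version \([0-9][^ ]*\) .*$/\1/p' | head -n 1
}

# Usage (BOOT_IMG_DEST as defined earlier):
#   KERNEL_VERSION=$(strings "$BOOT_IMG_DEST"/kernel | kernel_release_from_banner)
```

The result has no trailing whitespace, so it can be quoted safely in later commands.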

With this information, we can now copy the modules to our unpacked initramfs:

mkdir -p pmos-initramfs/lib/modules/$KERNEL_VERSION
cp -ra $VENDOR_BOOT_IMG_DEST/01/lib/modules/* pmos-initramfs/lib/modules/$KERNEL_VERSION

For reasons I have yet to understand, these modules on the 6A must be loaded in a very specific order, otherwise they hang the kernel. This is the reason I list all of the modules in a very specific order inside the load_modules.sh script. I derived the load order from grepping the dmesg of a working Android boot. This script should already be in the root directory of the initramfs.
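Because a single missing module can stall the whole sequence, it may be worth checking that every module named in a load-order list actually made it into the initramfs. The sketch below assumes a plain text list with one module per line. Android vendor ramdisks normally ship such a list as modules.load next to the modules, but whether it matches the order used in the script here is an assumption. missing_modules is a hypothetical helper name.

```shell
# Print every module named in a load-order list that is missing from a
# modules directory. Entries may be bare names or relative paths.
# Usage: missing_modules <list-file> <modules-dir>
missing_modules() {
    list=$1
    dir=$2
    while IFS= read -r entry || [ -n "$entry" ]; do
        # Skip blank lines and comments.
        case "$entry" in ''|'#'*) continue ;; esac
        # Strip any leading directories to get the bare file name.
        name=${entry##*/}
        if [ ! -e "$dir/$entry" ] && [ ! -e "$dir/$name" ]; then
            printf '%s\n' "$name"
        fi
    done < "$list"
}

# Usage (paths as used earlier in this post):
#   missing_modules "$VENDOR_BOOT_IMG_DEST"/01/lib/modules/modules.load \
#       pmos-initramfs/lib/modules/$KERNEL_VERSION
```

No output means every listed module was found in the target directory.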

Packing boot.img

Finally, we have everything in place to re-pack boot.img with the stock kernel and our modified initramfs:

cd pmos-initramfs
find . -print0 | cpio --null -o --format=newc | lz4 -l --best > ../postmarket_initramfs.cpio.lz
cd ../
./mkbootimg/mkbootimg.py --kernel "$BOOT_IMG_DEST"/kernel --ramdisk ./postmarket_initramfs.cpio.lz --header_version 4 -o pmos_bootimg.bin --pagesize 4096 --os_version 13.0.0 --os_patch_level 2022-03

I’m not sure if it’s necessary, but I’ve added the --os_patch_level and --os_version flags to the packing command.
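A quick sanity check on the result is to look at the header magic: boot images start with the 8 bytes ANDROID!, and vendor_boot images start with VNDRBOOT. The following sketch only checks those bytes and nothing else about the header; bootimg_kind is a made-up name for this example.

```shell
# Classify an Android image by the 8-byte magic at the start of its header.
# Usage: bootimg_kind <file>
bootimg_kind() {
    # Drop NUL bytes so arbitrary binary input does not upset the shell.
    magic=$(dd if="$1" bs=8 count=1 2>/dev/null | tr -d '\000')
    case "$magic" in
        'ANDROID!') echo boot ;;
        VNDRBOOT)   echo vendor_boot ;;
        *)          echo unknown ;;
    esac
}

# Usage:
#   bootimg_kind pmos_bootimg.bin
```

For a fuller check, running unpack_bootimg.py on pmos_bootimg.bin should produce the stock kernel and the new ramdisk.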

Signing the boot.img

So, apparently the Pixel 6A bootloader, even when unlocked, will refuse to boot a boot.img that does not have a valid AVB signature. To work around this, we’re going to sign it with a random key. To generate the key:

openssl genrsa -out foobar.key 4096
openssl rsa -in foobar.key -pubout -out foobar.key.pub

And to sign the boot.img:

./avb/avbtool.py add_hash_footer --dynamic_partition_size --image pmos_bootimg.bin --key foobar.key --prop com.android.build.boot.fingerprint:'google/bluejay/bluejay:12/SD2A.220601.004.B2/8852801:user/release-keys' --prop com.android.build.boot.os_version:'13' --prop com.android.build.boot.security_patch:'2022-06-01' --algorithm 'SHA256_RSA4096' --rollback_index 1654041600 --partition_name boot

There are a lot of flags here that may not be necessary, but I copied from an existing boot.img, just in case.
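To confirm that the signing step actually modified the image, one option is to look for the AVB footer, a 64-byte structure at the very end of the image that begins with the magic AVBf. The sketch below only checks for that magic and does not validate the signature itself; has_avb_footer is a hypothetical helper name.

```shell
# Succeed if the last 64 bytes of an image start with the AVB footer magic.
# Usage: has_avb_footer <file>
has_avb_footer() {
    magic=$(tail -c 64 "$1" | head -c 4 | tr -d '\000')
    [ "$magic" = "AVBf" ]
}

# Usage:
#   has_avb_footer pmos_bootimg.bin && echo "footer present"
```

For the full details (algorithm, rollback index, properties), ./avb/avbtool.py info_image --image pmos_bootimg.bin prints what was written.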

Flashing pmOS

WARNING: From here on out, we are replacing partitions on the phone, and I haven’t tried undoing the changes to the super partition. I strongly recommend backing up the existing boot and super partitions somehow (I’ve backed these up in the past using adb pull with a rooted Android system).

At this point, we’re finally ready to flash the boot.img to the Pixel 6A. This is done with fastboot after booting into the bootloader (power off the phone, then turn it back on while holding the volume down button along with the power button):

fastboot flash boot pmos_bootimg.bin

And lastly, we need to flash the root filesystem. I’ve configured the root filesystem to be installed in the super partition on the Pixel 6A instead of userdata:

pmbootstrap flasher flash_rootfs --partition super

At this point, pmOS should be installed on the phone and should work.

Configuring UART for debugging

One of the major perks of working with Google Pixel phones is that they all, so far, appear to expose the SoC UART pins through the USB port. Specifically, they co-opt the Alternate Mode pins on the USB port for UART TX and RX. The following command enables this functionality:

fastboot oem uart enable

You need:

  • USB C full featured cable (must support Thunderbolt or DisplayPort Alt Mode, as this guarantees the existence of the SBU/Alternate Mode pins)
  • Something like the Treedix USB 3.1 Type C Female to Female Pass Through Adapter Breakout Board 24 Pin for Date Line Wire Cable Transfer Extension
  • To be safe, a USB 2.0 USB C cable. These do not populate the alternate mode pins, so there’s no chance that the computer you’re connecting to will try to use them and do weird things while the UART is connected
  • 1.8V UART adapter, connected to the SBU pins for TX and RX.

Google open-sourced a hardware device to interface with the UART pins from Pixel phones, USB-Cereal. This cable could also work.

Which SBU pin is TX and which one is RX depends on how the USB C cable is plugged in. For my testing, I used an oscilloscope to identify which pin had TX data on it.

After the cables are all plugged in, I recommend using a program like minicom to connect to the plugged in USB UART device. The Pixel 6A’s UART interface is configured to use 115200 baud with 8-N-1 (8 data bits, no parity, 1 stop bit).

To disable UART:

fastboot oem uart disable

Finally, loaded completely into rootfs!

postmarketOS, by default, does not seem capable of driving the display for the Pixel 6A. To confirm that the phone booted, connect over USB to the phone. At least on Linux, doing this causes a new Ethernet device to appear and to autoconfigure an IP address in the 172.16.42.0/24 range, with the phone itself at 172.16.42.1. To ssh into the phone, do:

ssh user@172.16.42.1

And type in the password provided during pmbootstrap install.

At this point, you should be able to SSH into the phone, but it may complain about missing a PTY or TTY. This is normal! The GKI kernel is missing support for virtual TTYs, which is required by most “normal” Linux systems. However, even without a fully functional session, you should still be able to invoke commands. Type a command, e.g. uname -a and press Enter to confirm that you have connected.

A workaround for this issue is to create the /dev/ptmx device (note that because we don’t have a real PTY, we have to use sudo -S, which shows the password being typed in, so make sure that no one is looking over your shoulder!):

sudo -S mknod /dev/ptmx c 5 2

After this, disconnect from SSH and reconnect, and things should work more normally. Sadly, this needs to be done on every reboot of the phone (one workaround is to make a local script that gets called on boot to set this file up).
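As a rough sketch of such a boot script: assuming the install uses OpenRC and its local service is enabled (rc-update add local default), executable files in /etc/local.d whose names end in .start are run as root at boot. The file name /etc/local.d/ptmx.start and the helper ensure_chardev are made up for this example.

```shell
# Create a character device node unless one already exists at that path.
# Usage: ensure_chardev <path> <major> <minor>
ensure_chardev() {
    if [ -c "$1" ]; then
        return 0
    fi
    mknod "$1" c "$2" "$3"
}

# In the boot script, after the function definition, the call would be:
#   ensure_chardev /dev/ptmx 5 2
```

Checking for an existing node first keeps the script harmless if a future kernel or initramfs starts providing /dev/ptmx on its own.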

The real solution for this mess is to make sure that the kernel has built-in support for a lot of stuff. This may be the topic of a future blog post, because compiling the kernel for the Pixel 6A is… interesting 🫠.

Configuring additional options

To turn on KVM:

fastboot oem pkvm enable

Troubleshooting

Resetting pmbootstrap chroots

Sometimes pmbootstrap became confused on my laptop (failures while trying to build, stating that it failed to find sh). A workaround is to restart the chroots, which can be done by doing:

pmbootstrap zap

Usually I didn’t need to wipe the chroots, but if you want to start from scratch, using zap and answering yes to its questions is how to do that.

telnet-ing into pmOS’s initramfs environment

If booting is working, but something is going wrong in the second half of loading from the initramfs, pmOS has a debug hook that sets up telnet access while in the initramfs. To set this up:

pmbootstrap initfs hook_add debug-shell

Note that you will need to unpack the generated pmOS initramfs and go through the process of copying the modules into the unpacked initramfs, re-packing boot.img, and re-signing boot.img. After doing that and flashing the boot.img, you should be able to telnet in:

telnet 172.16.42.1 23