
How is the Sparkfun Artemis Configured to Boot?

tl;dr: INFO0 is configured by Sparkfun. SRAM is cleared, GPIO47 selects whether the ASB runs and accepts data over UART, and if not, the bootloader jumps to 0xC000.

I’ve been working with the Sparkfun Artemis module and the Redboard Artemis ATP board for research, because of the Ambiq Apollo3 microcontroller they use. There is relatively little documentation on what user programs must do to successfully take over after the Apollo3 secure bootloader finishes running. There’s also not much out there on what this secure bootloader actually does (and I wish we could shut it off and just jump immediately to user code…). The goal of this post is to summarize a few of my findings.

The secure bootloader can be configured to some extent through parameters stored in a special region of flash that’s treated as an OTP during normal operation (completely read-only). This region is called INFO0. I’m assuming it’s Sparkfun that’s providing something reasonable here, or maybe it’s the default from the factory. In any case, the Sparkfun Apollo3s are configured to jump to address 0xC000 [1] or, on a logic high on GPIO47 [2], to run the ASB bootloader for “OTA” updates. The asb.py script (also found in the Ambiq SDK) can be used to upload programs through this interface over UART0 [3] (also configured in INFO0).

The bootloaders (both ASB and Sparkfun’s own SVL) check the validity of the application’s stack pointer (the first 32-bit value of the program). On this MCU, SRAM spans from 0x10000000 to 0x1005FFFF. The stack is full descending, so in theory the address 0x10060000 should be a valid initial stack value, but it turns out both ASB and SVL reject it. Valid stack values must be less than or equal to 0x1005FFFF.
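
To make the constraint concrete, here’s a minimal C sketch of the check as I understand it from testing (my reconstruction from observed behavior, not code from either bootloader):

#include <stdbool.h>
#include <stdint.h>

#define SRAM_BASE 0x10000000u
#define SRAM_LAST 0x1005FFFFu /* last valid SRAM address */

/* The initial stack pointer is the first 32-bit word of the image. */
static bool initial_sp_accepted(const uint32_t *app_image)
{
    uint32_t sp = app_image[0];
    /* Both bootloaders reject SRAM_LAST + 1 (0x10060000), even though a
     * full descending stack could legally start there. */
    return sp >= SRAM_BASE && sp <= SRAM_LAST;
}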

After some testing aimed at having some SRAM survive reboots, it seems like the secure bootloader completely erases all of SRAM. Oddly enough, the code that does this appears to be copied from bootrom to SRAM (I was able to place a watchpoint on SRAM for when it was cleared, and it paused on some code in SRAM while the bootloader was executing). Apparently register 0x50020050 holds a value indicating “the amount of SRAM to keep reserved for application scratch space” [4]. So it may be possible to prevent some SRAM from being cleared with a modified INFO0. Unfortunately for me, this only protects memory at the high end of SRAM (I would have liked to retain some memory in the first 4KB of SRAM).
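
Since that register is memory mapped, checking what INFO0 configured on a live system is a one-liner; a tiny hypothetical helper in C:

#include <stdint.h>

/* Read the register (0x50020050, per the datasheet) that indicates how
 * much SRAM the bootloader is told to leave untouched. */
static inline uint32_t bootrom_sram_reserve(void)
{
    return *(volatile uint32_t *)0x50020050u;
}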

I think this is everything I’ve found thus far. I’ve rewritten the SVL bootloader, optimizing it and reducing its size. It also provides some compile-time options for skipping the SVL bootloader entirely based on the value of some GPIO (currently only lightly configurable). I also wrote svl-tools (in Rust) to replace the svl.py program from Sparkfun and to leverage the read-memory-on-request functionality I’ve added to the SVL.

If I find anything else noteworthy about the boot process, I’ll edit this post and make note of it.

  1. Register 0x50020C00 is set to 0xC000, which is the address to jump to after the bootloader is done.
  2. Register 0x50020020 configures the pin used to force ASB boot programming. It is set to 0x000000af: pin 47, with polarity set to high.
  3. Register 0x5002002C configures the pins to use for UART; it is set to 0xffff3031, so pins 48 and 49 for TX and RX, respectively. Register 0x50020028 configures the UART itself; it is set to 0x1c200c0, meaning UART module 0, 8 data bits, no parity, 1 stop bit, and a baud rate of 115200.
  4. From the Apollo3 datasheet: https://ambiq.com/wp-content/uploads/2020/10/Apollo3-Blue-SoC-Datasheet.pdf

LDO Voltage Regulators are More Confusing than I Expected

A few months ago a couple of friends and I bought some Japanese Super Famicoms (SFCs) from eBay to refurbish. For reference, the SFC and the American Super Nintendo (SNES) use the exact same motherboards internally. The input power for the SFC/SNES is about 10 V DC, and the SNES uses an old linear voltage regulator to step it down to the 5 V required by most of the digital chips inside. Thing is, this regulator gets hot and has been known to sometimes fail. So if we’re taking everything apart, why not just replace it with something newer?

I actually found the L78S05CV to be a pretty decent “old” style linear voltage regulator that I used in my own SNES at home. But when I went to purchase it for this restoration project…

Great. And DigiKey had run out of them. So I looked at other substitutes, and figured I might as well try using LDOs. “Time to move to the future,” I thought to myself. But then I started reading the datasheets…

Many of the through-hole LDOs I encountered with similar current capacity (2 A and higher) rely on the ESR of the bulk capacitor on the output for stability. If the ESR is too low they can oscillate out of control! The exact reasoning is a little beyond my understanding of electronics, but this application report from TI goes into the details. In short, an LDO will oscillate if its phase margin is too close to or below zero (phase margin is the difference between the loop’s phase and -180 degrees at the frequency where the gain is unity, i.e. PM = 180° plus the phase at the unity-gain crossover; it’s visible in Bode plots). Without enough compensation in place, this can occur with LDOs not designed to be used with MLCCs (MLCCs have very low ESR compared to typical electrolytic capacitors).

One critical observation from the TI document is that sometimes even placing two capacitors in parallel, one with adequate ESR, and one with much smaller ESR, can backfire.

That all said, there are LDOs (usually not through-hole, nor rated for output currents as high as 2 A) that are designed to use MLCCs and are made to be stable with low ESR at their output. To quote the document:

The incorrect assumption typically made is that when a small [ceramic] capacitor is in parallel with a larger capacitor, the smaller one’s effect will be “swamped out” by the larger one. However, the smaller value of capacitance made up by the “bypass capacitors” will form its own pole. If that pole is near or below the unity-gain crossover frequency of the loop, it can add enough phase lag to create an oscillator.

This also applies to bypass capacitors placed further away, near ICs! Apparently the trace inductance can help decouple the bypass capacitors from the LDO. According to the document, the only “reliable” way to confirm that the LDO will be stable is this:

The reliable way to determine if board capacitance is reducing phase margin is to perform load step testing on the actual board with all capacitors in place. The ICs that the regulator powers should be removed (or not installed) and a resistor should be used at the output of the regulator that provides the same load current. The load should be stepped from no load to rated load while the output is watched for ringing or overshoot during the load step transient: excessive ringing indicates low phase margin.

That’s not exactly comforting. The document also doesn’t really specify what a long enough trace is, as “board layouts vary, a ‘safe distance’ boundary for all applications can not be given”.

So, what can we do? Ideally we don’t bother, and find a more “classical” linear voltage regulator instead. Non-LDO voltage regulators have a significant dropout voltage requirement (around 2 V, if I remember correctly), but due to their construction they are significantly more stable and tolerate just about any ESR on their output. So… this is what we did: we decided not to replace the regulators on the SFCs.

Another alternative is to build a buck converter voltage regulator that can reduce the voltage to 5V and sustain up to 2 A of current output. This will probably be a post for another day, as it’s not super trivial to reduce the ripple and keep it out of relevant frequency bands (e.g. NTSC/PAL).


MLCC Capacitors and DC Bias

I’ve been working with electronics and PCB design for my research for some time now, and the DC bias effect on (most?) multi-layer ceramic capacitors (MLCCs) is one of the newest details I’ve come across that warrants a post so I don’t forget about it.

Effectively, due to the nature of some of the dielectric materials used in MLCCs, the actual capacitance of an MLCC exposed to a DC bias can be much lower than its rated capacitance [1]. The strength of the effect is correlated with the size of the MLCC package, and is independent of the rated voltage of the MLCC [2].

Not all dielectric materials are affected, however. MLCCs with C0G dielectric are unaffected, but these are fairly large for their capacitance (and expensive). X5R and X7R dielectrics are affected, and from what I understand it has to do with some ferroelectric effect due to the titanium in their compound.

To drive home the point, I found this really good 12-year-old article from Maxim Integrated (now part of Analog Devices). In it is this nifty figure:

Figure 1. Capacitance variation vs. DC voltage for select 4.7µF capacitors.

Of particular note, notice how the 0603 X5R loses more than 50% of its rated capacitance with a DC bias voltage of just under 5 volts!

From what I can gather, some manufacturers have simulations on their product pages showing the DC bias effect on their capacitors, others may include it in their datasheets, and others don’t bother. This is just something that needs to be kept in mind while designing with MLCCs.

  1. Figure 1 in https://community.infineon.com/t5/Knowledge-Base-Articles/DC-Bias-characteristics-of-Multilayer-Ceramic-Capacitor-MLCC/ta-p/250035
  2. Figure 2 in the same Infineon article as the previous footnote.


Managing Your Own CA, Elliptic Curve Edition

Historically, I’ve used my main desktop to store system backups and provide other network services, but this is annoying because I dual-boot it to play video games, so while I’m playing most services are offline. Additionally, electricity in San Diego is absurdly expensive (thanks SDGE), so the 180+ W draw while idling adds up quite a bit. To fix both problems, I bought an HP Prodesk 600 G3 MT from Human-I-T for a very reasonable price, hoping that its idle power draw stays under 20 W. While setting up that server, I decided I might as well figure out how to set up my own CA for all of the internal services I’m setting up, since I was getting tired of warnings from Firefox about “unsecured” connections due to self-signed certificates.

Most of the information I used to set up my own CA came from the OpenSSL CA documentation, which can be found here: https://openssl-ca.readthedocs.io/en/latest/index.html. However, it is missing the details on how to use Elliptic Curve keys. This post is an attempt to fill in that gap.

The process can be summarized as follows:

  1. Make a root CA key and certificate
  2. Make an intermediate CA key and certificate by requesting signing by root CA, and create an intermediate certificate chain
  3. Make individual server certificates by requesting signing by intermediate CA, and extend the certificate chain

Safeguarding Keys

It is important to protect all private keys because, if they leak, anyone holding them can sign certificates that pass as valid when checked against your CA certificates. This matters a little less when all of your certs are meant for your local network, but meh, might as well try to do something about it.

What I decided to do was to use LUKS to encrypt some filesystems on disk: one to store the root and intermediate CA material, and a second one to store server keys.

# Create the Root CA/Intermediate CA file, and the file for server keys
dd if=/dev/urandom of=ca bs=1M count=100 status=progress
dd if=/dev/urandom of=server bs=1M count=100 status=progress

# it will ask for a password-- remember it or store it somewhere secure/safe/encrypted
cryptsetup luksFormat ca
cryptsetup luksFormat server

# now open the volumes, create a filesystem on each, and mount them for use.
# Opening a volume will ask for its password
cryptsetup open ca ca-keys
cryptsetup open server server-keys
# the new volumes start out blank, so put a filesystem on each
mkfs.ext4 /dev/mapper/ca-keys
mkfs.ext4 /dev/mapper/server-keys
mkdir ca-keys
mkdir server-keys
mount /dev/mapper/ca-keys ca-keys
mount /dev/mapper/server-keys server-keys

It’s probably a good idea to backup the LUKS headers, in case something happens to them:

cryptsetup luksHeaderBackup ca --header-backup-file ca-keys.header.luks
cryptsetup luksHeaderBackup server --header-backup-file server-keys.header.luks

Make Root CA and its Certificate

The general outline of the steps can be found here: https://openssl-ca.readthedocs.io/en/latest/create-the-root-pair.html

cd ca-keys/
mkdir root/
cd root/
mkdir certs crl newcerts private
chmod 700 private
touch index.txt
echo 1000 > serial

At this point an openssl.cnf file needs to be created that will hold the default settings. It can be copied from https://openssl-ca.readthedocs.io/en/latest/root-configuration-file.html#root-configuration-file and modified as needed. For example, on my local install, I tweaked the dir and private_key paths, and changed the defaults for many of the *_default fields (e.g. countryName_default to US). I also tweaked the req default_bits to 4096.
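
For reference, the kind of lines I ended up changing look roughly like this (the values here are illustrative, using the section and key names from the linked configuration file, not my real setup):

[ CA_default ]
dir               = /path/to/ca-keys/root
private_key       = $dir/private/root-ca.key.pem

[ req ]
default_bits      = 4096

[ req_distinguished_name ]
countryName_default         = US
stateOrProvinceName_default = California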

With that in place, now it’s time to create the key in the ca-keys/root/ directory:

openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 -pkeyopt ec_param_enc:named_curve -aes-256-cbc -out private/root-ca.key.pem
# This will now ask for a password, set one and keep it safe!
chmod 400 private/root-ca.key.pem

Wait, why is that so messy? It turns out that only some elliptic curves are really widely supported online (e.g. Chrome doesn’t support secp521r1). The two that seem to be the best supported are secp384r1 and secp256r1, so I just arbitrarily-ish picked secp384r1. Also, we are using the -aes-256-cbc option to enable password protection on the key. There are other ciphers that may work (see the output of openssl enc -list for tentative options).
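
To see which named curves your OpenSSL build actually supports:

openssl ecparam -list_curves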

OK, now we have the key. Now we need a root certificate (we’re setting the expiry date far into the future, as recommended):

openssl req -config openssl.cnf -new -x509 -sha256 -extensions v3_ca -days 7300 -key private/root-ca.key.pem -out certs/ca.cert.pem
# Note the -x509 flag: the root certificate is self-signed, so this emits a certificate directly rather than a CSR
# This will request the password for the key, and then will ask for information about the certificate such as its common name, organization, etc.
chmod 444 certs/ca.cert.pem

To verify that the certificate looks ok:

openssl x509 -noout -text -in certs/ca.cert.pem

Make Intermediate CA and its Certificate

OK, now we can work on making the intermediate CA, which is the one that will actually be used to sign all of the certificates for the websites, etc. This should look familiar as it’s similar to what we did with the root certificate:

cd ../
mkdir intermediate
cd intermediate
mkdir certs crl csr newcerts private
chmod 700 private
touch index.txt
echo 1000 > serial
echo 1000 > crlnumber

Now, we can copy the base openssl.cnf for the intermediate certificate from https://openssl-ca.readthedocs.io/en/latest/intermediate-configuration-file.html#intermediate-configuration-file. Once saved, just as with the root openssl.cnf, variables can be tweaked (such as dir, *_default, default_bits).

Now it is time to create the intermediate CA key:

openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 -pkeyopt ec_param_enc:named_curve -aes-256-cbc -out private/intermediate.key.pem
# This will now ask for a password, set one and keep it safe!
chmod 400 private/intermediate.key.pem

And now to create and sign the intermediate certificate:

openssl req -config openssl.cnf -new -sha256 -key private/intermediate.key.pem -out csr/intermediate.csr.pem
# This will ask for the intermediate key password, and other information to fill into the certificate

cd ../root
openssl ca -config openssl.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in ../intermediate/csr/intermediate.csr.pem -out ../intermediate/certs/intermediate.cert.pem
# This will ask for the root CA password
chmod 444 ../intermediate/certs/intermediate.cert.pem

And to verify the newly created certificate:

openssl x509 -noout -text -in ../intermediate/certs/intermediate.cert.pem
# and check it against the root CA
openssl verify -CAfile ./certs/ca.cert.pem ../intermediate/certs/intermediate.cert.pem

Create the Certificate Chain

This is straight-ish forward (the order matters!). We need to concatenate the certificates of the root and the intermediate together to form the start of the certificate chain (the intermediate needs to go first, then the root):

# Changing directory to the root of the certs encrypted partition
cd ../
cat ./intermediate/certs/intermediate.cert.pem root/certs/ca.cert.pem > ./intermediate/certs/ca-chain.cert.pem

Great! At this point all of the initial setup is completed. From here on is a matter of creating site specific certificates, and installing the root certificate to all computers that need it.

Creating a Certificate for an Internal Site

Now we need to go into the server-keys encrypted folder we created earlier, as this is where we will store the certificates and keys for the individual sites as we create them (arguably, the better approach is to have each server create its own private key and just provide the CSR to the signing machine):

cd ../server-keys
mkdir site1.my.example
# Now create the key, with no password
# If a password is desired, add `-aes-256-cbc -pass stdin`
openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 -pkeyopt ec_param_enc:named_curve -out site1.my.example/site1.key.pem
chmod 400 site1.my.example/site1.key.pem

With the private key in place (it should probably be securely copied to the server for this domain), we now need to prepare the certificate signing request (CSR). We need the intermediate CA’s openssl.cnf file from earlier again. A subjectAltName is now effectively required by modern browsers if the certificate is to authenticate a website. Multiple names can be provided, separated by commas (e.g. DNS:site1.my.example,DNS:site1.local).

openssl req -config openssl.cnf -new -sha256 -key site1.my.example/site1.key.pem -out site1.my.example/site1.csr.pem -addext "subjectAltName = DNS:site1.my.example"

With the CSR in hand, we now need to go back to the intermediate CA to create the certificate:

cd ../ca-keys/intermediate/
openssl ca -config openssl.cnf -extensions server_cert -days 365 -notext -md sha256 -in ../../server-keys/site1.my.example/site1.csr.pem -out ../../server-keys/site1.my.example/site1.cert.pem
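
As with the intermediate certificate, it’s worth inspecting the result and verifying it against the chain before deploying it (run from the intermediate CA directory):

openssl x509 -noout -text -in ../../server-keys/site1.my.example/site1.cert.pem
openssl verify -CAfile certs/ca-chain.cert.pem ../../server-keys/site1.my.example/site1.cert.pem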

And with that we should now have a signed server certificate! The last thing to do is to create the certificate chain with this new certificate:

cd ../../server-keys/site1.my.example/
cat site1.cert.pem ../../ca-keys/intermediate/certs/ca-chain.cert.pem > site1.chain.pem
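
As an example of putting these files to use, a minimal nginx TLS block looks something like this (nginx is just one example consumer, and the paths are hypothetical):

server {
    listen 443 ssl;
    server_name site1.my.example;

    # The chain created above (server certificate first), and its private key
    ssl_certificate     /etc/ssl/private/site1.chain.pem;
    ssl_certificate_key /etc/ssl/private/site1.key.pem;
}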

And finally, with that we should have all of the certificates and chains needed to set up properly signed certificates at home or other locations. The only thing really left to do is to install the root CA certificate on any machines and browsers that should recognize any certificates signed by the intermediate CA as authentic.

Safeguarding Keys, Again: Unmount and Lock

After you’re done signing and copying certificates and chains, remember to unmount and lock the encrypted volumes!

# cd out of server-keys and ca-keys
umount server-keys
umount ca-keys
cryptsetup close ca
cryptsetup close server


Wait, is that an STM32 in my RGB controller?

One of the first things I tried to do with my Dell G5 5505 SE was to see if I could figure out what the RGB controller is doing. Looking through the files belonging to Dell in the Program Files directory on Windows, the Firmware directory stood out immediately. Looking into it, I found two Intel Hex formatted files:

$ ls Firmware/ELC/
elc-dfu-gcc-v0.0.2.hex*  elc-iar-v1.0.12.hex*

OK, that’s interesting. Unpacking them using objcopy:

$ objcopy -I ihex -O binary Firmware/ELC/elc-dfu-gcc-v0.0.2.hex ~/dfu.bin
$ strings ~/dfu.bin | tail
YI      }I
]JILHLOMMNJ
UfUcTKK
""K^}
K[XCP
       DFU Config
DFU Interface
STMicroelectronics
DFU in FS Mode
@Internal Flash   /0x08000000/6*02Ka,58*02Kg

Oh! So it’s not encrypted, and this firmware belongs to some sort of STM32 chip! What does the other firmware have in store for us?

$ objcopy -I ihex -O binary Firmware/ELC/elc-iar-v1.0.12.hex ~/iar.bin
gabriel@diamante /mnt/Gaia/Program Files/Alienware/Alienware Command Center  
$ strings ~/iar.bin | grep stm32
       c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_cortex.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_gpio.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_i2c_ex.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_rcc_ex.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_uart.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_dma.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_i2c.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_pcd.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_pwr.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_rcc.c
c:\jenkins_duvel\workspace\aw_elc_iar_prod\Drivers\STM32F0xx_HAL_Driver\Src\stm32f0xx_hal_spi.c

Oh. That’s nice and convenient. This is some sort of STM32F0 chip, with USB support. Additionally, we know Dell is using STM32’s HAL, or at least some version of it.

So, from all of this it really looks like Dell is using the DFU support from the STM32 to flash it. From our ACPI experimentation, we can force the STM32 to boot into DFU mode by forcing BOOT0 high and resetting the chip.

By using dfu-util and trial and error, I was able to determine that this chip has 128KB of flash. And after some poking around in the binary firmware, and comparing against the SDK implementation being used, I was able to narrow down the HAL version, and the chip must be some variant of the STM32F070xB.
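
For illustration, dumping the full 128KB of flash over DFU looks something like this (a sketch; the alternate setting and the size are the ones I landed on through the trial and error above):

# DfuSe upload (device to host): 128KB from the start of internal flash
dfu-util -a 0 -s 0x08000000:131072 -U flash-dump.bin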

I’ve studied the DFU part of the firmware in some depth, but I have not had the time to dive into the part that actually manages the USB hardware. There are bugs there that I’d like to fix (the controller crashes if there are too many incoming commands).

Back when I was working more actively on this, I wrote a section on the AW-ELC RGB controller in the OpenRGB wiki, based on online sources and my own experimentation. That information is enough to write code that talks to the controller over USB HID, programs animations, and changes the keyboard brightness.

I’ve made some sample applications on my gitlab showing how to talk to the controller on Linux. The project builds a few binaries, of which status, reset, and toggle are the most interesting. status prints all of the status registers of the RGB controller, reset restores the controller’s contents to the Dell G5 5505 SE default settings, and toggle cycles through the intensity values (0 being bright to 100 being dim, inclusive) stored in a little file at ~/.config/aw_elc/dim to control the keyboard brightness. I’ve been using toggle for some years now to toggle my keyboard brightness on and off, so hopefully this can be useful to someone else too (or if nothing else, as an example of how to talk to this RGB controller).


Don’t build standard libraries with LTO unless you really want to suffer

This post is mostly a reminder to myself.

A few years ago I ran into an interesting issue while working with a cross-compiler for ARM on my Gentoo Linux systems: effectively, I was getting a ton of undefined libc symbol errors. After a thread on the gcc-help mailing list, this was the summary of the outcome:

Gabriel Marcano:
Can newlib not be built with -flto?

...

Alexander Monakov:
It matters because the compiler recognizes several libc functions by name (including 'malloc'); recognizing such so-called "builtins" is necessary to optimize their uses. Unfortunately, builtins are different from common functions in LTO bytecode, and sometimes behave in unexpected ways. You can find examples searching GCC Bugzilla for stuff like "LTO built-in".

In your case the reference to built-in malloc is not visible to the linker when it invokes the plugin while scanning input files, so it does not unpack malloc.o from libc.a. This is a simple case of a more general problem where a compiler creates new references to built-in functions after optimizations (e.g. transforming 'printf("Hello\n")' to 'puts("Hello")'). Somehow it happens to work when libc is not slim-LTO (I guess it is rescanned), but rescanning for LTO code-generation is not implemented, as far as I know.

So no: apparently the tricks that GCC and the linker play to optimize strlen, malloc, etc. make it next to impossible to properly enable LTO for the standard library.
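
A quick way to check whether a static library carries (slim) LTO bytecode is to look for the .gnu.lto_* sections GCC embeds in the object files; a sketch, with the path to libc.a adjusted for your cross toolchain:

# Any .gnu.lto_* sections mean the archive members contain LTO bytecode
objdump -h libc.a | grep 'gnu\.lto'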

For Gentoo this means remembering to disable LTO for newlib builds for cross-compilers. This is done by making a /etc/portage/env/no-lto.conf file:

CFLAGS="${CFLAGS} -fno-lto"
CXXFLAGS="${CXXFLAGS} -fno-lto"
FFLAGS="${FFLAGS} -fno-lto"

And then creating a file somewhere under /etc/portage/package.env/ containing:

cross-*/newlib no-lto.conf

Dell G5 5505 SE ACPI, or figuring out how to reset the RGB controller

This is going to be a long one, and it has taken me a while to write up (I started writing this in 2022, and now we’re almost in 2025!). While this isn’t as polished as I’d have liked it to be, it’s been sitting as a draft for too long, and I almost want to get rid of this laptop, heh.

Four years ago my old 2013 MSI GE40 laptop’s battery finally kicked the bucket, and after swearing off Nvidia and wanting to try out a CPU from team Red, I acquired, against my better judgement, the only all-AMD laptop I could find at the time: a Dell G5 5505 Special Edition. Honestly, this laptop has been mostly OK (other than having to replace one of its cooling fans under warranty not even two weeks after getting the laptop, great QC there Dell). One of the first things that caught my attention was the full RGB backlit keyboard on the laptop, and I immediately wanted to see if I could get it working on Linux. That will probably be the subject of another blog post, though. However, in the process of reversing what the RGB controller was doing, I figured out that there was a way to reboot the RGB controller into DFU flashing mode, and that on Windows, Dell’s software did this via ACPI calls.

For almost all commands that follow, one needs the acpica or iasl package, ideally at least the 20240927 or R09_27_24 release.

Dumping and disassembling ACPI tables

On Linux, acpidump and acpixtract make quick work of this:

# Extracts all ACPI tables into the tables.acpi file
sudo sh -c "acpidump > tables.acpi"
# Extracts all tables individually into their own .dat binary file
acpixtract -a tables.acpi

At this point, there should be a ton of *.dat files in the working directory. For some reason this laptop has over 20 SSDT tables.

To then disassemble the files, I do the following (my shell is fish, so some modification may be required to work with bash):

#!/usr/bin/fish
for I in *.dat
        echo -----------$I
        iasl -e $(ls *.dat | grep -v $I) -d $I
end

In essence, we want to pass as many tables to the disassembler as possible so that it can find methods and other objects that the table being disassembled may use. We can’t pass the table being disassembled as one of the references, otherwise iasl complains about duplicate objects.
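
For reference, a rough bash equivalent of the fish loop above:

#!/bin/bash
for I in *.dat; do
        echo -----------"$I"
        iasl -e $(ls *.dat | grep -v "$I") -d "$I"
done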

Getting WMI information

Great, now we can inspect the tables. From some… poking around Dell’s files on Windows, I determined that Dell’s software updates the RGB firmware through some WMI ACPI functions. The Linux kernel has some pretty decent documentation on WMI, including stuff relating to Dell WMI functionality.

We need the bmf2mof utility to be able to parse some of the binary blobs exposed by the Linux kernel related to WMI methods. On my system, the right BMOF data for the WMAX function can be extracted as follows:

# As root or with root privileges
bmf2mof < /sys/bus/wmi/devices/05901221-D566-11D1-B2F0-00A0C9062910-1/bmof
[WMI, Dynamic, Provider("WmiProv"), Locale("MS\\0x409"), Description("WMI Function"), guid("{A70591CE-A997-11DA-B012-B622A1EF5492}")]
class AWCCWmiMethodFunction {
  [key, read] string InstanceName;
  [read] boolean Active;

  [WmiMethodId(19), Implemented, read, write, Description("Get Fan Sensors.")] void GetFanSensors([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(20), Implemented, read, write, Description("Thermal Information.")] void Thermal_Information([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(21), Implemented, read, write, Description("Thermal Control.")] void Thermal_Control([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(23), Implemented, read, write, Description("MemoryOCControl.")] void MemoryOCControl([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(26), Implemented, read, write, Description("System Information.")] void SystemInformation([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(32), Implemented, read, write, Description("FW Update GPIO toggle.")] void FWUpdateGPIOtoggle([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(33), Implemented, read, write, Description("Read Total of GPIOs.")] void ReadTotalofGPIOs([out] uint32 argr);
  [WmiMethodId(34), Implemented, read, write, Description("Read GPIO pin Status.")] void ReadGPIOpPinStatus([in] uint32 arg2, [out] uint32 argr);
  [WmiMethodId(36), Implemented, read, write, Description("Read Platform Properties.")] void ReadPlatformProperties([out] uint32 argr);
  [WmiMethodId(37), Implemented, read, write, Description("Game Shift Status.")] void GameShiftStatus([in] uint32 arg2, [out] uint32 argr);
};

Oh, that’s so nice. From here we know the GUID of the function A70591CE-A997-11DA-B012-B622A1EF5492 and all of the WMI methods and their IDs. With this information we can go find them in the ACPI tables.

For BIOS version 1.24.0, the WMAX function is in ssdt20.dsl. In there, there’s a large switch-case statement with IDs in hexadecimal that match up with the IDs in the BMOF.

Reimplementing ACPI GPIO methods to avoid screwing up their output

The ReadGPIOpPinStatus function (ID 34, or 0x22) uses the WISC method (defined in dsdt.dsl), which changes the direction of the GPIO pin to read it. This… is not useful if we just want to read the pin without affecting its function (which we can do), as changing its direction makes the GPIO act as though it was set to output low.

I implemented two new ACPI functions, WISI and WISO, to refactor WISC and to stop screwing up the GPIO output on a GPIO read call. One observation I made is that the “magic” address 0xFED81500 is the address of the gpio-amd-fch GPIO registers (found in the Linux kernel in drivers/gpio/gpio-amd-fch.c).

    Method (WISO, 2, Serialized)
    {
        Local0 = (Arg0 << 0x02)
        Local0 += 0xFED81500
        OperationRegion (GREG, SystemMemory, Local0, 0x04)
        Field (GREG, ByteAcc, NoLock, Preserve)
        {
            Offset (0x02),
                ,   6,
            OPVL,   1,
            OPEN,   1
        }
        OPEN = One
        OPVL = Arg1
    }

    Method (WISI, 1, Serialized)
    {
        Local0 = (Arg0 << 0x02)
        Local0 += 0xFED81500
        OperationRegion (GREG, SystemMemory, Local0, 0x04)
        Field (GREG, ByteAcc, NoLock, Preserve)
        {
            Offset (0x02),
            PSTS,   1
        }

        Local2 = PSTS /* \WISI.PSTS */
        Return (Local2)
    }

I then re-implemented method IDs 0x20 (32) and 0x22 (34) using these new methods. This should let us query the GPIO state without messing with the GPIO direction.

                    // Write NRST or BOOT0
                    // Name=FWUpdateGPIOtoggle
                    Case (0x20)
                    {
                        AXBF = Arg2
                        // BFB0 is pin select (1 == NRST, 0 == BOOT0)
                        If ((BFB0 == Zero))
                        {
                            If ((BFB1 == Zero))
                            {
                                WISO(0x05, Zero)
                            }
                            Else
                            {
                                WISO(0x05, One)
                            }
                        }
                        ElseIf ((BFB0 == One))
                        {
                            If ((BFB1 == Zero))
                            {
                                WISO(0x0A, Zero)
                            }
                            Else
                            {
                                WISO(0x0A, One)
                            }
                        }

                        Return (Zero)
                    }
                    // Name=ReadTotalofGPIOs
                    Case (0x21)
                    {
                        Return (0x02)
                    }
                    // Name=ReadGPIOpPinStatus
                    Case (0x22)
                    {
                        AXBF = Arg2
                        Local0 = 0x02
                        // BFB0 is pin select (1 == NRST, 0 == BOOT0)
                        // WISI does not switch the pin to input
                        If ((BFB0 == Zero))
                        {
                            Local0 = WISI (0x05)
                        }
                        ElseIf ((BFB0 == One))
                        {
                            Local0 = WISI (0x0A)
                        }

                        Return (Local0)
                    }

Fixing the other ACPI tables (is pain)

If one tries to just recompile the tables, it won’t work. Most of the tables have a ton of issues. At a bare minimum, however, dsdt.dsl and ssdt20.dsl (or whichever ssdt has the WMAX function) need to build. Unfortunately the process for fixing these tables is non-trivial and rather tedious. My best recommendation is to scour online for any help in trying to understand the iasl error messages.

One critical change that needs to happen is that the version number near the top of the source files needs to be increased, else the Linux ACPI table override system won’t bother trying to override the table.
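
Concretely, that’s the last argument of the DefinitionBlock line at the top of each .dsl file (the OEM ID and table ID below are placeholders):

// Before
DefinitionBlock ("", "SSDT", 2, "OEMID ", "TABLEID ", 0x00000001)
// After: bump the OEM revision so Linux accepts the override
DefinitionBlock ("", "SSDT", 2, "OEMID ", "TABLEID ", 0x00000002)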

Compiling the new table is as simple as doing:

# Example rebuilding of the DSDT table
iasl dsdt.dsl
# If successful, it emits dsdt.aml

If there are errors, iasl will report them. On success a .aml file is generated, and it contains the compiled table.

Hacking Linux: Overriding the right tables!

Alright, we have the tables re-compiled. Now what? The Linux kernel can override ACPI tables, but the mechanism it uses matches by table type… and we’re overriding at least one SSDT out of a bunch of SSDTs, so the kernel can’t tell them apart.

I’ve come up with what is arguably the worst hack I’ve implemented by far to work around this issue. My hack checks each table in the kernel against the original table’s size; if they match, the kernel prepares to replace the table. It is nasty, but it works, as none of the SSDT tables I’ve been overriding have the same original size.

Here’s the patch, for the 6 SSDT tables I’m overriding for BIOS version 1.24.0:

--- a/drivers/acpi/tables.c 2022-03-20 13:14:17.000000000 -0700
+++ b/drivers/acpi/tables.c 2022-04-13 16:37:55.389618306 -0700
@@ -688,6 +688,10 @@ acpi_table_initrd_override(struct acpi_t
    struct acpi_table_header *table;
    u32 table_length;
 
+   // FIXME Hacks by Gabriel M. to load a specific SSDT on a Dell G5505 SE
+   bool is_ssdt;
+   bool hack;
+
    *length = 0;
    *address = 0;
    if (!acpi_tables_addr)
@@ -705,7 +709,9 @@ acpi_table_initrd_override(struct acpi_t
        table_length = table->length;
 
        /* Only override tables matched */
-       if (memcmp(existing_table->signature, table->signature, 4) ||
+       is_ssdt = !memcmp(existing_table->signature, "SSDT", 4);
+       hack = is_ssdt && (existing_table->length == 0x517 || existing_table->length == 0x53B || existing_table->length == 0x723C || existing_table->length == 0x28D || existing_table->length == 0x30C8 || existing_table->length == 0xC6C);
+       if ((is_ssdt && !hack) || memcmp(existing_table->signature, table->signature, 4) ||
            memcmp(table->oem_id, existing_table->oem_id,
               ACPI_OEM_ID_SIZE) ||
            memcmp(table->oem_table_id, existing_table->oem_table_id,
@@ -713,6 +719,13 @@ acpi_table_initrd_override(struct acpi_t
            acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
            goto next_table;
        }
+
+       // FIXME Hack, skip matching all SSDT tables except specific one
+       if (is_ssdt && !hack) {
+           acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+           goto next_table;
+       }
+
        /*
         * Mark the table to avoid being used in
         * acpi_table_initrd_scan() and check the revision.

This hack really sucks because it needs to be updated with every BIOS update. I wish there were an existing way to differentiate SSDT tables, but I haven’t found one yet.

To actually have the Linux kernel override the ACPI tables, I’m using dracut to generate an initramfs image, and its acpi_override option to actually install the *.aml files as part of the initramfs. See the man pages for dracut for more details.
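
For reference, the dracut side can be as small as a drop-in config followed by regenerating the initramfs with dracut --force (the file name and table directory here are illustrative):

# /etc/dracut.conf.d/acpi-override.conf
acpi_override="yes"
acpi_table_dir="/usr/local/lib/acpi-tables"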

Implementing a debugfs for these newly found ACPI methods

OK, finally, we’re at a point where we can do something interesting. There’s probably a better way to do this, but I’ve implemented some code to allow calling these WMI functions through the Linux kernel, and I’ve exposed them through the debugfs interface.

I keep the patches that implement this in a fork of the kernel on gitlab. Once the module is loaded, it exposes the following files:

/sys/kernel/debug/dell_awcc/memory_volt
/sys/kernel/debug/dell_awcc/memory_freq
/sys/kernel/debug/dell_awcc/gameshift
/sys/kernel/debug/dell_awcc/boot0
/sys/kernel/debug/dell_awcc/nrst

I can’t vouch for the complete validity of the memory_* stuff, but boot0 and nrst control the GPIOs connected to the STM32 managing the RGB controller. echo-ing 0 drives the GPIOs low, and echo-ing 1 drives them high. So, to reboot the RGB controller into its DFU boot mode:

echo 1 > /sys/kernel/debug/dell_awcc/boot0
echo 0 > /sys/kernel/debug/dell_awcc/nrst
echo 1 > /sys/kernel/debug/dell_awcc/nrst

And this is the dmesg output of doing the above:

[17039.648454] usb 3-3.2: USB disconnect, device number 5
[17043.154318] usb 3-3.2: new full-speed USB device number 6 using xhci_hcd
[17043.242228] usb 3-3.2: New USB device found, idVendor=0483, idProduct=df11, bcdDevice=22.00
[17043.242235] usb 3-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[17043.242237] usb 3-3.2: Product: STM32  BOOTLOADER
[17043.242239] usb 3-3.2: Manufacturer: STMicroelectronics
[17043.242241] usb 3-3.2: SerialNumber: FFFFFFFEFFFF

Success! We’ve rebooted the RGB controller into its DFU programming mode, and its firmware can be dumped using dfu-util!

To turn the RGB chip back to its normal self:

echo 0 > /sys/kernel/debug/dell_awcc/boot0
echo 0 > /sys/kernel/debug/dell_awcc/nrst
echo 1 > /sys/kernel/debug/dell_awcc/nrst

And this is my kernel output:

[17144.938660] usb 3-3.2: reset full-speed USB device number 6 using xhci_hcd
[17145.014269] usb 3-3.2: device firmware changed
[17145.014859] usb 3-3.2: USB disconnect, device number 6
[17145.194627] usb 3-3.2: new full-speed USB device number 7 using xhci_hcd
[17145.285273] usb 3-3.2: New USB device found, idVendor=0483, idProduct=df11, bcdDevice= 2.00
[17145.285283] usb 3-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[17145.285288] usb 3-3.2: Product: DFU in FS Mode
[17145.285292] usb 3-3.2: Manufacturer: STMicroelectronics
[17145.285295] usb 3-3.2: SerialNumber: 206D335B5353
[17146.912500] usb 3-3.2: USB disconnect, device number 7
[17147.080577] usb 3-3.2: new full-speed USB device number 8 using xhci_hcd
[17147.176269] usb 3-3.2: New USB device found, idVendor=187c, idProduct=0550, bcdDevice= 2.00
[17147.176283] usb 3-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[17147.176289] usb 3-3.2: Product: AW-ELC
[17147.176295] usb 3-3.2: Manufacturer: Alienware
[17147.176300] usb 3-3.2: SerialNumber: 00.01
[17147.269992] hid-generic 0003:187C:0550.0005: hiddev96,hidraw0: USB HID v1.11 Device [Alienware AW-ELC] on usb-0000:07:00.4-3.2/input0

I do wonder, now that I know what GPIO pins are connected to the RGB controller, if we should just… bypass the ACPI table and talk to the controller directly. Maybe that’s an experiment for another time.