Preamble / back history
My homelab has gone through quite a few migrations over the years. For a long time I ran FreeBSD on a spare PC. Eventually I got an HP Microserver G8 to turn it into more of a NAS. At some point I got fed up of having to manage Samba myself, so reinstalled it with FreeNAS (which later became TrueNAS CORE). I used FreeNAS' ability to run FreeBSD Jails to run some of the other services such as Plex.
Fast-forward to 2022 and I was starting to get a bit fed up of TrueNAS CORE for various reasons and I wanted to add more storage to the NAS, so I got a QNAP TS-673A NAS.
In 2023 we had solar PV and battery storage installed in the house. After some experimentation of trying to read and control the solar inverter from a Go program I hacked together, I eventually decided to try Home Assistant as it seemed to be the thing a lot of people used for this. The Home Assistant devs really push you towards Home Assistant OS, so as I was new to this I went that route, creating a VM on the QNAP NAS.
Migration number 1: HAOS to container
At some point I didn't want to run a VM on the NAS anymore. I don't remember why exactly.
I had an Intel NUC that I originally got as a play box for doing random dev work on; I use Macs as my desktop/laptop computers, but sometimes I need to work on a Linux machine, so I got the NUC as a small, not so powerful, low power machine. I flip-flopped between NixOS and Arch Linux on this NUC, but finally settled on NixOS.
Because NixOS opened up possibilities of doing more things with it through its declarative configuration, I started adding services to the NUC. (I didn't when running Arch because I was loath to do configuration management for what was originally a throwaway machine.)
So, I ran a backup in Home Assistant, set up NixOS to run the HA container, imported the backup, and that was that. HA was now running in a container, freeing up resources on the NAS. Later on I also wanted more long-term data from HA, and to use Grafana to create dashboards of the solar/battery setup, rather than using HA's (frankly not very useful) dashboards, so the NUC also gained Prometheus and Grafana.
Because I was now running 3 services, each running on their own ports, I set up Traefik as well to reverse proxy them, which meant I could also have TLS and stop my browser yelling at me. So now that original play box is a fully-fledged server.
Why migrate back to HAOS?
Frankly this NUC is underpowered for what it's now doing, and any time HA starts doing intensive things the fan on the NUC spins to max. That teeny little 40mm fan is very whiny. The Core i5 CPU in the NUC is OK for most things, but there are some tasks for which it's just too slow.
A while ago Michael Stapelberg blogged about his mini PC VM host. I was very intrigued by this, especially given the powerful CPU and very low power consumption. After looking into it some more, I decided I'd go down the same route - ordering almost identical components. Like Michael, I installed Proxmox VE on the DeskMini. I could have stuck with bare NixOS, but once in a while there are things I'd like to do in Home Assistant that are easier if you follow the recommended approach of using HAOS - such as using "addons" (which basically creates another container on the HAOS host). I can do some of that manually, but there are a few cases where it was too annoying and I got rid.
Which takes us to...
Migration number 2: Back to HAOS
I created a VM for Home Assistant following this guide. The qcow2 disk image provided by HA is only 32GB, so I grew that to 300GB before booting. My HA state database alone is 25GB, and I need space for backups (of which there's around 100GB of backup tarballs, each a bit over 7GB).
Problem number 1: HA backups are too slow
As my HA database records solar PV power, house energy usage, etc. I want to keep downtime for the migration to a minimum. If I was only doing basic home automation tasks I wouldn't be bothered about having it down for a while, but I like having this near-realtime data about the house.
Looking at Home Assistant's backups I realised it was taking around 1h45m to create each backup! This seems silly for what I thought was a simple tar/compress, even with the 25GB SQLite file.
Why's it so slow? In HA 2025.1 they enabled encryption for backups by default. Generally this is a good thing! But as all my backups are on the same disk (I treat them more as point-in-time snapshots than actual backups) and never go off-site, I don't need encryption. Release 2025.2 brought the ability to disable encryption. I'm running 2025.6, so I did that. This brought the backup time down to around 15 minutes.
15 minutes is still too slow for my liking. Where's it getting stuck? Oh, gzip. And Python. Gzip is single-threaded, so the Python-based tar/compress they use consumes a whole CPU core, and no more, to compress that 25GB SQLite file.
I know I can do better! One of my favourite tricks for situations like this is to offload work to other machines by streaming data over the network using netcat. I ultimately need the backup on my Mac anyway to upload to the new HA install, so I set up this pipeline:
On the current NixOS machine hosting HA:
time tar -cvf - \
--exclude='backups/*.tar' \
--exclude='*.db-shm' \
--exclude='*.log' \
--exclude='*.log.*' \
--exclude='tts/*' * | nc -Nl 2222
And on my Mac:
nix shell nixpkgs#{gzip,pigz}
time nc sadpunk 2222 | gzip > homeassistant.tar.gz
That took around 9:28. Not bad, but I know it can be better. That gzip is still single-threaded, even if it's running on a faster P-core on my M2 Pro than on the Core i5.
Quick explanation of those commands:
The time builtin is used to measure how long the command specified takes
The tar command is creating (-c) with verbose output (-v, meaning it prints out the name of each file it's adding) to standard output (-f -, where the - is special and means "write to the file descriptor for stdout"), and excluding from the backup all files that HA also excludes; and finally a glob (*) to add all files in the current directory (run from within HA's data directory) to the tar command
I found the exclusions by digging through HA's code
The nc (netcat) command says to shutdown the network socket (-N) when stdin sends EOF, and to listen on port 2222 (-l 2222)
On my Mac I create a new temporary shell to use the gzip and pigz packages from nixpkgs
This is a super-power of Nix that I love. If I need a piece of software temporarily, I don't need to install it globally, I can create a temporary shell to add it, and when I exit that shell the software is no longer available (but is technically still in the Nix store, until it gets garbage collected)
The {gzip,pigz} is a shell expansion trick which turns nixpkgs#{gzip,pigz} into nixpkgs#gzip nixpkgs#pigz - it avoids having to write the nixpkgs# out twice
Finally, we run a netcat receiver, connecting to the NixOS host (sadpunk) on port 2222
This streams all data received over the network to stdout
The output from netcat is piped (|) to gzip to compress the stream and then the output redirected to a file (> homeassistant.tar.gz)
TL;DR: Create a tar file on one machine, but instead of writing to disk, stream it over the network, receive that on a remote machine and write to a file on that machine's disk, compressing as you go.
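The tar-to-stdout half of that pipeline can be tried locally without netcat. This is a minimal sketch, assuming a throwaway directory with made-up file names (not HA's real layout) and a subset of the exclusions; the pipe into gzip stands in for the network hop.

```shell
# Local stand-in for the streaming pipeline: tar -> pipe -> gzip, no network.
# The directory layout below is invented purely for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/data/backups" "$demo/data/tts"
echo 'config'     > "$demo/data/configuration.yaml"
echo 'old backup' > "$demo/data/backups/old.tar"
echo 'log line'   > "$demo/data/home-assistant.log"
echo 'audio'      > "$demo/data/tts/cache.mp3"

# tar writes the archive to stdout (-f -), the exclusions drop unwanted
# files, and gzip compresses the stream on the receiving end of the pipe.
(cd "$demo/data" && tar -cf - \
    --exclude='backups/*.tar' \
    --exclude='*.log' \
    --exclude='tts/*' \
    *) | gzip > "$demo/homeassistant.tar.gz"

# List what actually ended up in the archive.
members=$(gzip -dc "$demo/homeassistant.tar.gz" | tar -tf - | sort)
echo "$members"
```

Listing the archive afterwards is a cheap way to confirm the exclusions did what was intended before spending minutes streaming the real thing.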
Let's try with a gzip implementation called pigz, which can use multiple cores:
time nc sadpunk 2222 | pigz > homeassistant.tar.gz
3:45. Much better. I can accept this.
The HA backup is actually another tar file containing the one above, and a file called backup.json that contains metadata about the backup (when it was created, a description, backup ID, is it encrypted, etc.). I grabbed a backup.json from an existing backup to be sure I got the format right, changed the description, time, then created the wrapper:
tar -cf backup.tar backup.json homeassistant.tar.gz
Problem number 2: Home Assistant throws HTTP 500 uploading the backup
Trying to upload the file from the HA "onboarding" screen chugged for a bit, then gave an unhelpful 500. There was nothing on the screen or in the browser console about what went wrong. Checking Firefox devtools on the 500 I saw:
Unknown error, see supervisor
Thanks? How do I do that?
After some poking on the VM's console I finally found something:
2025-09-13 14:07:06.321 ERROR (MainThread) [supervisor.backups.backup] Can't read backup tarfile /data/tmp/tmp9ge0_aol/upload.tar: "filename './backup.json' not found"
What do you mean backup.json not found? It's right there. I can see it in the tar file.
Ohhhh. Oh no. It says ./backup.json. I created the tar with backup.json. Tar includes paths if specified, and HA is being spectacularly dumb when reading the file.
Fine, let me "fix" the tarball.
tar -cf backup.tar ./backup.json homeassistant.tar.gz
Now I can fully upload the file without getting a 500. Success? Not quite.
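The member-name difference is easy to reproduce. A minimal sketch, assuming placeholder file contents in a temporary directory; the archive names plain.tar and dotted.tar are made up for the demo.

```shell
demo=$(mktemp -d)
echo '{}'      > "$demo/backup.json"
echo 'payload' > "$demo/homeassistant.tar.gz"

# Same file, two spellings on the command line.
tar -C "$demo" -cf "$demo/plain.tar"  backup.json   homeassistant.tar.gz
tar -C "$demo" -cf "$demo/dotted.tar" ./backup.json homeassistant.tar.gz

# tar stores each path as it was given, so the first member name differs.
plain_first=$(tar -tf "$demo/plain.tar"  | head -n 1)
dotted_first=$(tar -tf "$demo/dotted.tar" | head -n 1)
echo "plain:  $plain_first"
echo "dotted: $dotted_first"
```

Running tar -tf on a backup made by HA and on a hand-built one, side by side, shows this kind of mismatch without needing a hex dump.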
Problem number 3: Home Assistant thinks the backup is encrypted
I definitely didn't encrypt this! But when I try to load my backup.tar into the new HA it asks me for an encryption key. Huh?
I grabbed a backup made by HA, extracted the homeassistant.tar.gz and checked the file metadata of both. Both definitely pass the sniff test as compressed tar files, so let's look at the file structure:
% hexdump -C good/homeassistant.tar.gz | head -2
00000000 1f 8b 08 08 d0 4a c5 68 00 ff 68 6f 6d 65 61 73 |.....J.h..homeas|
00000010 73 69 73 74 61 6e 74 2e 74 61 72 00 ec bd 0b 9c |sistant.tar.....|
% hexdump -C bad/homeassistant.tar.gz | head -2
00000000 1f 8b 08 00 00 00 00 00 00 03 ec bd db 8e 24 c7 |..............$.|
00000010 91 20 aa e7 fa 8a 98 1e 08 45 6a c4 6a bf 5f 0a |. .......Ej.j._.|
Ooh, interesting. In the backup made by HA we see the tar file name listed, but in mine we don't; just some apparently random bytes.
What was different? Well, I used pigz instead of gzip, and I didn't directly compress a tar file. Let's experiment.
1. I switched back to using gzip; same result (no filename).
2. Stream the tar file over the network again, but write that directly to disk instead of piping straight into gzip/pigz; then compress separately.
Number 2 was the answer. Now hexdump says:
00000000 1f 8b 08 08 61 7d c5 68 00 03 68 6f 6d 65 61 73 |....a}.h..homeas|
00000010 73 69 73 74 61 6e 74 2e 74 61 72 00 ec bd db 8e |sistant.tar.....|
So this is better, but HA still thinks the backup is encrypted. Sigh.
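The header difference in those hexdumps comes down to the fourth byte of the gzip stream, the flags byte, where 0x08 means "an original file name follows". A minimal sketch that reproduces it, assuming a gzip that stores names by default (GNU gzip does; other implementations may not); flag_byte is a helper made up for this demo.

```shell
demo=$(mktemp -d)
printf 'not really a tar file\n' > "$demo/homeassistant.tar"

# Hypothetical helper: print the 4th byte of a gzip stream (the FLG byte) as hex.
flag_byte() { od -An -tx1 -j3 -N1 | tr -d ' \n'; }

# Compressing a named file: gzip knows the name and can record it.
from_file=$(cd "$demo" && gzip -c homeassistant.tar | flag_byte)
# Compressing stdin: there is no name to record.
from_stdin=$(gzip -c < "$demo/homeassistant.tar" | flag_byte)

echo "named file: $from_file"
echo "stdin:      $from_stdin"
```

This matches the 1f 8b 08 08 versus 1f 8b 08 00 prefixes above: anything compressed from a pipe has no name to embed, whichever compressor is on the other end.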
I did some more head-scratching. Let's try taking a homeassistant.tar.gz and backup.json we know are good (ones created by HA), and re-wrapping them in a tar created by me:
tar -xf good/backup.tar
tar -cf test.tar ./backup.json homeassistant.tar.gz
This should be good, right? Nope, it thinks it's encrypted. What the hell.
Well, now the only difference is the wrapped tar file. Let's inspect that.
% hexdump -C test.tar| head -2
00000000 2e 2f 2e 5f 62 61 63 6b 75 70 2e 6a 73 6f 6e 00 |./._backup.json.|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
Uh, where'd that underscore come from in the filename for backup.json? Let's compare against an original backup:
% hexdump -C Custom_backup_2025.6.3_2025-09-13_11.43_28406136.tar | head -2
00000000 2e 2f 62 61 63 6b 75 70 2e 6a 73 6f 6e 00 00 00 |./backup.json...|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
So, something in my tar command is screwing things up?
Oh, wait. I'm on a Mac. That uses BSD tar, but on Linux it's GNU tar (or whatever the hell Python is really using, which uses or emulates GNU tar). BSD vs. Linux rears its head once again.
Fine, let's spawn a shell with GNU tar and try again.
% nix shell nixpkgs#{gzip,pigz,gnutar}
% which tar
/nix/store/3l2jw31r5051xiwmz5gr2d71lwb8zv5d-gnutar-1.35/bin/tar
% tar -cf ./test.tar ./backup.json homeassistant.tar.gz
% hexdump -C test.tar| head -2
00000000 2e 2f 62 61 63 6b 75 70 2e 6a 73 6f 6e 00 00 00 |./backup.json...|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
Bingo! I uploaded this, and HA finally saw it as a valid backup that I could restore from. Right..?
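Rather than reading hexdumps, the stray ._ members can be spotted by listing the archive. A minimal sketch, assuming a placeholder backup.json; has_appledouble is a hypothetical helper name, not something from HA or tar.

```shell
# Hypothetical helper: succeeds if the tar archive in $1 contains
# AppleDouble ("._name") members like the ./._backup.json seen above.
has_appledouble() {
    tar -tf "$1" | grep -Eq '(^|/)\._'
}

demo=$(mktemp -d)
echo '{}' > "$demo/backup.json"
tar -C "$demo" -cf "$demo/test.tar" ./backup.json

if has_appledouble "$demo/test.tar"; then
    result="contains ._ entries"
else
    result="clean"
fi
echo "$result"
```

On Linux with GNU tar this should report a clean archive; on a Mac the result depends on whether the files carry extended attributes. The macOS tar man page also documents a COPYFILE_DISABLE environment variable for suppressing these entries, though that route was not what was used here.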
Problem 4: The restore does nothing?
At this point I want to throw all computers (or at least shitty Python software projects) into the fucking sea.
Restoring the backup chugged away for a few minutes, HA restarted itself and... went back to the onboarding screen? But it should have restored everything exactly and let me log in as me. Why is it thinking it's still a brand new install?
Wondering if I still had to go through onboarding anyway, I did that, let it create a new user account (using a different username, just in case), and poked around. There's no sign that it restored anything. No users were restored, none of my integrations or devices are there.
... But some of my custom entities exist? Automations are still missing. What the actual fuck? Poking into HA's logs I see some vague references to some of my custom things, so it did restore something.
I went back to the VM console and typed login from the ha > prompt to get a shell. (HA devs: why is that called login? You don't need to log in.) I poked around and found that the HA data directory is under /mnt/data/supervisor/homeassistant. All my stuff is definitely there, including the 25GB home-assistant_v2.db SQLite file. So why is almost everything missing?
What's in that database file? Is it actually everything?
I extracted the database from a backup:
% tar zxf homeassistant.tar.gz home-assistant_v2.db
And then opened the database and dumped the schema:
% sqlite3 home-assistant_v2.db
SQLite version 3.50.2 2025-06-28 14:00:48
Enter ".help" for usage hints.
sqlite> .schema
This shows me all the CREATE TABLE DDL statements to give me an idea what's stored in it. It's all about events, states, statistics. So this is just the history of all the entities (i.e. timeseries-ish data). My user isn't in here. So where is it? Let's poke around on the existing HA machine then.
[root@sadpunk:/var/lib/homeassistant/data]# grep -r jamesog
.storage/auth_provider.homeassistant: "username": "jamesog",
.storage/auth: "username": "jamesog"
^C
Uh. OK. A hidden directory called .storage that seems to contain a bunch of files containing JSON data, which seems to be some cross between configuration and a database. Weird, but OK. And now I understand. Let's go back to the glob I used to create the manual backup:
tar -cvf - \
--exclude='backups/*.tar' \
--exclude='*.db-shm' \
--exclude='*.log' \
--exclude='*.log.*' \
--exclude='tts/*' *
That * does not capture hidden files (starting with a .) and, although I'd noticed a few hidden files/dirs, I didn't think about it when creating the glob.
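The glob behaviour is quick to confirm in a scratch directory. A minimal sketch, assuming default shell options (no dotglob or similar) and a made-up set of three entries; the .[!.]* pattern is one common way to match dotfiles while skipping . and .., shown here as an alternative to naming them one by one.

```shell
demo=$(mktemp -d)
mkdir "$demo/.storage"
touch "$demo/.HA_VERSION" "$demo/configuration.yaml"

cd "$demo"
# A bare * skips names that start with a dot...
set -- *
plain_count=$#
# ...so hidden entries have to be named or matched explicitly.
set -- * .[!.]*
all_count=$#
echo "plain glob: $plain_count, with dotfiles: $all_count"
```

Counting what the glob expands to before handing it to tar would have flagged the missing .storage directory straight away.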
Soooo let's try doing the whole backup dance once again! This time I decided to actually stop HA as creating a backup of a live, in-use SQLite database isn't the best idea, and I noticed that sometimes the SQLite WAL file (database transaction journal) was large in the backups. I want to minimise the chance of having a corrupted database.
systemctl stop podman-homeassistant.service; \
time tar -cvf - \
--exclude='backups/*.tar' \
--exclude='*.db-shm' \
--exclude='*.log' \
--exclude='*.log.*' \
--exclude='tts/*' \
* .cloud .storage .HA_VERSION | nc -Nl 2222; \
systemctl start podman-homeassistant.service
This time I use a shell one-liner to stop the Home Assistant service, kick off the tar/stream, then start HA again once netcat finishes. Then I did the dance on the other side to create the wrapper tarball, uploaded that to HA, started the restore and...
SUCCESS!
I finally have a fully restored Home Assistant. I can log in as my own user. All my integrations (including custom stuff via HACS) are present, all data seems to be there, it's pulling data from my solar inverter. Amazing.
What a saga.
Addendum: Disclosure on use of LLM to debug
Throughout this adventure I used Claude as a bit of a rubber duck and to bounce ideas off. I've been playing with Claude to see how useful these LLMs really are. Claude is the one most programmers seem to prefer. I'm generally an "AI skeptic" but skepticism is nothing without evidence, so rather than just saying "AI BAD" I'm trying to use the tools to find their strengths and weaknesses. I still don't like the ethics of how all the AI companies source their data.
Most of the things Claude told me to try were pretty off-key and obviously wouldn't help. My initial prompt wasn't specific enough, and nor were subsequent prompts, but after a bit of back and forth it gave the idea of using hexdump to check the tar.gz headers. I would have got there eventually (I usually use xxd rather than hexdump but they're basically the same), but Claude's suggestion probably saved me about 30 minutes.
When it came to rubber-ducking the issue with the missing filename in the tar.gz (when I did nc | pigz > ha.tar.gz) it hallucinated about how to fix that, telling me to try:
1. pigz -n (wrong); or
2. pigz --name homeassistant.tar > homeassistant.tar.gz (also wrong); or
3. to go back to gzip (also wrong); or
4. to use pigz but then uncompress with gunzip and then re-compress with gzip -n homeassistant.tar?! (very wrong, and defeats the point of using pigz)
I told it to go read the manual for pigz because option 1 was clearly wrong. That gave me these options:
1. nc | pigz -N > homeassistant.tar.gz (wrong; pigz had no input filename because of stdin)
2. nc > homeassistant.tar; pigz homeassistant.tar
3. nc | pigz --name homeassistant.tar > homeassistant.tar.gz (we've been here before)
Option 2 turned out to be the right one as we saw above. I knew this was a possibility, but didn't want to have to go that route as it would result in the process being slower. Claude helped me prove this was the only option. Again, small time saver.
For the difference between BSD vs GNU tar, I already knew about the differences in format, but asked Claude to explain the difference in formats just to see what it would give me. Here it actually gave me a very useful (and correct) answer on the differences, especially with BSD tar on a Mac and why the ._backup.json has the underscore. (TL;DR extended attributes, POSIX, magic numbers).