Last year, I sold my laptop. I owned it for about 15 months, and in that time, I took it outside on maybe three occasions. The laptop hardly left my desk. I didn’t even open the lid, because most of the time, it was just plugged into an external monitor, speakers, keyboard, and mouse. So, I decided to ditch the laptop on Craigslist and get a Mac mini instead. My first Mac mini broke, so I ordered another. The process was really simple: there are three Apple Stores within a 15-minute drive of my apartment, and all of them stock my base-model Core i5 Mac mini. There’s even a courier service where Postmates will hand-deliver Apple products right to your door within 2 hours if you live close enough. I liked the extra processor cores, the extra I/O ports, the smaller desk footprint, and the lack of laptop-specific problems (like battery health, a fragile keyboard, or thermal constraints). So, for the first time in a while, I didn’t own a laptop1 and everything was fine. But over the holidays, I found the old abandoned MacBook Pro2 that I left at my parents’ house, and since I didn’t have a computer with me, I reclaimed it.
Until now, I had only ever used one Mac and one Linux machine. I used the Mac for web browsing, writing code, and watching videos, while I used the Linux machine for building code and staging big uploads. My workflow didn’t require two Macs, so I hadn’t really thought about how I’d incorporate an extra laptop. I want to use the Mac mini while I’m at home, but also get an equivalent experience on the MacBook for the few occasions when I need to leave the house with a computer. Luckily, I’ve worked on a few projects over the years that make it really easy to expand from one Mac to two. Specifically, I want:
- The same installed software and settings
- Bi-directional syncing of my append-only folder of memes and screenshots
- My git repositories of source code and config files
The third item is pretty easy, since I’ve always pushed my git repositories to the home folder on my Linux machine (where they get backed up into Ef), so I’ll just talk about how I accomplish the first two with a few bits of custom software.
Provisioning a Mac
When talking about data backups, I think a lot of people overlook their installed software and settings. “Setting up” your new computer might sound like fun, but since I re-install macOS on a 6-month cadence, it’d be tedious to re-install the same programs and configure the same settings every single time. Plus, a lot of Mac system settings don’t automatically sync between computers, which means an inconsistent experience for anyone with multiple Macs. Luckily, macOS has Homebrew and other software package managers. But my personal Mac provisioner “mini” goes one step further3.
I wrote mini with three goals. First, it should have a very realistic “dry run” mode, which will show me proposed changes before actually doing them. Second, it should be able to recover from most kinds of errors, because until my Mac is set up, I’d probably have a hard time fixing any bugs in its code. Third, it should automate every part of Mac setup that can reasonably be automated. I’ve seen lots of host provisioner software with superficial or missing “dry run” modes, which means running the provisioner carries a lot of risk of breaking your system. I wanted mini to be executed on a regular basis, so I could use it to keep the software on both of my Macs in sync.
Among the things that mini does are:
- Install homebrew packages (and homebrew itself)
- Link my dotfiles
- Clone my git repositories and set their remotes appropriately
- Extend the display sleep timer to 3 hours while on A/C power
- Choose between natural scrolling (MacBook) and reverse scrolling (Mac mini)
- Increase my mouse tracking and scrolling speed
- Set my Dock preferences and system preferences
- Configure passwordless sudo for the Mac firewall CLI
- Disable IPv6 temporary addresses
- Prepare GnuPG to work with my YubiKey
- Change the screenshot image format from png to jpeg
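Many of these steps boil down to shelling out to macOS tools like defaults. Here’s a minimal sketch of the pattern, using the screenshot-format change as an example (the com.apple.screencapture domain and type key are real macOS defaults; the function names and dry-run wiring are just illustrative, not mini’s actual code):

```python
import subprocess

def defaults_write_cmd(domain: str, key: str, value: str) -> list[str]:
    """Build the `defaults write` invocation for a single setting."""
    return ["defaults", "write", domain, key, value]

def apply_setting(domain: str, key: str, value: str,
                  dry_run: bool = True) -> list[str]:
    cmd = defaults_write_cmd(domain, key, value)
    if dry_run:
        # Propose the change without touching the system.
        print("would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd

# In dry-run mode this only prints the proposed command for review:
apply_setting("com.apple.screencapture", "type", "jpg")
```

In dry-run mode nothing touches the system; the proposed command is just printed so it can be reviewed before a real run.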
When I set up a new Mac, the first thing I do is download the latest build of mini from my NUC and run it. Once it’s done, I’ll have my preferred web browser and all of my source code on the new machine, so I can continue with all of the provisioning steps that can’t be automated (but can be documented, of course). This includes things like installing apps from the Mac App Store and requesting certificates from my personal PKI hierarchy.
Like its predecessor, mini is written in Python. But since it’s no longer open source, I’m able to make use of all of my personal Python libraries for things like plist editing, flag parsing, and formatted output. I package this code into a single file using Bazel and the Google Subpar project, so it can be easily downloaded via HTTP to a new Mac.
Most of mini’s functionality is bundled into Task classes, which implement one step of the provisioning process (like installing a Homebrew package). There are no dependencies allowed between tasks, so the Homebrew package installation task will also install Homebrew itself if needed. All of the Task classes provide Done() → bool and Do(dry_run: bool) methods. Additional arguments can be passed to the constructor. Additionally, every task has a unique repr value, which is used for logging and to allow me to skip particular tasks with a command-line flag.
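That interface can be sketched in a few lines. The Done() → bool and Do(dry_run) shape comes from the description above; the concrete class, the stubbed-out Homebrew state, and the driver loop are illustrative assumptions:

```python
from abc import ABC, abstractmethod

class Task(ABC):
    @abstractmethod
    def Done(self) -> bool:
        """Return True if this step has already been applied."""

    @abstractmethod
    def Do(self, dry_run: bool) -> None:
        """Apply the step, or just describe it when dry_run is set."""

    def __repr__(self) -> str:
        # Each task gets a unique repr for logging and for skip flags.
        return type(self).__name__

class HomebrewPackageTask(Task):
    def __init__(self, package: str, installed: set[str]):
        self.package = package
        self.installed = installed  # stand-in for querying `brew list`

    def Done(self) -> bool:
        return self.package in self.installed

    def Do(self, dry_run: bool) -> None:
        if dry_run:
            print(f"would run: brew install {self.package}")
        else:
            self.installed.add(self.package)  # real code would shell out

    def __repr__(self) -> str:
        return f"HomebrewPackageTask({self.package})"

# Driver loop: skip finished tasks, propose the rest.
tasks = [HomebrewPackageTask("git", {"git"}),
         HomebrewPackageTask("tmux", set())]
for task in tasks:
    if not task.Done():
        task.Do(dry_run=True)
```

Because every task knows whether it’s already done, the same provisioner can be re-run safely to keep both Macs converged.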
For consistency, mini provides utility functions for running commands, taking ownership of files, and asking for user input. Some of these functions also implement their own dry run modes, so Task classes can decide which commands to run and which to just simulate. There’s also a framework for trace logging, which collects useful information for error reporting and debugging. There are lots of places where steps can be retried or skipped, which improves mini’s robustness against unexpected errors.
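The retry idea, for example, might look something like this (the helper name is hypothetical, and real code would catch narrower exceptions and offer an interactive skip):

```python
def with_retry(step, attempts: int = 3):
    """Run `step`, retrying on failure; give up after `attempts` tries."""
    last_error = None
    for i in range(attempts):
        try:
            return step()
        except Exception as e:  # real code would catch narrower errors
            last_error = e
            print(f"attempt {i + 1} failed: {e}")
    raise last_error

# A step that fails once, then succeeds on the second attempt:
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 2:
        raise RuntimeError("transient failure")
    return "ok"

print(with_retry(flaky))
```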
I’ve always had a lot of skepticism about file syncing software. For example, if your sync client wakes up to find a file missing, then it’ll assume that you deleted it, which causes the deletion to be synced to your other computers. You won’t even notice this happening unless you regularly look at your sync activity. I think file syncing should be deliberate, so a human gets a chance to approve the changes, and explicit, so syncing only occurs when files aren’t being actively changed. Git repositories already have these two properties, since changes aren’t synced until you commit and push them. But file syncing software tends not to require this level of attention.
My file syncing needs are a bit unique, because I only add files. I never delete files, and I rarely ever edit files after the first version. These limitations make it easier to guarantee that my files aren’t corrupted, no matter how many times they’re transferred to new computers over the years. These files include scanned receipts, tax documents, medical records, screenshots, and lots of memes. In other words, they’re mostly small media files. On the other hand, I store frequently changed files in my git repositories, and I store big media files in my reduced redundancy storage system (so they can be offloaded when not needed). Previously, I just synced these files from my Mac to my Linux machine, where they’re backed up into Ef. Now, with the addition of my new Mac, I need to ensure that both of my Mac computers have updated copies of these files.
File syncing starts with the ingestion of new data into the sync folder. My “bar” script identifies new files on my desktop or in my Downloads folder, then moves them into the appropriate subfolder. This step also includes renaming files as needed (for example, adding timestamp prefixes to scanned documents to ensure that file names are unique). Next, I pull down any new files from my Linux machine using rsync and its --ignore-existing flag. I also pull down the SHA1 sums file from my Linux machine and put it in a temporary directory.
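A rough sketch of that ingestion step, assuming a timestamp-prefix naming scheme and a flat source folder (the folder layout, prefix format, and function names are all guesses, not the actual bar script):

```python
import datetime
import pathlib

def timestamped_name(name: str, when: datetime.datetime) -> str:
    """Prefix a file name with a timestamp so names stay unique."""
    return f"{when:%Y%m%d-%H%M%S}-{name}"

def ingest(downloads: pathlib.Path, dest: pathlib.Path,
           when: datetime.datetime, dry_run: bool = True) -> list[str]:
    """Propose (or perform) moves of new files into the sync folder."""
    moves = []
    for f in sorted(downloads.glob("*.pdf")):  # e.g. scanned receipts
        target = dest / timestamped_name(f.name, when)
        moves.append(f"{f.name} -> {target.name}")
        if not dry_run:
            f.rename(target)
    return moves

def pull_cmd(host: str, remote: str, local: str) -> list[str]:
    # --ignore-existing: never overwrite files this Mac already has
    return ["rsync", "-av", "--ignore-existing", f"{host}:{remote}/", local]
```

The rsync pull runs after ingestion, so new local files and new remote files can coexist without either side clobbering the other.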
All changes to my sync folder need to be approved. My “foo” tool shows the proposed changes, represented as a delta on the SHA1 sums file. Any parts of the delta that are already included in the Linux machine’s sums file are marked as “expected”, since these reflect changes pushed by a different Mac. If the delta is accepted, then foo updates the local sums file and pushes changes to the Linux machine using rsync.
There are a lot of extra features baked into foo. For example, it identifies duplicate files, in case I accidentally save something under two different names. In addition to SHA1 sums, foo also stores file size and modification times, which it uses as a performance optimization to avoid reading unchanged files. Also, foo is generally a very fast multi-threaded file checksum program, which lets me quickly check files for data corruption. Did I mention it also supports GnuPG signing and verification of sum files? Anyway, let’s move on.
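The core checksum-and-delta idea can be sketched like this (the data shapes and names are assumptions; foo’s real sums-file format, mtime/size optimization, and GnuPG handling aren’t shown):

```python
import hashlib
import pathlib
from concurrent.futures import ThreadPoolExecutor

def sha1_of(path: pathlib.Path) -> str:
    """Stream a file through SHA1 in 1 MiB chunks."""
    h = hashlib.sha1()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def checksum_tree(root: pathlib.Path) -> dict[str, str]:
    """Hash every file under root, in parallel."""
    files = [p for p in root.rglob("*") if p.is_file()]
    with ThreadPoolExecutor() as pool:
        sums = pool.map(sha1_of, files)
    return {str(p.relative_to(root)): s for p, s in zip(files, sums)}

def delta(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Describe the proposed change to the sums file."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),   # should stay empty!
        "changed": sorted(k for k in old.keys() & new.keys()
                          if old[k] != new[k]),
    }
```

With an append-only folder, the removed and changed buckets should stay empty, which is exactly what makes unexpected modifications easy to spot at approval time.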
The sums file is the only file in the sync folder that’s ever modified in place, so it’s easy to tell at a glance when unexpected modifications or deletions occur. Additionally, Ef uses an append-only data model, which makes it easy to recover old versions of files.
I also added an offline mode to the file syncing process, which skips the parts that require communication with the Linux machine. I run the whole process every night (when there’s new data to add) before bed, so the script is aptly named bedtime. With the addition of my new Mac, I run bedtime on my laptop when I turn it on, to get it caught up with the latest changes. If there’s new software to be installed, I’ll sync down a fresh build of mini and run that. And maybe I’ll give my git repositories a pull if I’m working on code. Altogether, these pieces make up a pretty robust syncing system for my two Macs that meets my standard for data integrity. How does it compare with your own methods? Let me know in the comments below.