Building Complex Things

03 Mar 2025

I’m always fascinated when people build complex things, not so much by the final artifact but by the journey they travelled to get to the end result. Projects are very rarely plotted with a straight line from problem statement to final solution, but when all you see is the final product it’s easy to discount the work that went into its creation, along with all the interesting choices and solutions to problems you’ll never know existed.

Here’s a retelling of a journey I went on to build a macOS virtualised CI solution. This is not a how-to guide, but in theory if you follow along you could build out your own working solution.

It Began

Our CI set up at work is a few Mac Studios, each with two CI runners installed. The machines themselves had all been configured with a well crafted script that my colleague Sam made, which installed all the required tooling and got the environment into a good shape. The problem that we kept facing is that on first set up the machines were in a known good state but after that it was the wild west. Anyone could remote onto a machine and run anything they liked. Whilst no one would ever do anything malicious, over time the machines drift or just generally become less healthy and builds become less repeatable.

I’d been thinking for a while about virtualisation and repeatable builds. All of our backend is set up around Docker, which gives you this, but wouldn’t it be nice to have it for our builds that require macOS?

When Apple brought out the M1 chips they made virtualisation easier and they even had sample code for creating virtual machines. Although this was cool I didn’t get much further than having a play around as it felt like a lot more work would be required for me to use the technology.

I knew that people had gotten virtualisation to work to the point where they were offering cloud based CI services and after a bit of research I stumbled upon tart, which I bookmarked and then didn’t look at for months. It wasn’t until I saw some Tweets/Toots/Skeets (whatever the bloody platform was at the time) from Simon B. Støvring talking about his CI work that I started investigating properly.

The first virtual machine

Despite my hesitation on getting started it actually was quite simple to create a virtual machine and begin to mess around (I kicked myself for putting off trying sooner).

After installing tart from homebrew

brew install cirruslabs/cli/tart

It’s then a case of providing the url for an IPSW to a tart create invocation, that will then download the IPSW and create a virtual machine.

tart create example \
  --disk-size=120 \
  --from-ipsw=https://updates.cdn-apple.com/2024FallFCS/fullrestores/072-30094/44BD016F-6EE3-4EE5-8890-6F9AA008C537/UniversalMac_15.1.1_24B91_Restore.ipsw

The IPSW link above will no doubt go out of date pretty quickly but you can find the download urls at https://ipsw.me. I know the website looks a bit scammy and has adverts everywhere but it’s providing links to Apple domains so ¯_(ツ)_/¯.

After running the above (which will take a small eternity as the IPSW is a big download, don’t worry the download is cached for future invocations) you end up with… your terminal prompt back. To actually run the machine you just created you need to run

tart run example

example is the name I passed to the create command - if you create a machine with a different name make sure to run that instead.

This will boot the machine and allow you to start configuring the OS the same as if you fired up new macOS hardware for the first time. This was pretty exciting and brought back memories of the Dave Verwer course where I first learnt iOS and specifically the cool feeling from the first time deploying code I’d written onto a physical device.

At this point I was still a long way from the end goal but I’d made a start so the momentum kept me moving. The first thing I wanted to prove was could I actually get projects to build inside the virtual machine and was the performance alright. I think I discovered that I could get our project to build but for some reason the tests just would not run. When I hit this road block I moved things to the back burner again.

Asking for help

Weeks had passed and I was still keen to make this work. From following Simon B. Støvring on Mastodon I’d seen more of his posts and discovered all the documentation that had been written for some tooling he’d made called Tartelet (see tartelet docs). Sure that I could make this work I tried again but kept getting the same result. Eventually I just reached out on Slack and got helpful responses from Simon and someone with the handle biscuit, who suggested two things to try:

prewarming simulators using something like https://github.com/biscuitehh/yeetd/blob/main/Resources/prewarm_simulators.sh
bumping up the available memory on the virtual machines

I can’t remember which one of the above did it, but I had now proved to myself that I could indeed run our project inside virtual machines, tests included, and the performance wasn’t noticeably worse than our existing set up.

Repeatable Configuration

The next thing I wanted to achieve was creating machines repeatably - for this I turned to the Tartelet docs mentioned before. The Tartelet docs talk you through sensible configuration to use for a CI runner but the configuration is done using the GUI. The problem I had is that I’m the type of person who locks my front door, walks to my car then turns around to check the front door is locked. This means that I just didn’t feel comfortable having to manually configure machines in case I messed a step up or mistyped something.

After a bit of research I found that the company that created tart (remember they provide a CI service) also has a repo that contains their configuration for building machines here. The first takeaway is that they are using Packer to provision machines. In this case Packer is using a tart plugin, which is using the tart tool under the hood. So after installing Packer with mise and then the tart plugin I was set to explore building machines from scratch using just code.

mise install packer
packer plugins install github.com/cirruslabs/tart

Using the cirrus labs templates as a starting point I ended up with something like this to build a Sequoia machine

packer {
  required_plugins {
    tart = {
      version = ">= 1.12.0"
      source = "github.com/cirruslabs/tart"
    }
  }
}

source "tart-cli" "tart" {
  from_ipsw = "https://updates.cdn-apple.com/2024FallFCS/fullrestores/072-30094/44BD016F-6EE3-4EE5-8890-6F9AA008C537/UniversalMac_15.1.1_24B91_Restore.ipsw"
  vm_name = "base-sequoia"
  cpu_count = 8
  memory_gb = 8
  disk_size_gb = 100
  headless = false
  ssh_password = "runner"
  ssh_username = "runner"
  ssh_timeout = "120s"
  boot_command = [
    // hello, hola, bonjour, etc.
    // > Tap get started
    "<wait60s><spacebar>",

    // Language
    // > Typing english gets us to English UK
    "<wait10s>english<enter>",

    // Select Your Country or Region
    "<wait20s>united kingdom<leftShiftOn><tab><leftShiftOff><spacebar>",

    // Written and Spoken Languages
    // > Tap continue
    "<wait10s><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Accessibility
    // > Tap Not now
    "<wait10s><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Data & Privacy
    // > Tap continue
    "<wait10s><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Migration Assistant
    // > Tap Not now
    "<wait10s><tab><tab><tab><spacebar>",

    // Sign In with Your Apple ID
    // > Tap Set up later
    "<wait10s><leftShiftOn><tab><leftShiftOff><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Are you sure you want to skip signing in with an Apple ID?
    // > Tap Skip
    "<wait10s><tab><spacebar>",

    // Terms and Conditions
    // > Tap Agree
    "<wait10s><leftShiftOn><tab><leftShiftOff><spacebar>",

    // I have read and agree to the macOS Software License Agreement
    // > Tap Agree
    "<wait10s><tab><spacebar>",

    // Create a Computer Account
    // > Set username, account name, password all to runner
    "<wait10s>runner<tab><tab>runner<tab>runner<tab><tab><tab><spacebar>",

    // Enable Location Services
    // > Deselect and tap continue
    "<wait30s><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Are you sure you don't want to use Location Services?
    // > Tap continue
    "<wait10s><tab><spacebar>",

    // Select Your Time Zone
    // > Type UTC and tap continue
    "<wait10s><tab><tab>UTC<enter><leftShiftOn><tab><tab><leftShiftOff><spacebar>",

    // Analytics
    // > Tap continue
    "<wait10s><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Screen Time
    // > Tap Set up later
    "<wait10s><tab><spacebar>",

    // Siri
    // > Deselect and tap continue
    "<wait10s><tab><spacebar><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Choose your look
    // > Select light mode
    "<wait10s><leftShiftOn><tab><leftShiftOff><spacebar>",

    // Welcome to Mac
    "<spacebar>",

    // Open terminal
    "<wait10s><leftAltOn>n<leftAltOff><wait3s><leftAltOn><leftShiftOn>g<leftShiftOff><leftAltOff>/Applications/Utilities/Terminal.app<enter><wait3s><leftAltOn>o<leftAltOff><wait3s>defaults write NSGlobalDomain AppleKeyboardUIMode -int 3<enter><wait5s><leftAltOn>q<leftAltOff>",

    // Open system settings
    "<wait10s><leftAltOn>n<leftAltOff><wait3s><leftAltOn><leftShiftOn>g<leftShiftOff><leftAltOff>/Applications/System Settings.app<enter><wait3s><leftAltOn>o<leftAltOff>",

    // Search for 'sharing'
    "<wait10s><leftAltOn>f<leftAltOff>sharing<enter>",

    // Tab to 'Screen Sharing' and enable it
    "<wait10s><tab><tab><tab><tab><tab><spacebar>",

    // Navigate to 'Remote Login' and enable it
    "<wait10s><tab><tab><tab><tab><tab><tab><tab><tab><tab><tab><tab><tab><spacebar>",

    // Close settings
    "<wait5s><leftAltOn>q<leftAltOff>"
  ]

  // A workaround for Virtualization.Framework's installation process not fully finishing in a timely manner
  create_grace_time = "30s"
  run_extra_args = []
}

build {
  sources = ["source.tart-cli.tart"]

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mEnable passwordless sudo\\033[0m'",
      "echo '└── echo runner | sudo -S sh -c \"mkdir -p /etc/sudoers.d/; echo \\047runner ALL=(ALL) NOPASSWD: ALL\\047 | EDITOR=tee visudo /etc/sudoers.d/runner-nopasswd\"'",
      "echo runner | sudo -S sh -c \"mkdir -p /etc/sudoers.d/; echo 'runner ALL=(ALL) NOPASSWD: ALL' | EDITOR=tee visudo /etc/sudoers.d/runner-nopasswd\""
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mEnable autologin\\033[0m'",
      "echo '├── printf \\047\\x0f\\xfc\\x3c\\x4d\\xb7\\xce\\xdd\\x8d\\x65\\xd0\\x6c\\x2c\\047 > /tmp/kcpassword'",
      "printf '\\x0f\\xfc\\x3c\\x4d\\xb7\\xce\\xdd\\x8d\\x65\\xd0\\x6c\\x2c' > /tmp/kcpassword",
      "echo '├── sudo mv /tmp/kcpassword /etc/kcpassword'",
      "sudo mv /tmp/kcpassword /etc/kcpassword",
      "echo '├── sudo chmod 600 /etc/kcpassword'",
      "sudo chmod 600 /etc/kcpassword",
      "echo '├── sudo chown root:wheel /etc/kcpassword'",
      "sudo chown root:wheel /etc/kcpassword",
      "echo '└── sudo defaults write /Library/Preferences/com.apple.loginwindow autoLoginUser runner'",
      "sudo defaults write /Library/Preferences/com.apple.loginwindow autoLoginUser runner"
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mDisable screensaver\\033[0m'",
      "echo '├── sudo defaults write /Library/Preferences/com.apple.screensaver loginWindowIdleTime 0'",
      "sudo defaults write /Library/Preferences/com.apple.screensaver loginWindowIdleTime 0",
      "echo '└── defaults -currentHost write com.apple.screensaver idleTime 0'",
      "defaults -currentHost write com.apple.screensaver idleTime 0"
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mDisable sleeping\\033[0m'",
      "echo '├── sudo systemsetup -setdisplaysleep Off 2> /dev/null'",
      "sudo systemsetup -setdisplaysleep Off 2> /dev/null",
      "echo '├── sudo systemsetup -setsleep Off 2> /dev/null'",
      "sudo systemsetup -setsleep Off 2> /dev/null",
      "echo '└── sudo systemsetup -setcomputersleep Off 2> /dev/null'",
      "sudo systemsetup -setcomputersleep Off 2> /dev/null"
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mDisable screen lock\\033[0m'",
      "echo '└── sysadminctl -screenLock off -password runner'",
      "sysadminctl -screenLock off -password runner"
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mDisable spotlight indexing\\033[0m'",
      "echo '└── sudo mdutil -a -i off'",
      "sudo mdutil -a -i off"
    ]
  }
}

If this is stored in a file called sequoia.pkr.hcl and you run packer build sequoia.pkr.hcl and then wait, you’ll end up with a new machine in tart called base-sequoia that has various basic things configured like passwordless sudo, automatic login, the screensaver turned off etc. Arriving at the above configuration took a short eternity, with a lot of trial and error tweaking things and rerunning.
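The odd-looking printf in the autologin step is worth a second look. As far as I understand it, /etc/kcpassword holds the auto-login password XORed against a fixed, widely documented key, so it is obfuscation rather than encryption. The sketch below is not part of the build: it assumes that key (it is not stated anywhere in the config above) and uses a hypothetical decode_kcpassword helper to reverse the bytes from the provisioner and show what they hide.

```shell
# Bytes written to /tmp/kcpassword by the provisioner above.
encoded='0f fc 3c 4d b7 ce dd 8d 65 d0 6c 2c'
# Assumption: the commonly documented kcpassword XOR key.
key='7d 89 52 23 d2 bc dd ea a3 b9 1f'

# Hypothetical helper: XOR each encoded byte with the matching key byte
# and stop at the first zero byte, which terminates the password.
decode_kcpassword() {
  encoded_bytes=$1
  key_bytes=$2
  remaining_key=$key_bytes
  decoded=''
  for byte in $encoded_bytes; do
    # Take the next key byte, wrapping round when the key runs out.
    [ -n "$remaining_key" ] || remaining_key=$key_bytes
    key_byte=${remaining_key%% *}
    case $remaining_key in
      *' '*) remaining_key=${remaining_key#* } ;;
      *) remaining_key='' ;;
    esac
    value=$(( 0x$byte ^ 0x$key_byte ))
    [ "$value" -ne 0 ] || break
    # Turn the numeric value back into a character via an octal escape.
    decoded="$decoded$(printf "\\$(printf '%03o' "$value")")"
  done
  printf '%s\n' "$decoded"
}

decode_kcpassword "$encoded" "$key"
```

Running it prints runner, which matches the ssh_password in the source block. The bytes after the zero terminator appear to be padding.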

Some notes on the hcl file above:

hcl is a configuration language that has all kinds of features, which I did start using, but in the spirit of not wanting future maintainers of this tooling to have to learn yet another thing I opted to wrap it entirely. The above is generated by calling a thin Kotlin dsl I wrote, which has a few advantages in my mind:

Other team mates can more easily help maintain things by calling the Kotlin dsl and not having to worry about learning hcl
With a dsl I can make things like <leftShiftOn><tab><leftShiftOff> safer by managing the on/off state
IDEs are going to be miles better at supporting Kotlin as opposed to a custom markup language
I’m in control of the dsl output, which allowed me to do pretty printing of the commands
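To make the second bullet concrete, here is a minimal sketch of the idea in shell rather than Kotlin (the real dsl isn’t shown in this post). The helper names with_shift and repeat_key are hypothetical; they only build the boot_command strings, so a modifier can never be left held down and long runs of tabs don’t have to be counted by eye.

```shell
# Hypothetical helpers, not part of Packer or tart.

# Wrap a key sequence so shift is always released afterwards.
with_shift() {
  printf '<leftShiftOn>%s<leftShiftOff>' "$1"
}

# Repeat a key N times, e.g. repeat_key 3 '<tab>'.
repeat_key() {
  count=$1
  key=$2
  out=''
  while [ "$count" -gt 0 ]; do
    out="$out$key"
    count=$((count - 1))
  done
  printf '%s' "$out"
}

# "Tap continue" on most setup screens: shift-tab back to the button, then press it.
printf '<wait10s>%s<spacebar>\n' "$(with_shift '<tab>')"
# Tab to 'Screen Sharing': five tabs, then press.
printf '<wait10s>%s<spacebar>\n' "$(repeat_key 5 '<tab>')"
```

The two printf calls reproduce the "Tap continue" and "Screen Sharing" lines from the boot_command above.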

That last bullet deserves some expansion. By default, invoking shell commands in packer won’t actually tell you what is being invoked. In the output all you see is lines like Provisioning with shell script: /var/folders/wl/92q0hw051vnbwncp7lxsp3m80000gq/T/packer-shell2461613400. When you are debugging configuration or even just want to see the progress it’s super helpful to be able to see where you are up to, especially in case of failure. With this in mind I made it so in the dsl you’d call a function like this

script("Disable sleeping") {
    """
    sudo systemsetup -setdisplaysleep Off 2> /dev/null
    sudo systemsetup -setsleep Off 2> /dev/null
    sudo systemsetup -setcomputersleep Off 2> /dev/null
    """.trimIndent()
}

This would generate this hcl configuration

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mDisable sleeping\\033[0m'",
      "echo '├── sudo systemsetup -setdisplaysleep Off 2> /dev/null'",
      "sudo systemsetup -setdisplaysleep Off 2> /dev/null",
      "echo '├── sudo systemsetup -setsleep Off 2> /dev/null'",
      "sudo systemsetup -setsleep Off 2> /dev/null",
      "echo '└── sudo systemsetup -setcomputersleep Off 2> /dev/null'",
      "sudo systemsetup -setcomputersleep Off 2> /dev/null"
    ]
  }

At run time you’d actually see this

==> tart-cli.tart: Provisioning with shell script: /var/folders/wl/92q0hw051vnbwncp7lxsp3m80000gq/T/packer-shell2461613400
    tart-cli.tart: 🟢 Disable sleeping
    tart-cli.tart: ├── sudo systemsetup -setdisplaysleep Off 2> /dev/null
    tart-cli.tart: warning: this combination of display sleep and system sleep may prevent system sleep.
    tart-cli.tart: setdisplaysleep: Never
    tart-cli.tart: ├── sudo systemsetup -setsleep Off 2> /dev/null
    tart-cli.tart: setsleep: Never (computer, display, hard disk)
    tart-cli.tart: └── sudo systemsetup -setcomputersleep Off 2> /dev/null
    tart-cli.tart: setcomputersleep: Never

It might not be pretty but I was borrowing the lines (├──, └──) from the tree command to give a bit of structure. The high level step is 🟢 Disable sleeping, after that come the commands that make up the step, each prefixed with ├── or └──, and below each command you get its std{out,err}.
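The same announce-then-run idea can be sketched directly in shell. This is not the Kotlin dsl, which generates hcl rather than running anything itself; run_step is a hypothetical helper, and the bold escape codes are left out to keep it short.

```shell
# Hypothetical helper: print a step title, then echo each command with a
# tree-style prefix just before running it, so the log shows progress.
run_step() {
  title=$1
  shift
  printf '🟢 %s\n' "$title"
  while [ "$#" -gt 0 ]; do
    # The last command gets the closing branch, the rest get a tee.
    if [ "$#" -eq 1 ]; then prefix='└──'; else prefix='├──'; fi
    printf '%s %s\n' "$prefix" "$1"
    # Stop at the first failing command so the log ends where things broke.
    eval "$1" || return 1
    shift
  done
}

run_step 'Say hello' 'echo one' 'echo two'
```

Each command’s own output lands directly under the line that announced it, which is the property that makes a failed build easy to read.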

Marvel at what the above achieved

At this point I kept running tart list and tart run base-sequoia just to marvel at the fact I had indeed got a machine working and could wield the power of starting and stopping it on command.

More boring configuration

Having a machine with no software installed is pretty boring so the next step was spending ages figuring out the incantations to get things like Xcode and Ruby (for fastlane) installed. I was doing this like a cave man by starting the machine, opening the terminal and manually typing commands (as copy/paste doesn’t work across machines) until a colleague gave a disapproving look and said “why don’t you just ssh into the machine?”. Things went faster after that.

The basic config for getting Xcode installed looked something like

packer {
  required_plugins {
    tart = {
      version = ">= 1.12.0"
      source = "github.com/cirruslabs/tart"
    }
  }
}

source "tart-cli" "tart" {
  vm_base_name = "base-sequoia"
  vm_name = "xcode"
  cpu_count = 8
  memory_gb = 8
  disk_size_gb = 100
  headless = true
  ssh_password = "runner"
  ssh_username = "runner"
  ssh_timeout = "120s"
  run_extra_args = []
}

build {
  sources = ["source.tart-cli.tart"]

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mCreate directory for artifacts we want to install\\033[0m'",
      "echo '└── mkdir -p /Users/runner/Downloads/packer'",
      "mkdir -p /Users/runner/Downloads/packer"
    ]
  }

  provisioner "file" {
    sources = [
      pathexpand("~/XcodesCache/Command_Line_Tools_for_Xcode_16.1.dmg"),
      pathexpand("~/XcodesCache/Xcode_16.1.xip"),
      pathexpand("~/XcodesCache/iOS_18.1_Simulator_Runtime.dmg")
    ]
    destination = "/Users/runner/Downloads/packer/"
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mCreate ~/.zprofile\\033[0m'",
      "echo '└── echo \"export LANG=en_US.UTF-8\" >> ~/.zprofile'",
      "echo \"export LANG=en_US.UTF-8\" >> ~/.zprofile"
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mInstall command line tools\\033[0m'",
      "echo '├── hdiutil attach \"/Users/runner/Downloads/packer/Command_Line_Tools_for_Xcode_16.1.dmg\"'",
      "hdiutil attach \"/Users/runner/Downloads/packer/Command_Line_Tools_for_Xcode_16.1.dmg\"",
      "echo '├── sudo installer -pkg \"/Volumes/Command Line Developer Tools/Command Line Tools.pkg\" -target \"/Volumes/Macintosh HD\"'",
      "sudo installer -pkg \"/Volumes/Command Line Developer Tools/Command Line Tools.pkg\" -target \"/Volumes/Macintosh HD\"",
      "echo '└── hdiutil detach \"/Volumes/Command Line Developer Tools\"'",
      "hdiutil detach \"/Volumes/Command Line Developer Tools\""
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mInstall homebrew\\033[0m'",
      "echo '├── /bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"'",
      "/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"",
      "echo '├── eval \"$(/opt/homebrew/bin/brew shellenv)\"'",
      "eval \"$(/opt/homebrew/bin/brew shellenv)\"",
      "echo '├── echo \\047eval \"$(/opt/homebrew/bin/brew shellenv)\"\\047 >> ~/.zprofile'",
      "echo 'eval \"$(/opt/homebrew/bin/brew shellenv)\"' >> ~/.zprofile",
      "echo '├── echo \\047export HOMEBREW_NO_AUTO_UPDATE=1\\047 >> ~/.zprofile'",
      "echo 'export HOMEBREW_NO_AUTO_UPDATE=1' >> ~/.zprofile",
      "echo '└── echo \\047export HOMEBREW_NO_INSTALL_CLEANUP=1\\047 >> ~/.zprofile'",
      "echo 'export HOMEBREW_NO_INSTALL_CLEANUP=1' >> ~/.zprofile"
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mInstall mise\\033[0m'",
      "echo '├── curl https://mise.run | sh'",
      "curl https://mise.run | sh",
      "echo '├── export PATH=\"$HOME/.local/bin:$PATH\"'",
      "export PATH=\"$HOME/.local/bin:$PATH\"",
      "echo '├── echo \\047eval \"$(~/.local/bin/mise activate zsh)\"\\047 >> ~/.zprofile'",
      "echo 'eval \"$(~/.local/bin/mise activate zsh)\"' >> ~/.zprofile",
      "echo '├── mkdir -p ~/.config || true'",
      "mkdir -p ~/.config || true",
      "echo '├── echo \\047[alias]\\047 >> ~/.config/mise.toml'",
      "echo '[alias]' >> ~/.config/mise.toml",
      "echo '└── echo \"xcodes = \\047asdf:younke/asdf-xcodes\\047\" >> ~/.config/mise.toml'",
      "echo \"xcodes = 'asdf:younke/asdf-xcodes'\" >> ~/.config/mise.toml"
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mInstall Xcode\\033[0m'",
      "echo '├── \"$HOME/.local/bin/mise\" trust --yes'",
      "\"$HOME/.local/bin/mise\" trust --yes",
      "echo '├── \"$HOME/.local/bin/mise\" use xcodes@\"1.6.0\" --yes'",
      "\"$HOME/.local/bin/mise\" use xcodes@\"1.6.0\" --yes",
      "echo '├── eval \"$(\"$HOME/.local/bin/mise\" activate --shims)\"'",
      "eval \"$(\"$HOME/.local/bin/mise\" activate --shims)\"",
      "echo '├── xcodes install \"16.1\" --path \"/Users/runner/Downloads/packer/Xcode_16.1.xip\" --experimental-unxip --empty-trash'",
      "xcodes install \"16.1\" --path \"/Users/runner/Downloads/packer/Xcode_16.1.xip\" --experimental-unxip --empty-trash",
      "echo '├── sudo xcodes select \"16.1\"'",
      "sudo xcodes select \"16.1\"",
      "echo '├── xcodebuild -runFirstLaunch'",
      "xcodebuild -runFirstLaunch",
      "echo '└── xcrun simctl runtime add \"/Users/runner/Downloads/packer/iOS_18.1_Simulator_Runtime.dmg\"'",
      "xcrun simctl runtime add \"/Users/runner/Downloads/packer/iOS_18.1_Simulator_Runtime.dmg\""
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mInstall Ruby\\033[0m'",
      "echo '├── eval \"$(/opt/homebrew/bin/brew shellenv)\"'",
      "eval \"$(/opt/homebrew/bin/brew shellenv)\"",
      "echo '├── brew install readline libyaml gmp'",
      "brew install readline libyaml gmp",
      "echo '└── \"$HOME/.local/bin/mise\" use ruby@\"3.3.0\"'",
      "\"$HOME/.local/bin/mise\" use ruby@\"3.3.0\""
    ]
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mDelete install artifacts\\033[0m'",
      "echo '└── rm -rf /Users/runner/Downloads/packer'",
      "rm -rf /Users/runner/Downloads/packer"
    ]
  }
}

There’s a few things to note with the above:

Duplicated hardcoded version numbers - in reality these are interpolated in and not manually maintained.
Assets related to installing Xcode are copied from the host machine to avoid needing to auth with Apple to download things.
Homebrew is installed after Xcode command line tools - again to avoid being asked to auth with Apple to download the installer.

Marvel some more

Again it was time to reflect on how far we’d come and how it wasn’t easy. Figuring out the exact shell commands to install all the things you need was a combination of Google, ChatGPT, talking to colleagues and just trying loads of things.

Between the two packer files above I learnt a load of new things

Various admin commands to disable screensavers and screen locks
What on earth kcpassword was
Mounting/unmounting volumes
More in depth knowledge of installing/working with mise
Ruby is still an absolute pain to install right

To even get this far though there’s a lot of foundational knowledge that I used

Being comfortable with the command line
Having the ability to navigate directories/files from the command line
Understanding environment variables in shell environments
Understanding file permissions
Knowing which commands look safe to run that were found on random internet searches
Knowing when to use sudo and not just using it blindly causing issues later on

What about CI?

Now we had machines running the next step was actually hooking up to the CI machinery to run some workloads. There were many misfires at this stage with trying different approaches. The most promising was to add more provisioning steps to install the CI runner inside the virtual machine and then orchestrate having 2 virtual machines booted that self registered with the CI server to run workloads. Once the virtual machine had run a workload it would then need to destroy itself and another machine be spun up in its place.

Although the above worked and seemed reasonable it was tricky and meant that you always had to have virtual machines fired up even when not in use. It also felt a bit wrong having the virtual machines needing to know about CI when that information could be hidden from them - for instance I often just spin up the virtual machines to try things out in a clean environment but I don’t want to worry about it adding my personal machine to the CI work pool. Discussing this with a colleague we came to the conclusion that actually we could have a couple of CI runners on the bare metal and when they receive a workload they would clone a virtual machine and boot it, then copy the source files into the virtual machine before running the job. If you squint hard enough this feels a bit like docker where you’d have various layers being created to configure the environment then you’d copy your source code in and operate on that.
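Driven by hand with tart, that clone, boot, copy and run flow could look roughly like the plan below. It is a sketch, not what we run: the function job_plan, the VM name job-42, the workspace directory and ./build.sh are all made up for illustration, and it only prints the commands rather than executing them.

```shell
# Print the commands a bare metal runner might issue for one job.
# Notes on the printed plan:
#  - tart run blocks until the VM shuts down, hence the trailing &
#  - tart ip only returns an address once the guest has booted, so a real
#    script would need to wait or retry at that point
#  - ssh/scp would also need credentials (key or password) for the runner user
job_plan() {
  base=$1    # image to clone, e.g. the xcode machine built earlier
  vm=$2      # throwaway clone used for this job only
  entry=$3   # command to run inside the VM
  cat <<EOF
tart clone $base $vm
tart run --no-graphics $vm &
ip=\$(tart ip $vm)
scp -r . runner@\$ip:workspace
ssh runner@\$ip 'cd workspace && $entry'
tart stop $vm
tart delete $vm
EOF
}

job_plan xcode job-42 ./build.sh
```

Even written out as a plan you can see the fiddly parts: waiting for the machine to boot, handling credentials and cleaning up when a step fails.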

We messed around for a while trying to do this with combinations of tart commands directly and then one of us (can’t remember who) had the spark that we should just use Packer again for this as it has already massively simplified the task of copying files to/from the virtual machine as well as running shell tasks.

With this in mind we need a new packer file:

packer {
  required_plugins {
    tart = {
      version = ">= 1.12.0"
      source = "github.com/cirruslabs/tart"
    }
  }
}

source "tart-cli" "tart" {
  vm_base_name = "xcode"
  vm_name = "runner"
  cpu_count = 8
  memory_gb = 8
  disk_size_gb = 100
  headless = true
  ssh_password = "runner"
  ssh_username = "runner"
  ssh_timeout = "120s"
}

build {
  sources = ["source.tart-cli.tart"]

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mCreate workspace directory\\033[0m'",
      "echo '└── mkdir -p $HOME/workspace'",
      "mkdir -p $HOME/workspace"
    ]
  }

  provisioner "file" {
    source = "./"
    destination = "$HOME/workspace"
    direction = "upload"
  }

  provisioner "shell" {
    inline = [
      "echo '🟢 \\033[1mRun the entry point\\033[0m'",
      "echo '├── cd $HOME/workspace'",
      "cd $HOME/workspace",
      "echo '└── echo Hello, World!'",
      "echo 'Hello, World!'"
    ]
  }
}

This packer file will upload the current directory into the vm, then run echo 'Hello, World!'. Although this is pretty pointless it does demonstrate that everything works and we have the following:

A clean machine is run by cloning the xcode machine
We can copy our source code into the virtual machine
We can execute arbitrary code as an entry point

Most CI solutions will allow you to create a template or reuse configuration. We use GoCD at work, which is “interesting”, but we were able to find a way to have a template that invokes something like the packer file above while allowing each pipeline to provide the entry point to invoke inside the virtual machine.

Entry points and configuration

When we started I think I was tunnel visioned and made it so that in the CI template if you provided an entry point of publish it would follow a convention of looking for an executable file in your repo called .gocd/publish. Although this worked it wasn’t super discoverable and if you wanted to share configuration between pipelines you were pretty much forced to start sourcing bash files from bash files 🤮.

The final straw for this set up was when we needed to pass environment variables to the workloads. My colleague took the fun task of changing this set up to instead read a configuration file written in json that describes the entry point, environment variables and any other config you might want to pass to a pipeline.

Getting data out of the virtual machine

When CI fails it’s helpful to know why and sometimes those details don’t appear in the logs but in various artifacts scattered throughout build folders. Packer gives a mechanism for this by allowing you to register an action on failure. In these cases we opt to copy files from a known directory to the host so they can be uploaded to the CI server before the virtual machine is deleted. Having the CI always upload files it finds in a directory is handy because it means you can put reports or artifacts in this special folder on the VM and know they will be uploaded to the CI regardless of whether there was a failure or not.

Android

Although we had plans to have Android builds run in docker to save the macOS runners, we still needed to support Android because we have some Kotlin Multiplatform projects that we just want to build in one place. I’m not going to pretend to know what wizardry went on here but a big problem with the Virtualization framework is that it doesn’t support nested virtualisation, which means no Android emulators inside the virtual machine. To get around this my colleague created an ssh tunnel between the host and the VM so that the emulators run on the bare metal and adb in the virtual machine connects to them via this tunnel. This works well.
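For the curious, here is a sketch of one way such a tunnel can be set up. It is not necessarily what my colleague built: it assumes the adb server is already running on the host on its default port 5037, and the helper adb_tunnel_command and the IP address are made up for illustration.

```shell
# Default port the adb server listens on.
ADB_PORT=5037

# Print the ssh command (to be run on the host) that exposes the host's adb
# server inside the VM.
#   -N  open the tunnel without running a remote command
#   -R  listen on the port inside the VM and forward back to the host
adb_tunnel_command() {
  vm_ip=$1
  printf 'ssh -N -R %s:localhost:%s runner@%s\n' "$ADB_PORT" "$ADB_PORT" "$vm_ip"
}

adb_tunnel_command 192.168.64.5
```

With a tunnel like that open, an adb client inside the VM that connects to localhost:5037 ends up talking to the server on the host, which is where the emulators actually live.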

Taking stock of things

The above represents the foundational building blocks that underpin the solution we built. It took quite a while to build out as it was mostly stewing as an idea whilst the pieces started drifting together slowly from different experiments. Not all effort is equal - there were some tasks where it felt like I’d made a massive leap towards the end goal and then others where the work was important but it felt like a hard slog.

Where we ended up after some more iterations was we have a Kotlin Multiplatform command line tool that controls all of the above. There’s a dsl that hides away the hcl files and instead you have a nice Kotlin interface. The tooling has the ability to build machines in “layers” where each subsequent machine depends on the machine above existing. This means that you only need to rebuild machines (which can be slow) when required. We also have a json file that each repo uses to configure various things like caching, environment variables and the entry points themselves.
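The layering idea is simple enough to sketch in a few lines of shell. This is not our Kotlin tool: missing_layers is a hypothetical helper, and it assumes the only rule is "build a layer if a machine with that name doesn’t exist yet", walking the layers from base to leaf.

```shell
# Ordered from base to leaf; names taken from the packer files above.
LAYERS='base-sequoia xcode'

# Print the layers that still need building, given the names of the VMs
# that already exist (in practice these could be parsed from tart list).
missing_layers() {
  existing=" $1 "
  for layer in $LAYERS; do
    case $existing in
      *" $layer "*) ;;                # already built, reuse it
      *) printf '%s\n' "$layer" ;;    # needs a packer build
    esac
  done
}

# With only the base image present, only the xcode layer needs building.
missing_layers 'base-sequoia'
```

Because the layers are walked in order, anything that gets built can rely on the machine it clones from already being there.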

Other use cases

At some point we used GitHub’s virtual runners and when something went wrong it was an absolute nightmare to debug without access to the machine after the run. With this set up I’m never more than a few commands away from being in the same environment as the one that produced the failure, so I can explore and try things out to get to the root issue.

Being able to build virtual machines easily in a repeatable way is empowering as it means if I want to jump on the beta release trains for macOS or Xcode versions I can do it without risking my primary machine.

I’ve also found it useful having a clean machine with no software installed to help quickly iterate on install scripts to provide to colleagues. It’s amazing how many assumptions you can make about other people’s systems having similar set ups to your own, which all fly out the window when you use a clean machine.

Conclusion

Working on this kicked my ass and I’d picked it up and put it down a few times but as it was a nagging idea I really wanted to see it through. Fortunately once it had some legs and I was pretty confident we could get it done I spent dedicated time on this with my colleague Jack. The end solution builds on the above adding several key things that are needed to productionise it. We’ve been using it for various macOS and Kotlin Multiplatform projects for ~6 months and all the iOS pipelines for ~2 months and despite some teething issues we’ve not really seen many issues.

In the above I’ve mentioned colleagues a few times and it’s worth stating that this project was probably only doable because of being fortunate to work with people who can solve problems, brainstorm and muck in across a wide range of technologies and tolerate listening to me talk nonsense and rewrite my stuff multiple times as I realised that the me who wrote code yesterday was an idiot.

Project Scripts

23 Feb 2025

TL;DR

Try creating a CLI executable in your project that exposes common project tasks that are written in the project’s core language. This allows better contribution and fewer single points of failure caused by pockets of knowledge in the team.

Scene setting

Over time projects accumulate helper scripts to perform various admin tasks. I’ve historically tried to avoid bash as much as possible for these scripts because the projects I work on often have teams of people unfamiliar with bash or its idiosyncrasies. With this in mind I’ve then gravitated towards Ruby because I’ve always loved the language and it’s a safer choice in my mind. Unfortunately I’ve been kidding myself because as much as I love Ruby it still has the same issue as bash with people not knowing it, and it’s also a right pain to make sure people’s environments are set up.

The next logical step is to just use the main project’s language for building up tasks. This is potentially easier said than done but I’ve seen success with doing it. As a mobile developer this means using Swift with SPM to build out tasks on the iOS side and Kotlin with Gradle on the backend/multiplatform parts.

The good

In taking this approach I’ve removed myself from being the single point of failure on maintaining stuff. This not only means that I don’t have to be on hand to debug things but also opens the door for easier contribution/reuse. For example with Swift being the language used to write an admin script, other people have contributed various tasks, with the obvious plus being that the whole team can much more easily adopt and understand what is being done without trying to understand cryptic personal scripts.

Using languages like Swift/Kotlin encourages me to write more reusable code than if I was just slinging bash around. For example I’d write a GitHub client that can be reused rather than being lazy and copy/pasta’ing curl invocations around with duplicated configuration.

You have the full power of available libraries like type safe serialisation with Codable or by pulling in something like kotlinx.serialization. I can’t even count how many times I’ve written dodgy JSON interpolation in scripts when really I should have just not been lazy and used the right tools for the job.

Debugging is a super power for these scripts. Even though I might end up cave man debugging (print), I love having the option to use a debugger and inspect all the things or try changing state on the fly to see what would happen.

Types, types, types… I love types and they are really handy for helping me write safe code.

The bad

Both Swift and Kotlin just aren’t that great as scripting languages even though I really want them to be. This may be a personal lack of competence but when I’m writing scripts I’m looking for super fast feedback, which means I’ll often start just curling things on the command line or opening TextMate, setting it to bash and hitting ⌘+R. With both of these I’m running code straight away with very little ceremony, which I simply can’t reliably do for Swift or Kotlin as both pretty much require that I use an IDE to help with types and missed keywords (try, await, suspend…). This may sound contradictory to Types, types, types... but at different points in the development process I value different things. Often when I am just trying things out I’m not very professional and just want to throw code around to see what works before I put on my big developer pants and do the job properly.

Another weakness is forking. In bash or Ruby I can just slap backticks around a command to have it run in a subshell and then collect the result. It’s just not that simple in Swift/Kotlin even when pulling in libraries, which I do.
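
For comparison, this is roughly the bash baseline being described. It is a minimal sketch and the commands are arbitrary stand-ins, not ones from the original project.

```shell
#!/bin/bash

# Backticks run the command in a subshell and capture its stdout.
kernel=`uname -s`

# $(...) is the equivalent modern spelling and nests more cleanly.
entries="$(ls / | wc -l | tr -d ' ')"

# Using the assignment as a condition also exposes the exit status.
if error_text="$(ls /path/that/does/not/exist 2>&1)"; then
  result="ok"
else
  result="failed"
fi

echo "kernel=$kernel entries=$entries result=$result"
```

Getting the same three things (captured output, captured errors and the exit status) out of a child process in Swift or Kotlin typically means setting up a process object and pipes by hand or leaning on a library.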

Approach

I’m not 100% sold on the exact naming/layout but as a reference this is what I set up. We have a shim at the root of the project called cli. This file’s job is essentially to cd into the project that has the tasks and call swift build, followed by running the built executable. It’s a little bit of redirection but it’s certainly easier than expecting people to remember the calling convention. My other thinking is that if we come to some standardisation where in our projects you just call ./cli to be presented with all the various admin tasks then it’s just one thing to learn.
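
As a rough sketch of what such a shim could look like: the tools/cli package location, the cli product name and the run_cli function are all assumptions made for illustration, not the real project layout.

```shell
#!/bin/bash
# ./cli: sketch of a root-level shim. All paths and names are assumptions.
set -eu

# Hypothetical layout: the Swift package with the tasks lives in tools/cli
# and builds an executable product that is also named "cli".
PACKAGE_DIR="${CLI_PACKAGE_DIR:-$(dirname "$0")/tools/cli}"
PRODUCT="${CLI_PRODUCT:-cli}"

run_cli() {
  cd "$PACKAGE_DIR"
  # Send build output to stderr so stdout only carries the tool's own output.
  swift build --product "$PRODUCT" >&2
  # Ask SwiftPM where the binary landed, then hand over all arguments.
  bin_path="$(swift build --show-bin-path)"
  "$bin_path/$PRODUCT" "$@"
}

# Run only when invoked as ./cli; sourcing the file just defines run_cli.
if [ "$(basename "$0")" = "cli" ]; then
  run_cli "$@"
fi
```

With something like this at the repository root, ./cli doctor would build the package if needed and then run the doctor subcommand. Note that because the shim changes directory first, any relative paths passed as arguments resolve against the package directory.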

With this in place we currently use swift-argument-parser to build a cli that has various subcommands. As an example, and for some inspiration, here are some of the top level tasks that we’ve built out:

OVERVIEW: A utility for working with the ios repo.

USAGE: cli <subcommand>

OPTIONS:
  -h, --help              Show help information.

SUBCOMMANDS:
  ci                      Commands the CI pipeline uses
  code-gen                Regenerate generated code for the app.
  collect-debug-info      Print information useful for getting help with debugging.
  doctor                  Help diagnose environment set up issues.
  firewall                Add/Remove firewall rules for simulator
  kmp-doctor              Update local repository to add KMP files into Xcode
  set-marketing-value     Create a branch with a commit that updates to the passed version number.
  sim                     Utilities for working with simulators.

On the Kotlin side we’ve been using clikt to perform the role of swift-argument-parser but the set up is very similar.

Misfires

I spent far too long trying to use cute tricks like #!/usr/bin/env kotlin with kts files to get the “scripting” feel with the language of choice. I personally found this a complete train wreck as I had to pull in loads of dependencies using @file:DependsOn and then very quickly hit the fact that I can’t write Kotlin without an IDE. For some reason IntelliJ was giving me no help with limited syntax highlighting and no suggestions. To actually get anything working I had to create a project, import dependencies in the normal way and then once I had code that worked and had all the right imports etc I could copy/pasta it over. At which point I sat scratching my head wondering why my brain hadn’t stopped me doing such a ridiculous thing - e.g. if I only committed the kts file I’d be committing the lossy version of my work that is hard to debug or work with.

Collecting debug information

04 Jul 2024

I work in a team with many colleagues where we are responsible for several code bases. Often if someone has an issue with running a project you end up either assuming you’ll have the same environment and forget to ask or spend time probing for details about the person’s system. I think this is an ideal case for putting a small script in your project that will collect information that will be generally useful for helping debug project level issues.

For example on an iOS project I might have a script like this as a starting point

bin/collect-debug-info

#!/bin/bash

cat << EOF
OS: $(sw_vers --productName) $(sw_vers --productVersion) ($(sw_vers --buildVersion))
Git: $(git rev-parse --abbrev-ref HEAD) ($(git rev-parse HEAD))
Xcode: $(xcode-select -p)

Simulators:
$(xcrun simctl list devices booted)

Mise $(mise --version):
$(mise list --current)
EOF

An example output might be:

OS: macOS 14.3.1 (23D60)
Git: main (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
Xcode: /Applications/Xcode-15.4.0.app/Contents/Developer

Simulators:
== Devices ==
-- iOS 16.4 --
-- iOS 17.0 --
-- iOS 17.0 --
-- iOS 17.2 --
-- iOS 17.4 --
-- iOS 17.5 --
    iPhone 11 (AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAAE) (Booted)

Mise 2024.7.0 macos-arm64 (e518900 2024-07-03):
jq              1.7.1    ~/src/ios/my-proj/.mise.toml latest
ruby            3.3.0    ~/src/ios/my-proj/.mise.toml 3.3.0
swiftformat     0.53.9   ~/src/ios/my-proj/.mise.toml 0.53.9
swiftlint       0.55.0   ~/src/ios/my-proj/.mise.toml 0.55.0
tuist           4.17.0   ~/src/ios/my-proj/.mise.toml 4.17.0
xcodes          1.4.1    ~/src/ios/my-proj/.mise.toml 1.4.1

Now when someone asks for help and I suspect there might be environment issues I can just ask for the output of bin/collect-debug-info and we’ll be up to speed debugging in no time. This is the kind of script you can build up over time and add all kinds of useful info as and when you decide it would be useful to collect.
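
One way the script might grow is to cope with tools that a colleague doesn’t have installed, so the report still completes. This is a minimal sketch; the probe helper and its output format are hypothetical and not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper for bin/collect-debug-info; "probe" is an illustrative name.

# Print "<label>: <output>" for a command, or a placeholder when the tool
# is not installed or exits with an error, so the report always completes.
probe() {
  label="$1"
  shift
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$label: (not installed)"
  elif output="$("$@" 2>/dev/null)"; then
    echo "$label: $output"
  else
    echo "$label: (failed: $*)"
  fi
}

probe "Git branch" git rev-parse --abbrev-ref HEAD
probe "Xcode" xcode-select -p
probe "Mise" mise --version
```

A missing tool then shows up as a line in the output rather than as an error in the middle of the report, which is often a useful clue in its own right.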


paul-samuels.com