Command line arguments with user defaults

You can use UserDefaults as a simple way to get the arguments passed to an app on launch without having to write any command line parsing. The basic capability is that you can pass an argument like -example some-string as a launch argument and this will be readable by defaults like this:

UserDefaults.standard.string(forKey: "example") //=> "some-string"

Supported types

UserDefaults supports a few types: Array, Dictionary, Boolean, Data, Date, Number and String. It is possible to inject data and have it understood as any of these types. The key is recognising that you need to use the same representation that plists use.

Array

Arrays are heterogeneous and can be represented like this

// -example <array><string>A</string><integer>1</integer></array>
UserDefaults.standard.array(forKey: "example") //=> Optional([A, 1])

Dictionary

Dictionaries are written as alternating key and value elements:

// -example <dict><key>A</key><integer>1</integer><key>B</key><integer>2</integer></dict>
UserDefaults.standard.dictionary(forKey: "example") //=> Optional(["B": 2, "A": 1])

Boolean

This can be represented by many variants. All of <true/>, 1 and true will count as true whereas <false/>, 0 and false will count as false.

// -example <true/>
UserDefaults.standard.bool(forKey: "example") //=> true

Data

// -example <data>SGVsbG8sIHdvcmxkIQ==</data>
UserDefaults.standard.data(forKey: "example")
    .flatMap { String(decoding: $0, as: UTF8.self) } //=> Optional("Hello, world!")
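The base64 payload does not have to be written by hand. Here is a minimal sketch, assuming a hypothetical helper named dataArgument (not part of UserDefaults or any library), that builds the pair of launch arguments from raw bytes:

```swift
import Foundation

/// Hypothetical helper: wraps raw bytes in the plist <data> tag as base64.
/// Returns the two launch arguments: the dash-prefixed key, then the value.
func dataArgument(key: String, data: Data) -> [String] {
    ["-\(key)", "<data>\(data.base64EncodedString())</data>"]
}

// Produces ["-example", "<data>SGVsbG8sIHdvcmxkIQ==</data>"]
let helloArguments = dataArgument(key: "example", data: Data("Hello, world!".utf8))
```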

Date

Date doesn’t have a convenience function but still returns an honest date when encoded correctly

// -example <date>2024-03-10T22:19:00Z</date>
UserDefaults.standard.object(forKey: "example") as? Date //=> Optional(2024-03-10 22:19:00 +0000)
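The date string in the example is an ISO 8601 timestamp in UTC. As a sketch, a hypothetical dateArgument helper (my own name, not an existing API) can produce it with ISO8601DateFormatter, whose default options emit this shape:

```swift
import Foundation

/// Hypothetical helper: formats a Date the way the <date> tag above is written.
/// ISO8601DateFormatter defaults to internet date-time in GMT, e.g. 2024-03-10T22:19:00Z.
func dateArgument(key: String, date: Date) -> [String] {
    let formatter = ISO8601DateFormatter()
    return ["-\(key)", "<date>\(formatter.string(from: date))</date>"]
}

// 1_710_109_140 seconds after the Unix epoch is 2024-03-10 22:19:00 UTC.
let dateArguments = dateArgument(key: "example", date: Date(timeIntervalSince1970: 1_710_109_140))
```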

Number

For numbers you can either leave the value untagged and hope for the best, or wrap it in real or integer tags:

// -example <real>1.23</real>
UserDefaults.standard.float(forKey: "example")   //=> 1.23
UserDefaults.standard.double(forKey: "example")  //=> 1.23
UserDefaults.standard.integer(forKey: "example") //=> 1

// -example <integer>1</integer>
UserDefaults.standard.float(forKey: "example")   //=> 1.0
UserDefaults.standard.double(forKey: "example")  //=> 1.0
UserDefaults.standard.integer(forKey: "example") //=> 1

Interestingly, if you don’t provide a tag then integer(forKey:) doesn’t truncate the value; it just returns 0:

// -example 1.23
UserDefaults.standard.integer(forKey: "example") //=> 0

String

Strings can simply be passed directly unless you want to do anything more complex in which case you’d want to wrap in <string></string> tags.


Piecing things together

In some cases you might want to pass more complex data. One such example I came across was wanting to inject a user profile that has multiple properties for UI testing. I could design the API so that the UI tests would pass multiple arguments, but that would require validation and error handling in the app. It would be handy if I could just pass a JSON blob and then decode it inside the app. This is possible with two additional steps:

  • Wrap the JSON blob in <string></string> tags
  • Escape the XML entities

Let’s imagine I have the following fictitious User type

struct User: Decodable {
    let name: String
    let token: String
    let isPaidMember: Bool
}

It might be handy in my UI tests to perform a login once at the beginning of all tests to get a token and then inject it into all the tests. I can also toggle the status of isPaidMember in each test. An appropriate argument that would work would look like this:

# {"name":"Paul","token":"some-token","isPaidMember":true}
-user <string>{&quot;name&quot;:&quot;Paul&quot;,&quot;token&quot;:&quot;some-token&quot;,&quot;isPaidMember&quot;:true}</string>

The corresponding code to parse this and fail silently in case of error would look like this

let user = UserDefaults.standard.string(forKey: "user")
    .flatMap { try? JSONDecoder().decode(User.self, from: Data($0.utf8)) }
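Escaping the JSON by hand is easy to get wrong, so the test side could build the argument instead. This is only a sketch: xmlEscaped and launchArguments are hypothetical helpers of my own, and this User is declared Codable (the app side above only needs Decodable) so the test can encode it.

```swift
import Foundation

// Assumption: the test target has its own encodable copy of the User type.
struct User: Codable {
    let name: String
    let token: String
    let isPaidMember: Bool
}

/// Escapes the five predefined XML entities. The ampersand must be replaced first,
/// otherwise the ampersands introduced by the other replacements would be escaped again.
func xmlEscaped(_ text: String) -> String {
    text
        .replacingOccurrences(of: "&", with: "&amp;")
        .replacingOccurrences(of: "<", with: "&lt;")
        .replacingOccurrences(of: ">", with: "&gt;")
        .replacingOccurrences(of: "\"", with: "&quot;")
        .replacingOccurrences(of: "'", with: "&apos;")
}

/// Encodes a value as JSON, escapes it and wraps it in <string></string> tags.
func launchArguments<T: Encodable>(key: String, value: T) throws -> [String] {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys] // stable key order, handy for assertions
    let json = String(decoding: try encoder.encode(value), as: UTF8.self)
    return ["-\(key)", "<string>\(xmlEscaped(json))</string>"]
}

// Assumed usage in a UI test:
// app.launchArguments += try launchArguments(key: "user", value: user)
```

Sorting the keys changes the order compared to the hand-written argument above, which makes no difference to JSONDecoder.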

Conclusion

UserDefaults can make launching your app and injecting data pretty simple. All your favourite primitive types are supported and once you get your head around using the plist markup you can start passing all kinds of data in just the right format for your needs.

Mobile UI testing with Maestro (Swift version)

In my last post I talked about building a DSL for Maestro using Kotlin. As an interesting exercise I asked some colleagues (Adam, Ellen and Saad) “could we build an equally nice DSL in Swift?”. Together we worked through a few ideas and this post expands on the findings we made.

TL;DR if you want to poke around and try out the result check out the repo here.


Working backwards from the target

Let’s look at the half way line of where we want to be.

try maestroRun("com.apple.MobileAddressBook") {
    LaunchApp("com.apple.MobileAddressBook")

    TapOn(.text("Add"))

    TapOn(.id("First name"))
    Input("John")

    if shouldEnterLastName {
        TapOn(.id("Last name"))
        Input("Appleseed")
    }

    TapOn(.text("Done"))
}

The above uses a @resultBuilder to collect an array of Commands. Thanks to the result builder we get the full expressivity of if/switch statements and loops.

The commands we collect are types that conform to Command, which is defined like this

public protocol Command {
    var data: Any { get }
}

The requirements from this protocol are pretty weak with a single getter of type Any but in this system this will represent a JSON encodable type String, Int, Object, Array, etc. Although the requirement is weak the assumption is that the framework will likely provide all implementations of Command. This could be strengthened by using an enum but I’ll leave that for a future tangent.
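To make the protocol more concrete, here is a sketch of how the three commands used at the call site might conform. The shapes returned from data are my assumption, modelled on Maestro’s launchApp, tapOn and inputText YAML commands, and ElementSelector is a hypothetical name for the type behind .text and .id.

```swift
public protocol Command {
    var data: Any { get }
}

// Hypothetical selector type backing `.text("Add")` and `.id("First name")`.
public enum ElementSelector {
    case text(String)
    case id(String)

    var data: [String: String] {
        switch self {
        case .text(let value): return ["text": value]
        case .id(let value): return ["id": value]
        }
    }
}

public struct LaunchApp: Command {
    let appId: String
    public init(_ appId: String) { self.appId = appId }
    // Assumed shape: launchApp with an explicit appId.
    public var data: Any { ["launchApp": ["appId": appId]] }
}

public struct TapOn: Command {
    let selector: ElementSelector
    public init(_ selector: ElementSelector) { self.selector = selector }
    public var data: Any { ["tapOn": selector.data] }
}

public struct Input: Command {
    let text: String
    public init(_ text: String) { self.text = text }
    public var data: Any { ["inputText": text] }
}
```

Each command only produces plain dictionaries and strings, which is what lets a YAML or JSON serializer handle the result without knowing about the command types.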

In order to build up a collection of these commands we can create a simple @resultBuilder with most of the overloads required to get the various control flow operations to work.

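@resultBuilder
public enum FlowBuilder {
    // Lift each bare command into an array so every partial result has the same type.
    public static func buildExpression(_ expression: any Command) -> [any Command] {
        [expression]
    }

    // Statements, if/else branches and loops all arrive here as arrays, so flatten them.
    public static func buildBlock(_ components: [any Command]...) -> [any Command] {
        components.flatMap { $0 }
    }

    public static func buildArray(_ components: [[any Command]]) -> [any Command] {
        components.flatMap { $0 }
    }

    public static func buildOptional(_ component: [any Command]?) -> [any Command] {
        component ?? []
    }

    public static func buildEither(first component: [any Command]) -> [any Command] {
        component
    }

    public static func buildEither(second component: [any Command]) -> [any Command] {
        component
    }
}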

Implementing all of the above might look like busy work but it’s the secret sauce that allows us to use if/switch statements and loops, so it’s worth the effort for a natural call site.
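To see which builder method backs which piece of control flow, here is a stripped-down sketch that collects plain strings instead of commands. StepBuilder and contactSteps are hypothetical names used only for this illustration.

```swift
@resultBuilder
enum StepBuilder {
    // Every bare expression becomes a one-element array.
    static func buildExpression(_ step: String) -> [String] { [step] }
    // A block is the concatenation of its statements.
    static func buildBlock(_ parts: [String]...) -> [String] { parts.flatMap { $0 } }
    // `if` without `else`.
    static func buildOptional(_ part: [String]?) -> [String] { part ?? [] }
    // `if`/`else` and `switch`.
    static func buildEither(first part: [String]) -> [String] { part }
    static func buildEither(second part: [String]) -> [String] { part }
    // `for` loops.
    static func buildArray(_ parts: [[String]]) -> [String] { parts.flatMap { $0 } }
}

@StepBuilder
func contactSteps(includeLastName: Bool, taps: Int) -> [String] {
    "launch"
    if includeLastName {
        "enter last name"
    }
    for index in 0..<taps {
        "tap \(index)"
    }
    "done"
}

// contactSteps(includeLastName: false, taps: 2)
// returns ["launch", "tap 0", "tap 1", "done"]
```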

Next up we need to build the actual YAML document that Maestro will read. For this we can import a YAML library like Yams, dump the result of our builder and prepend some required front matter. In this case Maestro wants a key value pair that identifies the appId of the app it’s targeting.

public func maestroCompose(_ bundleID: String, @FlowBuilder composition: () -> [any Command]) throws -> String {
    try "appId: \(bundleID)\n---\n" + Yams.dump(object: composition().map(\.data))
}

With this in place (and all the commands) we can now use a builder to generate a list of commands that will be rendered to the correct format.

If the Maestro CLI is installed in the standard location (~/.maestro/bin/maestro) we can create a helper function to shell out to the executable:

public func maestroRun(_ bundleID: String, @FlowBuilder composition: () -> [any Command]) throws {
    let fileURL = FileManager.default.temporaryDirectory.appendingPathComponent("commands.yaml")

    try maestroCompose(bundleID, composition: composition).write(
        to: fileURL,
        atomically: true,
        encoding: .utf8
    )

    let process = try Process.run(
        FileManager.default.homeDirectoryForCurrentUser.appendingPathComponent(".maestro/bin/maestro"),
        arguments: ["test", fileURL.path]
    )
    process.waitUntilExit()
}
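The helper waits for the process but does not look at how it exited. If a failing flow should fail the calling code too, one option is to check terminationStatus. This is a sketch: runAndCheck and CommandFailed are hypothetical names, and it assumes a non-zero exit code signals failure.

```swift
import Foundation

/// Hypothetical error carrying the exit code of the failed process.
struct CommandFailed: Error, Equatable {
    let status: Int32
}

/// Runs an executable, waits for it to finish and throws if it exits with a non-zero status.
func runAndCheck(_ executable: URL, arguments: [String]) throws {
    let process = try Process.run(executable, arguments: arguments)
    process.waitUntilExit()

    guard process.terminationStatus == 0 else {
        throw CommandFailed(status: process.terminationStatus)
    }
}
```

Inside maestroRun this would take the place of the Process.run and waitUntilExit calls, with the same executable URL and arguments.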

Page Objects

A common pattern in UI testing is to create Page Objects that represent parts of the UI and hide all the details about finding items on the screen and encapsulate knowledge of how to then interact with these components.

In order to introduce the concept of Page Objects I wanted it so that you couldn’t just arbitrarily mix primitive commands and flows e.g.

try! maestroRun("com.apple.MobileAddressBook") {
    HomePage {
        $0.tapAdd() // Some scope type that can work with `HomePage` ✅
        LaunchApp("com.apple.MobileAddressBook") // ❌
    }
}

To achieve this we need to introduce a stronger type than [any Command]. I’ve got nothing against [any Command] but I wanted to use phantom types to give some additional compile time identity to the collection of commands. For example commands produced by one page object should be incompatible with other page objects because it makes no sense to try and interact with one page’s elements when it’s not the current page.
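If phantom types are new to you, here is the idea in isolation. This sketch uses a hypothetical Steps type holding plain strings, with empty enums standing in for pages; the generic parameter is never stored, it only tags the value at compile time.

```swift
// The Page parameter is a phantom type: no stored property uses it.
struct Steps<Page> {
    let commands: [String]
}

// Uninhabited marker types, used purely as compile-time tags.
enum HomePage {}
enum EditFormPage {}

// Only steps tagged with the same page can be combined.
func combine<Page>(_ first: Steps<Page>, _ second: Steps<Page>) -> Steps<Page> {
    Steps(commands: first.commands + second.commands)
}

let tapAdd = Steps<HomePage>(commands: ["tapOn: Add"])
let setName = Steps<EditFormPage>(commands: ["inputText: John"])

let tapTwice = combine(tapAdd, tapAdd) // compiles, both are Steps<HomePage>
// combine(tapAdd, setName)            // does not compile, the page tags differ
```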

For this I created a new type called Flow that wraps the bare [any Command] we currently work with. The full Flow type is defined like this

public struct Flow<T> {
    public let commands: [any Command]

    init(_ command: Command) {
        commands = [command]
    }

    init(_ commands: [Command]) {
        self.commands = commands
    }

    init(_ flow: Flow?) {
        commands = flow?.commands ?? []
    }

    init(_ flows: [Flow]) {
        commands = flows.flatMap(\.commands)
    }
}

The type is essentially a dumb container that can be instantiated in loads of ways just to hold on to the underlying commands. Its role is mostly to erase one type and introduce a new, more specific one. The phantom type T makes Flow<Void> a distinct type from Flow<Other>, which allows us to use the same result builder but only allow the results to be combined when the types line up.

@resultBuilder
public enum PageBuilder<T: Page> {
    @available(*, deprecated, message: "Use the methods defined on the page")
    public static func buildExpression(_: Command) -> Flow<T> {
        fatalError()
    }

    public static func buildExpression(_ expression: Flow<T>) -> Flow<T> {
        expression
    }

    public static func buildExpression(_ expression: Page) -> Flow<T> {
        .init(expression)
    }

    public static func buildBlock(_ components: Flow<T>...) -> Flow<T> {
        buildArray(components)
    }

    public static func buildArray(_ components: [Flow<T>]) -> Flow<T> {
        .init(components)
    }

    public static func buildOptional(_ component: Flow<T>?) -> Flow<T> {
        Flow(component)
    }

    public static func buildEither(first component: Flow<T>) -> Flow<T> {
        component
    }

    public static func buildEither(second component: Flow<T>) -> Flow<T> {
        component
    }
}

To make use of the above builder we’d need to define a Page like this (bear with me this will be simplified):

struct HomePage: Page {
    var commands: [any Command] = []

    init(@PageBuilder<HomePage> content: @escaping (HomePage) -> Flow<HomePage>) {
        self.commands = content(self).commands
    }

    @FlowBuilder<HomePage>
    func tapAdd() -> Flow<HomePage> {
        TapOn(.text("Add"))
    }
}

The above looks annoying: we have to add the boilerplate of the init in the right format, with funky page builder syntax and storage for commands. Luckily the kind Swift developers have given us a lovely tool for getting rid of boilerplate in the form of macros. The end result will look like this:

@Page
struct HomePage {
    @FlowBuilder<HomePage>
    func tapAdd() -> Flow<HomePage> {
        TapOn(.text("Add"))
    }
}

I’m not going to get into the details of the macro too much as I’m pretty sure I’ve butchered the implementation, but it kinda works so I was happy. The general idea was to have an attached extension macro to add conformance to the Page protocol and generate the required initialiser overloads. There is another attached member macro that adds the storage for var commands: [any Command]. For all the gruesome details check out the macro implementation.

With this in place we can now define page objects that encapsulate knowledge of how to find and interact with components and share this with our team e.g. here’s entering names into a form of the contacts app

@Page
struct EditFormPage {
    @FlowBuilder<EditFormPage>
    func setFirstName(_ name: String) -> Flow<EditFormPage> {
        TapOn(.id("First name"))
        Input(name)
    }

    @FlowBuilder<EditFormPage>
    func setLastName(_ name: String) -> Flow<EditFormPage> {
        TapOn(.id("Last name"))
        Input(name)
    }

    @FlowBuilder<EditFormPage>
    func tapDone() -> Flow<EditFormPage> {
        TapOn(.text("Done"))
    }
}

Sharing the above means that members of the team don’t need to look up the different selector strategies and ways of interacting with controls as they can simply write the following:

try! maestroRun("com.apple.MobileAddressBook") {
    LaunchApp("com.apple.MobileAddressBook")

    HomePage {
        $0.tapAdd()

        EditFormPage {
            $0.setFirstName("John")

            if shouldEnterLastName {
                $0.setLastName("Appleseed")
            }

            $0.tapDone()
        }
    }
}

Wrap up

This post (and accompanying repo) shows how combining a few Swift language features (@resultBuilder, macros and phantom types) builds a DSL that prevents consumers from creating wonky input, as things are checked at compile time. Maestro isn’t just good for UI tests. I’ve found it really helpful for automating flows I need to reproduce whilst doing my day to day dev work with minimum fuss, so I’d highly recommend checking it out.

Mobile UI testing with Maestro

Maestro is an interesting UI automation framework that

is built on learnings from its predecessors (Appium, Espresso, UIAutomator, XCTest) and allows you to easily define and test your Flows

I tried it out and was impressed with how quickly I could get something working with so few dependencies. The biggest issue for me is that YAML is used for defining flows, which means it’s not very dynamic, you can’t leverage IDE tools like autocomplete and nobody likes parsing failures due to whitespace issues.

I managed to get around some of these issues by creating a DSL in a modern language which can then spit out the YAML to feed to the Maestro CLI. Here’s a dive into that experimentation, using Kotlin to reach this result.

Screen recording showing the final results of this blog post


Figure out what we need to build

First let’s start by looking at a basic script that launches the iOS calendar app and then taps on the plus button. This gives us a feel for the YAML document we need to generate.

appId: com.apple.mobilecal
---
- launchApp
- tapOn: Add

Hopefully in the above the commands are fairly self explanatory.

I started by thinking about the API I wanted to write and then worked backwards from there. In Kotlin I’d want to write something like the following:

maestroConduct {
    launchApp()
    tapOn("Add")
}

It might look simple to achieve the above but there are quite a few language features that we need to take advantage of:

  • Interfaces and code visibility
  • Higher order functions (the maestroConduct function takes another function as its argument)
  • Functions with receivers (the function passed as a trailing closure will be called on an explicit receiver)
  • @DslMarker will be used to reduce the autocomplete options the IDE offers inside the scope of the trailing closure

As a high level plan my aim is to build a List of command structures by invoking the DSL e.g. when I invoke launchApp() it needs to add a command of "launchApp" to my commands list.

I can start by defining an interface for the two methods I currently support

interface ScriptScope {
    fun launchApp()
    fun tapOn(text: String)
}

Next I’ll write a function that takes a lambda with a receiver of ScriptScope that is responsible for invoking the lambda and collecting the list of commands.

internal fun buildSteps(configure: ScriptScope.() -> Unit): List<Any> = mutableListOf<Any>().also { steps ->
    object : ScriptScope {
       override fun launchApp() {
           steps.add(
               "launchApp"
           )
       }

       override fun tapOn(text: String) {
           steps.add(
               mapOf("tapOn" to mapOf("text" to text))
           )
       }
    }.configure()
}

The ScriptScope.() -> Unit is a function literal with receiver, which means that the lambda we pass will be invoked with ScriptScope as the receiver (read: this), so it has access to all the methods declared on ScriptScope.

The helper function declared above contains a concrete implementation of the ScriptScope interface which essentially just starts pushing basic maps and strings into a list of commands that is returned at the end.


With this defined I can now call something like

buildSteps {
    launchApp()
    tapOn("Add")
}

and it will build the equivalent of

listOf(
    "launchApp",
    mapOf("tapOn" to mapOf("text" to "Add"))
)

This is almost the original code I envisaged I’d want to write and matches the structure in the YAML but building a list of commands isn’t the ultimate goal, we need to be able to run them as well.


Running commands

In order to test we are on the right track with building up a layer on top of maestro we need to see it working. The full flow we need to achieve is

  • Run the DSL to collect commands
  • Serialize to YAML and write to a file
  • Invoke the maestro cli

These steps are what the maestroConduct function will handle.

fun maestroConduct(configure: ScriptScope.() -> Unit) {
    val tmpFile = createTempDirectory()/"commands.yaml"

    tmpFile.toFile().writeText(
        """
            appId: com.apple.mobilecal

        """.trimIndent() +
                YAMLMapper()
                    .setSerializationInclusion(JsonInclude.Include.NON_NULL)
                    .writeValueAsString(buildSteps(configure))
    )

    ProcessBuilder("maestro", "test", tmpFile.toString())
        .redirectOutput(ProcessBuilder.Redirect.INHERIT)
        .redirectError(ProcessBuilder.Redirect.INHERIT)
        .start()
        .waitFor()
}

Centralise knowledge

The above is mildly interesting noodling with Kotlin but now it’s time to make this something more useful to a wider team. In UI testing there is the concept of the PageObject pattern where we abstract away low level details about how to work with parts of UI.

In this example I might add a PageObject for the TodayScope (where the app launches me to) and an AddEventScope which represents the form for inputting an event.

Again let’s look at the code we might want to write and then work backwards

maestroConduct {
    launchApp()
    today {
        addEvent {
            setTitle("Write blog post")
        }
    }
}

In the above the today scope is here to only introduce methods that are applicable to the today screen. Equally the addEvent introduces a scope that will only expose functions that make sense on the add event screen. Some code to do this might look like:

interface TodayScope {
    fun addEvent(configure: AddEventScope.() -> Unit)
}

interface AddEventScope {
    fun setTitle(title: String)
}

internal fun ScriptScope.buildAddEvent(configure: AddEventScope.() -> Unit) {
   object : AddEventScope {
       override fun setTitle(title: String) {
           // insert code to perform steps here
       }
   }.configure()
}

fun ScriptScope.today(configure: TodayScope.() -> Unit) {
    object : TodayScope {
        override fun addEvent(configure: AddEventScope.() -> Unit) {
            tapOn("Add")
            buildAddEvent(configure)
        }
    }.configure()
}

The above looks complicated but it’s just boilerplate to introduce our scopes. There is a problem with the scopes currently: inside the trailing closure passed to addEvent our code can see the AddEventScope context and its parent context of TodayScope, which means you could write this nonsensical code:

today {
    addEvent {
        addEvent {} // This doesn't make sense ❌
    }
}

This is where we can leverage the @DslMarker annotation for scope control. We start by introducing an annotation

@DslMarker
annotation class ScopeMarker

With this annotation defined we can add it to our scopes above

@ScopeMarker
interface TodayScope {
    fun addEvent(configure: AddEventScope.() -> Unit)
}

@ScopeMarker
interface AddEventScope {
    fun setTitle(title: String)
}

Now the compiler doesn’t allow our code inside the addEvent trailing closure to see the parent scope without being explicit (e.g. this@today.addEvent {}). This should prevent accidental misuse.


With that out of the way we have another small thing to resolve with the tapOn method. Currently it selects items by looking for literal text but when we inspect the app (using the maestro studio command) we can see there is a better option of searching by id. It’s often better to search by id as it will be locale agnostic and if it works the same as XCTest it should be much faster than looking up text.

To allow for this let’s add a little abstraction to the ScriptScope.tapOn function to take a selection strategy:

interface ScriptScope {
    ...
    fun tapOn(selector: Selector)

    // Selectors

    fun id(id: String) = Selector("id", id)
    fun text(text: String) = Selector("text", text)
    data class Selector internal constructor(val key: String, val value: String)
}

With this in place we can switch between the two selector approaches with ease

maestroConduct {
    tapOn(id("my.id"))
    tapOn(text("Button title"))
}

The next tangent is that entering text is not something we have covered. It’s more of the same stuff we have done before so we’ll add a new function to the ScriptScope interface and then add an implementation

@ScopeMarker
interface ScriptScope {
    ...

    fun enterText(text: String)
}

internal fun buildSteps(configure: ScriptScope.() -> Unit): List<Any> = mutableListOf<Any>().also { steps ->
    object : ScriptScope {
        ...
        override fun enterText(text: String) {
            steps.add(mapOf("inputText" to text))
        }
    }.configure()
}

With those three detours out of the way we can implement the actions required for setting title

internal fun ScriptScope.buildAddEvent(configure: AddEventScope.() -> Unit) {
   object : AddEventScope {
       override fun setTitle(title: String) {
           tapOn(id("Title"))
           enterText(title)
       }
   }.configure()
}

The above doesn’t compile because of a previous change adding @ScopeMarker, which now means that inside AddEventScope we can’t see the ScriptScope methods we need (tapOn and enterText). We can overcome this by saying that our object will also conform to ScriptScope and the implementation will come from the parent context:

- object : AddEventScope {
+ object : AddEventScope, ScriptScope by this {

This may seem like a lot of work but the centralising of this knowledge should really pay off. If someone from my team wanted to create a quick script to test some UI they wouldn’t need to dig around in the weeds of figuring out what identifiers they need to tap and details around entering text. Instead they can use a type-safe API with autocompletion that will guide them to write the right things. If someone is stuck they can type this. and the IDE will suggest the available methods.


Another advantage of this approach is that we are using a full programming language, so we could do any external setting up of test data or network requests to configure environments before dumping the YAML and running it. A simple example of where a full programming language is useful is to imagine we get a bug after adding 3 notes. We could simply put the current steps in a repeat and very quickly generate the YAML to get our app in the right state:

today {
    repeat(3) {
        addEvent {
            setTitle("Write blog post")
            setNotes("This is really cool")
            tapAdd()
        }
    }
}

The above is much more concise than the raw YAML output

appId: com.apple.mobilecal
---
- launchApp:
    arguments: {}
- tapOn:
    text: "Add"
- tapOn:
    id: "Title"
- inputText: "Write blog post"
- tapOn:
    id: "All-day"
- "scroll"
- tapOn:
    text: "Notes"
- inputText: "This is really cool"
- tapOn:
    text: "Add"
- tapOn:
    text: "Add"
- tapOn:
    id: "Title"
- inputText: "Write blog post"
- tapOn:
    id: "All-day"
- "scroll"
- tapOn:
    text: "Notes"
- inputText: "This is really cool"
- tapOn:
    text: "Add"
- tapOn:
    text: "Add"
- tapOn:
    id: "Title"
- inputText: "Write blog post"
- tapOn:
    id: "All-day"
- "scroll"
- tapOn:
    text: "Notes"
- inputText: "This is really cool"
- tapOn:
    text: "Add"

In fairness to maestro it does have the ability to run sub flows but that’s another thing people on the team would have to learn and understand.


Side notes

Using the Apple calendar app really shows how important it is to add good identifiers to your code to make this stuff easy. The Calendar app doesn’t do this so most of the time the best you can do is search for text, and that’s not guaranteed to be unique. There were certain things I just couldn’t do, like tap on the “All-day” toggle, because the ability to identify views was just plain missing.


Conclusion

I really like Maestro and think it could be really useful if a little time is invested in refining it to a specific project’s needs. This post looks at a few interesting techniques related to Kotlin that improve the user experience for me, but YMMV.