Mobile UI testing with Maestro (Swift version)

In my last post I talked about building a DSL for Maestro using Kotlin. As an interesting exercise I asked some colleagues (Adam, Ellen and Saad), “could we build an equally nice DSL in Swift?” Together we worked through a few ideas, and this post expands on what we found.

TL;DR: if you want to poke around and try out the result, check out the repo here.


Working backwards from the target

Let’s look at the halfway line of where we want to be.

try maestroRun("com.apple.MobileAddressBook") {
    LaunchApp("com.apple.MobileAddressBook")

    TapOn(.text("Add"))

    TapOn(.id("First name"))
    Input("John")

    if shouldEnterLastName {
        TapOn(.id("Last name"))
        Input("Appleseed")
    }

    TapOn(.text("Done"))
}

The above uses a @resultBuilder to collect an array of Commands. Thanks to the result builder we get the full expressivity of if/switch statements and loops.

The commands we collect are types that conform to Command, which is defined like this:

public protocol Command {
    var data: Any { get }
}

The requirements of this protocol are pretty weak: a single getter of type Any. In this system that value will represent a JSON encodable type such as a String, Int, Object, Array, etc. Although the requirement is weak, the assumption is that the framework will likely provide all implementations of Command. This could be strengthened by using an enum but I’ll leave that for a future tangent.
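To make that a bit more concrete, here is a minimal sketch of what a few Command implementations could look like. The ElementSelector type and the exact dictionary shapes are assumptions made for illustration, modelled on Maestro's launchApp, tapOn and inputText commands. The real implementations in the repo may well differ.

```swift
public protocol Command {
    var data: Any { get }
}

// Hypothetical selector type backing the `.text(...)` and `.id(...)` call sites.
public enum ElementSelector {
    case text(String)
    case id(String)

    var data: [String: String] {
        switch self {
        case .text(let value): return ["text": value]
        case .id(let value): return ["id": value]
        }
    }
}

// Each command exposes a plain dictionary that a YAML library can serialise.
public struct LaunchApp: Command {
    let appId: String
    public init(_ appId: String) { self.appId = appId }
    public var data: Any { ["launchApp": ["appId": appId]] }
}

public struct TapOn: Command {
    let selector: ElementSelector
    public init(_ selector: ElementSelector) { self.selector = selector }
    public var data: Any { ["tapOn": selector.data] }
}

public struct Input: Command {
    let text: String
    public init(_ text: String) { self.text = text }
    public var data: Any { ["inputText": text] }
}
```

Keeping data as plain dictionaries, strings and arrays means the serialisation step never needs to know about individual command types.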

In order to build up a collection of these commands we can create a simple @resultBuilder with most of the overloads required to get the various control flow operations to work.

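@resultBuilder
public enum FlowBuilder {
    // Lift each bare command into the list type the other methods combine.
    public static func buildExpression(_ expression: any Command) -> [any Command] {
        [expression]
    }

    // Flatten the statements of a block into a single list.
    public static func buildBlock(_ components: [any Command]...) -> [any Command] {
        components.flatMap { $0 }
    }

    // `for` loops produce one list per iteration.
    public static func buildArray(_ components: [[any Command]]) -> [any Command] {
        components.flatMap { $0 }
    }

    // `if` without an `else`.
    public static func buildOptional(_ component: [any Command]?) -> [any Command] {
        component ?? []
    }

    // `if`/`else` and `switch`.
    public static func buildEither(first component: [any Command]) -> [any Command] {
        component
    }

    public static func buildEither(second component: [any Command]) -> [any Command] {
        component
    }
}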

Implementing all of the above might look like busy work, but it’s the secret sauce that allows us to use if/switch statements and loops, so it’s worth the effort for a natural call site.
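If it helps to see why each overload matters, the sketch below compares a builder closure with roughly what the compiler rewrites it into. Step, StepBuilder and steps are hypothetical stand-ins so that the example is self-contained; they are not the types from the repo.

```swift
// Hypothetical stand-in for a command.
struct Step: Equatable { let name: String }

@resultBuilder
enum StepBuilder {
    static func buildExpression(_ step: Step) -> [Step] { [step] }
    static func buildBlock(_ parts: [Step]...) -> [Step] { parts.flatMap { $0 } }
    static func buildOptional(_ part: [Step]?) -> [Step] { part ?? [] }
    static func buildArray(_ parts: [[Step]]) -> [Step] { parts.flatMap { $0 } }
}

func steps(@StepBuilder _ body: () -> [Step]) -> [Step] { body() }

let includeLast = false

// What we write at the call site.
let built = steps {
    Step(name: "first")
    if includeLast {
        Step(name: "last")
    }
    for label in ["a", "b"] {
        Step(name: label)
    }
}

// Roughly what the compiler turns it into: every statement goes through
// buildExpression, `if` goes through buildOptional, `for` through buildArray.
let byHand = StepBuilder.buildBlock(
    StepBuilder.buildExpression(Step(name: "first")),
    StepBuilder.buildOptional(
        includeLast
            ? StepBuilder.buildBlock(StepBuilder.buildExpression(Step(name: "last")))
            : nil
    ),
    StepBuilder.buildArray(
        ["a", "b"].map { StepBuilder.buildBlock(StepBuilder.buildExpression(Step(name: $0))) }
    )
)

print(built.map(\.name)) // ["first", "a", "b"]
```

Both values contain the same steps, which is the point: the builder methods are just ordinary static functions that the compiler calls on our behalf.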

Next up we need to build the actual YAML document that Maestro will read. For this we can import a YAML library like Yams, dump the result of our builder and prepend some required front matter. In this case Maestro wants a key-value pair that identifies the appId of the app it’s targeting.

import Yams

public func maestroCompose(_ bundleID: String, @FlowBuilder composition: () -> [any Command]) throws -> String {
    // Front matter first, then the commands rendered as a YAML list.
    try "appId: \(bundleID)\n---\n" + Yams.dump(object: composition().map(\.data))
}

With this in place (and all the commands) we can now use a builder to generate a list of commands that will be rendered to the correct format.

If the Maestro CLI is installed in the standard location (~/.maestro/bin/maestro) we can create a helper function to shell out to the executable:

import Foundation

public func maestroRun(_ bundleID: String, @FlowBuilder composition: () -> [any Command]) throws {
    let fileURL = FileManager.default.temporaryDirectory.appendingPathComponent("commands.yaml")

    try maestroCompose(bundleID, composition: composition).write(
        to: fileURL,
        atomically: true,
        encoding: .utf8
    )

    let process = try Process.run(
        FileManager.default.homeDirectoryForCurrentUser.appendingPathComponent(".maestro/bin/maestro"),
        arguments: ["test", fileURL.path]
    )
    process.waitUntilExit()
}
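One thing the helper above doesn’t do is look at how the process finished. Here is a minimal sketch of a variant that throws when the child process exits with a non-zero status, assuming the CLI signals a failed run that way. The names runChecked and NonZeroExit are hypothetical and only exist for this example.

```swift
import Foundation

// Hypothetical error type carrying the exit status of the child process.
struct NonZeroExit: Error, Equatable {
    let status: Int32
}

// Launch an executable, wait for it to finish and throw on a non-zero exit status.
func runChecked(_ executable: URL, arguments: [String]) throws {
    let process = try Process.run(executable, arguments: arguments)
    process.waitUntilExit()

    guard process.terminationStatus == 0 else {
        throw NonZeroExit(status: process.terminationStatus)
    }
}
```

Swapping the Process.run call in maestroRun for something like this would let a failing flow surface as a thrown error in the calling test rather than passing silently.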

Page Objects

A common pattern in UI testing is to create Page Objects that represent parts of the UI. They hide the details of finding items on the screen and encapsulate the knowledge of how to interact with those components.

In order to introduce the concept of Page Objects I wanted to make sure you couldn’t just arbitrarily mix primitive commands and flows, e.g.

try! maestroRun("com.apple.MobileAddressBook") {
    HomePage {
        $0.tapAdd() // Some scope type that can work with `HomePage` ✅
        LaunchApp("com.apple.MobileAddressBook") // ❌
    }
}

To achieve this we need to introduce a stronger type than [any Command]. I’ve got nothing against [any Command] but I wanted to use phantom types to give some additional compile time identity to the collection of commands. For example commands produced by one page object should be incompatible with other page objects because it makes no sense to try and interact with one page’s elements when it’s not the current page.
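If phantom types are new to you, here is a minimal sketch of the idea in isolation. Steps, HomeScreen and EditScreen are hypothetical names for this example only. The type parameter never appears in a stored property; it exists purely so the compiler can tell two otherwise identical values apart.

```swift
// Hypothetical marker types; they are never instantiated.
enum HomeScreen {}
enum EditScreen {}

// `Screen` is a phantom type parameter: no stored property uses it.
struct Steps<Screen> {
    let names: [String]

    // Only steps tagged with the same screen can be chained together.
    func then(_ next: Steps<Screen>) -> Steps<Screen> {
        Steps(names: names + next.names)
    }
}

let tapAdd = Steps<HomeScreen>(names: ["tapAdd"])
let tapSearch = Steps<HomeScreen>(names: ["tapSearch"])
let setName = Steps<EditScreen>(names: ["setFirstName"])

// Compiles: both sides are Steps<HomeScreen>.
let home = tapAdd.then(tapSearch)

// Does not compile: Steps<EditScreen> cannot be passed where Steps<HomeScreen> is expected.
// let mixed = tapAdd.then(setName)
```

At runtime both kinds of value are just arrays of strings, so the extra safety costs nothing beyond the generic parameter.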

For this I created a new type called Flow that wraps the bare [any Command] we currently work with. The full Flow type is defined like this

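public struct Flow<T> {
    public let commands: [any Command]

    // Wrap a single command.
    init(_ command: any Command) {
        commands = [command]
    }

    // Wrap a list of commands.
    init(_ commands: [any Command]) {
        self.commands = commands
    }

    // Unwrap an optional flow; nil becomes an empty flow.
    init(_ flow: Flow?) {
        commands = flow?.commands ?? []
    }

    // Concatenate several flows in order.
    init(_ flows: [Flow]) {
        commands = flows.flatMap(\.commands)
    }
}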

The type is essentially a dumb container that can be instantiated in loads of ways just to hold on to the underlying commands. Its role is mostly to erase one type and introduce a new, more specific one. The phantom type T makes Flow<Void> a distinct type from Flow<Other>, which allows us to use the same result builder but only allows the results to be compatible when the types line up:

@resultBuilder
public enum PageBuilder<T: Page> {
    @available(*, deprecated, message: "Use the methods defined on the page")
    public static func buildExpression(_: Command) -> Flow<T> {
        fatalError()
    }

    public static func buildExpression(_ expression: Flow<T>) -> Flow<T> {
        expression
    }

    public static func buildExpression(_ expression: Page) -> Flow<T> {
        .init(expression)
    }

    public static func buildBlock(_ components: Flow<T>...) -> Flow<T> {
        buildArray(components)
    }

    public static func buildArray(_ components: [Flow<T>]) -> Flow<T> {
        .init(components)
    }

    public static func buildOptional(_ component: Flow<T>?) -> Flow<T> {
        Flow(component)
    }

    public static func buildEither(first component: Flow<T>) -> Flow<T> {
        component
    }

    public static func buildEither(second component: Flow<T>) -> Flow<T> {
        component
    }
}

To make use of the above builder we’d need to define a Page like this (bear with me, this will be simplified):

struct HomePage: Page {
    var commands: [any Command] = []

    init(@PageBuilder<HomePage> content: @escaping (HomePage) -> Flow<HomePage>) {
        self.commands = content(self).commands
    }

    @FlowBuilder<HomePage>
    func tapAdd() -> Flow<HomePage> {
        TapOn(.text("Add"))
    }
}

The above looks annoying: we have to add the boilerplate of the init in the right format, with funky page builder syntax and storage for commands. Luckily the kind Swift developers have given us a lovely tool for getting rid of boilerplate in the form of macros. The end result will look like this:

@Page
struct HomePage {
    @FlowBuilder<HomePage>
    func tapAdd() -> Flow<HomePage> {
        TapOn(.text("Add"))
    }
}

I’m not going to get into the details of the macro too much as I’m pretty sure I’ve butchered the implementation, but it kinda works so I was happy. The general idea was to have an attached extension macro that adds conformance to the Page protocol and generates the required initialiser overloads. There is also an attached member macro that adds the storage for var commands: [any Command]. For all the gruesome details check out the macro implementation.

With this in place we can now define page objects that encapsulate knowledge of how to find and interact with components and share this with our team, e.g. here’s entering names into a form in the Contacts app:

@Page
struct EditFormPage {
    @FlowBuilder<EditFormPage>
    func setFirstName(_ name: String) -> Flow<EditFormPage> {
        TapOn(.id("First name"))
        Input(name)
    }

    @FlowBuilder<EditFormPage>
    func setLastName(_ name: String) -> Flow<EditFormPage> {
        TapOn(.id("Last name"))
        Input(name)
    }

    @FlowBuilder<EditFormPage>
    func tapDone() -> Flow<EditFormPage> {
        TapOn(.text("Done"))
    }
}

Sharing the above means that members of the team don’t need to look up the different selector strategies and ways of interacting with controls, as they can simply write the following:

try! maestroRun("com.apple.MobileAddressBook") {
    LaunchApp("com.apple.MobileAddressBook")

    HomePage {
        $0.tapAdd()

        EditFormPage {
            $0.setFirstName("John")

            if shouldEnterLastName {
                $0.setLastName("Appleseed")
            }

            $0.tapDone()
        }
    }
}

Wrap up

This post (and the accompanying repo) shows how combining a few Swift language features (@resultBuilders, macros and phantom types) lets us build a DSL that prevents consumers from creating wonky input, as things are checked at compile time. Maestro isn’t just good for UI tests. I’ve found it really helpful for automating flows I need to reproduce whilst doing my day-to-day dev work with minimum fuss, so I’d highly recommend checking it out.