Mobile UI testing with Maestro

Maestro is an interesting UI automation framework that is built on learnings from its predecessors (Appium, Espresso, UIAutomator, XCTest) and allows you to easily define and test your Flows.

I tried it out and was impressed with how quickly I could get something working with so few dependencies. The biggest issue for me is that flows are defined in YAML, which means they are not very dynamic, you can’t leverage IDE tools like autocomplete, and nobody likes parsing failures caused by whitespace issues.

I managed to get around some of these issues by creating a DSL in a modern language which can then spit out the YAML to feed to the maestro CLI. Here’s a dive into that experiment, using Kotlin to reach this result.

Screen recording showing the final results of this blog post


Figure out what we need to build

First let’s start by looking at a basic script that launches the iOS Calendar app and then taps on the plus button. This gives us a feel for the YAML document we need to generate.

appId: com.apple.mobilecal
---
- launchApp
- tapOn: Add

Hopefully the commands above are fairly self-explanatory.

I started by thinking about the API I wanted to write and then worked backwards from there. In Kotlin I’d want to write something like the following:

maestroConduct {
    launchApp()
    tapOn("Add")
}

It might look simple to achieve the above but there are quite a few language features that we need to take advantage of:

  • Interfaces and code visibility
  • Higher order functions (the maestroConduct function takes another function as its argument)
  • Functions with receivers (the function passed as a trailing closure will be called on an explicit receiver)
  • @DslMarker will be used to reduce the autocomplete options the IDE offers inside the scope of the trailing closure

As a high level plan my aim is to build a List of command structures by invoking the DSL e.g. when I invoke launchApp() it needs to add a command of "launchApp" to my commands list.

I can start by defining an interface for the two methods I currently support:

interface ScriptScope {
    fun launchApp()
    fun tapOn(text: String)
}

Next I’ll write a function that takes a lambda with a receiver of ScriptScope that is responsible for invoking the lambda and collecting the list of commands.

internal fun buildSteps(configure: ScriptScope.() -> Unit): List<Any> = mutableListOf<Any>().also { steps ->
    object : ScriptScope {
       override fun launchApp() {
           steps.add(
               "launchApp"
           )
       }

       override fun tapOn(text: String) {
           steps.add(
               mapOf("tapOn" to mapOf("text" to text))
           )
       }
    }.configure()
}

The ScriptScope.() -> Unit type is a function literal with receiver. The lambda we pass will be invoked with a ScriptScope as its receiver (read: this), so it has access to all the methods declared on ScriptScope.

The helper function declared above contains a concrete implementation of the ScriptScope interface which essentially just starts pushing basic maps and strings into a list of commands that is returned at the end.
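
As an aside, the receiver mechanism can be seen in isolation with a much smaller example. This is only a sketch: Greeter and greetings are hypothetical names invented for illustration and are not part of the DSL in this post.

```kotlin
// Hypothetical types for illustration only.
class Greeter {
    val lines = mutableListOf<String>()
    fun hello(name: String) {
        lines.add("Hello, $name")
    }
}

// `block` is a function literal with receiver: inside it, `this` is a Greeter.
fun greetings(block: Greeter.() -> Unit): List<String> {
    val greeter = Greeter()
    greeter.block() // invoke the lambda with `greeter` as its receiver
    return greeter.lines
}

fun main() {
    val result = greetings {
        hello("Maestro") // resolves to this.hello(...)
        hello("Kotlin")
    }
    println(result) // [Hello, Maestro, Hello, Kotlin]
}
```

buildSteps follows the same shape, except that the receiver is an anonymous object implementing ScriptScope and the collected items are commands rather than greetings.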


With buildSteps defined I can now call something like

buildSteps {
    launchApp()
    tapOn("Add")
}

and it will build the equivalent of

listOf(
    "launchApp",
    mapOf("tapOn" to mapOf("text" to "Add"))
)

This is almost the original code I envisaged I’d want to write and it matches the structure in the YAML. Building a list of commands isn’t the ultimate goal though, as we need to be able to run them as well.


Running commands

To check that we are on the right track with building a layer on top of maestro we need to see it working. The full flow we need to achieve is:

  • Run the DSL to collect commands
  • Serialize to YAML and write to a file
  • Invoke the maestro cli

These steps are what the maestroConduct function will handle.

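import com.fasterxml.jackson.annotation.JsonInclude
import com.fasterxml.jackson.dataformat.yaml.YAMLMapper
import kotlin.io.path.createTempDirectory
import kotlin.io.path.div

fun maestroConduct(configure: ScriptScope.() -> Unit) {
    // Write the generated flow into a throwaway directory
    val tmpFile = createTempDirectory() / "commands.yaml"

    tmpFile.toFile().writeText(
        // Flow header; the blank line keeps a trailing newline after trimIndent()
        """
            appId: com.apple.mobilecal

        """.trimIndent() +
                // Serialize the collected steps as a YAML sequence
                YAMLMapper()
                    .setSerializationInclusion(JsonInclude.Include.NON_NULL)
                    .writeValueAsString(buildSteps(configure))
    )

    // Hand the file to the maestro CLI and stream its output to our console
    ProcessBuilder("maestro", "test", tmpFile.toString())
        .redirectOutput(ProcessBuilder.Redirect.INHERIT)
        .redirectError(ProcessBuilder.Redirect.INHERIT)
        .start()
        .waitFor()
}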
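
To get a feel for what the serialization step produces without pulling in Jackson, here is a dependency-free sketch. It is not what the code above does: renderFlow and renderMap are hypothetical helpers, they only handle the shapes used in this post (bare strings and maps whose values are strings or nested maps), and they do no escaping of special characters.

```kotlin
// Hypothetical helper: renders one map step, putting "- " before the first key.
fun StringBuilder.renderMap(map: Map<*, *>, indent: String, firstPrefix: String) {
    var prefix = firstPrefix
    for ((key, value) in map) {
        if (value is Map<*, *>) {
            appendLine("$prefix$key:")
            // Nested keys sit two spaces deeper than their parent
            renderMap(value, "$indent  ", "$indent  ")
        } else {
            appendLine("$prefix$key: \"$value\"")
        }
        prefix = indent
    }
}

// Hypothetical helper: renders the header plus the list built by the DSL.
fun renderFlow(appId: String, steps: List<Any>): String = buildString {
    appendLine("appId: $appId")
    appendLine("---")
    for (step in steps) {
        when (step) {
            is String -> appendLine("- $step")
            is Map<*, *> -> renderMap(step, indent = "  ", firstPrefix = "- ")
            else -> error("Unsupported step: $step")
        }
    }
}

fun main() {
    val steps = listOf("launchApp", mapOf("tapOn" to mapOf("text" to "Add")))
    print(renderFlow("com.apple.mobilecal", steps))
}
```

A real serializer is the safer choice once values can contain quotes, colons or newlines, which is why the post leans on YAMLMapper.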

Centralise knowledge

The above is mildly interesting noodling with Kotlin but now it’s time to make this something more useful to a wider team. In UI testing there is the concept of the PageObject pattern where we abstract away low level details about how to work with parts of UI.

In this example I might add a PageObject for the TodayScope (where the app launches me to) and an AddEventScope which represents the form for inputting an event.

Again let’s look at the code we might want to write and then work backwards

maestroConduct {
    launchApp()
    today {
        addEvent {
            setTitle("Write blog post")
        }
    }
}

In the above the today scope is here to only introduce methods that are applicable to the today screen. Equally the addEvent introduces a scope that will only expose functions that make sense on the add event screen. Some code to do this might look like:

interface TodayScope {
    fun addEvent(configure: AddEventScope.() -> Unit)
}

interface AddEventScope {
    fun setTitle(title: String)
}

internal fun ScriptScope.buildAddEvent(configure: AddEventScope.() -> Unit) {
   object : AddEventScope {
       override fun setTitle(title: String) {
           // insert code to perform steps here
       }
   }.configure()
}

fun ScriptScope.today(configure: TodayScope.() -> Unit) {
    object : TodayScope {
        override fun addEvent(configure: AddEventScope.() -> Unit) {
            tapOn("Add")
            buildAddEvent(configure)
        }
    }.configure()
}

The above looks complicated but it’s just boilerplate to introduce our scopes. There is a problem with the scopes as they stand: inside the trailing closure passed to addEvent our code can see both the AddEventScope context and its parent context of TodayScope, which means you could write this nonsensical code

today {
    addEvent {
        addEvent {} // This doesn't make sense ❌
    }
}

This is where we can leverage the @DslMarker annotation for scope control. We start by introducing an annotation

@DslMarker
annotation class ScopeMarker

With this annotation defined we can add it to our scopes above

@ScopeMarker
interface TodayScope {
    fun addEvent(configure: AddEventScope.() -> Unit)
}

@ScopeMarker
interface AddEventScope {
    fun setTitle(title: String)
}

Now the compiler doesn’t allow our code inside the addEvent trailing closure to see the parent scope without being explicit (e.g. this@today). This should prevent accidental misuse.
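
The effect of the marker can be sketched in a standalone example. DemoMarker, Outer, Inner and outer are hypothetical names used only to show the rule, not part of the DSL in this post.

```kotlin
@DslMarker
annotation class DemoMarker // hypothetical marker for this sketch

@DemoMarker
class Outer {
    val log = mutableListOf<String>()
    fun outerCall() {
        log.add("outer")
    }
    fun inner(block: Inner.() -> Unit) {
        Inner(log).block()
    }
}

@DemoMarker
class Inner(private val log: MutableList<String>) {
    fun innerCall() {
        log.add("inner")
    }
}

fun outer(block: Outer.() -> Unit): List<String> = Outer().apply(block).log

fun main() {
    val log = outer {
        outerCall()
        inner {
            innerCall()
            // outerCall()          // rejected: implicit access to the Outer receiver is blocked
            this@outer.outerCall()  // still allowed when the receiver is named explicitly
        }
    }
    println(log) // [outer, inner, outer]
}
```

Only implicit access to the outer receiver is restricted, so the escape hatch is always there when a nested call really is intended.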


With that out of the way we have another small thing to resolve with the tapOn method. Currently it selects items by looking for literal text but when we inspect the app (using the maestro studio command) we can see there is a better option of searching by id. It’s often better to search by id as it will be locale agnostic and if it works the same as XCTest it should be much faster than looking up text.

To allow for this let’s add a little abstraction to the ScriptScope.tapOn function to take a selection strategy:

interface ScriptScope {
    ...
    fun tapOn(selector: Selector)

    // Selectors

    fun id(id: String) = Selector("id", id)
    fun text(text: String) = Selector("text", text)
    data class Selector internal constructor(val key: String, val value: String)
}

With this in place we can switch between the two selector approaches with ease

maestroConduct {
    tapOn(id("my.id"))
    tapOn(text("Button title"))
}
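
The post doesn’t show how tapOn turns a selector into a command, so the following is an assumption about one way it could work: the selector’s key becomes the key nested under tapOn. tapOnStep is a hypothetical free function, and Selector has a public constructor here only to keep the sketch self-contained.

```kotlin
// Mirrors the Selector from the post, with a public constructor for this sketch.
data class Selector(val key: String, val value: String)

fun id(id: String) = Selector("id", id)
fun text(text: String) = Selector("text", text)

// Assumed mapping: the selector key ("id" or "text") is nested under "tapOn",
// which matches the shape of the YAML the flow needs.
fun tapOnStep(selector: Selector): Map<String, Map<String, String>> =
    mapOf("tapOn" to mapOf(selector.key to selector.value))

fun main() {
    println(tapOnStep(id("Title"))) // {tapOn={id=Title}}
    println(tapOnStep(text("Add"))) // {tapOn={text=Add}}
}
```

Inside buildSteps the same map would be added to the steps list rather than returned.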

The next tangent is that entering text is not something we have covered. It’s more of the same stuff we have done before so we’ll add a new function to the ScriptScope interface and then add an implementation

@ScopeMarker
interface ScriptScope {
    ...

    fun enterText(text: String)
}

internal fun buildSteps(configure: ScriptScope.() -> Unit): List<Any> = mutableListOf<Any>().also { steps ->
    object : ScriptScope {
        ...
        override fun enterText(text: String) {
            steps.add(mapOf("inputText" to text))
        }
    }.configure()
}

With those three detours out of the way we can implement the actions required for setting the title

internal fun ScriptScope.buildAddEvent(configure: AddEventScope.() -> Unit) {
   object : AddEventScope {
       override fun setTitle(title: String) {
           // Focus the title field, then type the title that was passed in
           tapOn(id("Title"))
           enterText(title)
       }
   }.configure()
}

The above doesn’t compile because of the earlier change adding @ScopeMarker, which means that inside AddEventScope we can’t see the ScriptScope methods we need (tapOn and enterText). We can overcome this by saying that our object will also conform to ScriptScope and the implementation will come from the parent context

- object : AddEventScope {
+ object : AddEventScope, ScriptScope by this {
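
Here is a standalone sketch of that delegation trick. Base and Form stand in for ScriptScope and AddEventScope, and recordingScope is a hypothetical helper that just logs calls so the forwarding is visible; none of these names come from the post.

```kotlin
interface Base { // stands in for ScriptScope
    fun tapOn(text: String)
    fun enterText(text: String)
}

interface Form { // stands in for AddEventScope
    fun setTitle(title: String)
}

// `Base by this` forwards every Base member to the extension receiver.
fun Base.buildForm(configure: Form.() -> Unit) {
    object : Form, Base by this {
        override fun setTitle(title: String) {
            tapOn("Title")   // resolved on the object, forwarded to the receiver
            enterText(title)
        }
    }.configure()
}

// Hypothetical helper: a Base implementation that records what was called.
fun recordingScope(log: MutableList<String>): Base = object : Base {
    override fun tapOn(text: String) {
        log.add("tapOn:$text")
    }
    override fun enterText(text: String) {
        log.add("enterText:$text")
    }
}

fun main() {
    val log = mutableListOf<String>()
    recordingScope(log).buildForm { setTitle("Write blog post") }
    println(log) // [tapOn:Title, enterText:Write blog post]
}
```

Because the anonymous object is itself a Base, calls to tapOn and enterText resolve on the nearest receiver and no longer need to reach for the outer one.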

This may seem like a lot of work but the centralising of this knowledge should really pay off. If someone from my team wanted to create a quick script to test some UI they wouldn’t need to dig around in the weeds of figuring out what identifiers they need to tap and details around entering text. Instead they can use a type-safe API with autocompletion that will guide them to write the right things - if someone is stuck they can type this. and the IDE will suggest the available methods.


Another advantage of this approach is that we are using a full programming language, so we could do any external setting up of test data or network requests to configure environments before dumping the YAML and running it. A simple example of where a full programming language is useful: imagine we get a bug after adding 3 notes. We could simply put the current steps in a repeat and very quickly generate the YAML to get our app in the right state

today {
    repeat(3) {
        addEvent {
            setTitle("Write blog post")
            setNotes("This is really cool")
            tapAdd()
        }
    }
}

The above is much more concise than the raw YAML output

appId: com.apple.mobilecal
---
- launchApp:
    arguments: {}
- tapOn:
    text: "Add"
- tapOn:
    id: "Title"
- inputText: "Write blog post"
- tapOn:
    id: "All-day"
- "scroll"
- tapOn:
    text: "Notes"
- inputText: "This is really cool"
- tapOn:
    text: "Add"
- tapOn:
    text: "Add"
- tapOn:
    id: "Title"
- inputText: "Write blog post"
- tapOn:
    id: "All-day"
- "scroll"
- tapOn:
    text: "Notes"
- inputText: "This is really cool"
- tapOn:
    text: "Add"
- tapOn:
    text: "Add"
- tapOn:
    id: "Title"
- inputText: "Write blog post"
- tapOn:
    id: "All-day"
- "scroll"
- tapOn:
    text: "Notes"
- inputText: "This is really cool"
- tapOn:
    text: "Add"

In fairness to maestro it does have the ability to run sub flows but that’s another thing people on the team would have to learn and understand.


Side notes

Using the Apple Calendar app really shows how important it is to add good identifiers to your code to make this stuff easy. The Calendar app doesn’t do this, so most of the time the best you can do is search for text, and that’s not guaranteed to be unique. There were certain things I just couldn’t do, like tap on the “All-day” toggle, because the ability to identify views was just plain missing.


Conclusion

I really like Maestro and think it could be really useful if a little time is invested in refining it to a specific project’s needs. This post looks at a few interesting techniques related to Kotlin that improve the user experience for me but YMMV.

Missing Xcode run test buttons

It happens more often than I’d like that Xcode loses the run test button in the text gutter. Often I’ll resort to closing Xcode and doing various rituals to try and get the little devils to appear again, which is really annoying if you want to avoid context switching.

When you just want to crack on I’ve had success with this flow:

  • Copy the test class name
  • Show the test navigator panel (⌘6)
  • Paste the class name into the filter (⌥⌘J followed by ⌘V)
  • Press the play button in this panel

Screen recording showing the steps described above

Swift Parameter Packs

Parameter packs are new in Swift 5.9 and they are pretty cool. Here are a few example uses from an initial exploration.


valuesAt

I often want to pluck out specific values from a type but don’t necessarily want to do it over multiple lines e.g. plucking details out of a GitHub pull request might look like this

let number = pullRequest.number
let user = pullRequest.user.login
let head = pullRequest.head.sha

The above could go on further with many more parameters but the main takeaway is that it creates some repetitive noise and sometimes I just want a one-liner. With parameter packs we can write a valuesAt function to achieve this result

let (number, user, head) = valuesAt(pullRequest, keyPaths: \.number, \.user.login, \.head.sha)

The end result is the same in that I have 3 strongly typed let bindings but I can get it onto one line.

The implementation of valuesAt looks like this:

func valuesAt<T, each U>(
    _ subject: T, 
    keyPaths keyPath: repeat KeyPath<T, each U>
) -> (repeat each U) {
    (repeat (subject[keyPath: each keyPath]))
}

decorateAround

With higher order functions in Swift it’s easy to write a function that decorates another. The issue is handling varying arity and differing return types means we previously had to write loads of function overloads. With parameter packs we can write a generic function that allows us to write wrappers inline.

Imagine I need to log the arguments and return value for a function but I don’t have access to the source so I can’t just modify the function directly. What I can do is decorate the function and then use the newly generated function in the original function’s place.

let decoratedAddition: (Int, Int) -> Int = decorateAround(+) { add, a, b in
    let result = add(a, b)
    print("\(a) + \(b) = \(result)")
    return result
}

print(decoratedAddition(1, 2))

//=> 1 + 2 = 3
//=> 3

With the above the core underlying function is unchanged but I’ve added additional observability. With this particular set up the decorateAround actually gives the caller lots of flexibility as they can also optionally inspect/modify the arguments to the wrapped function and then modify the result.

The code to achieve this triggers my semantic satiation for the words repeat and each but here it is in all its glory

func decorateAround<each Argument, Return>(
    _ function: @escaping (repeat each Argument) -> Return,
    around: @escaping ((repeat each Argument) -> Return, repeat each Argument) -> Return
) -> (repeat each Argument) -> Return {
    { (argument: repeat each Argument) in
        around(function, repeat each argument)
    }
}

We could go further and create helpers that make it simple to decoratePre and decoratePost and only use the decorateAround variant when we need full flexibility.


memoize

With the general pattern of decoration there are other things we can expand on. One such function would be to memoize expensive computations so if we call a decorated function with the same inputs multiple times we expect the computation to be performed only once. One example might be loading a resource from disk and keeping it in a local cache to avoid the disk IO when the same file is requested

let memoizedLoadImage = memoize(loadImage)

memoizedLoadImage(URL(filePath: "some-url"))
memoizedLoadImage(URL(filePath: "some-url"))

memoizedLoadImage(URL(filePath: "other-url"))

In the above example the work to load the image at some-url will only be performed the first time; on the subsequent call the in-memory cached result will be returned. The final call with other-url will not find any result in the cache and so will trigger a disk load.

In order to build this one we have to get a little more inventive with things as the cache is a Dictionary so we need to build a key somehow but tuples are not Hashable. I ended up building an array for the key that has all the arguments type erased to AnyHashable. The code looks like this:

func memoize<each Argument: Hashable, Return>(
    _ function: @escaping (repeat each Argument) -> Return
) -> (repeat each Argument) -> Return {
    var storage = [AnyHashable: Return]()
    
    return { (argument: repeat each Argument) in
        var key = [AnyHashable]()
        repeat key.append(AnyHashable(each argument))
        
        if let result = storage[key] {
            return result
        } else {
            let result = function(repeat each argument)
            storage[key] = result
            return result
        }
    }
}

Conclusion

Parameter packs are an interesting feature - I’m not sure the above code snippets are particularly good or even sensible to use but I hope it helps people get their toe in the door on using the feature and potentially coming up with stronger use cases than I’ve imagined.