Creating Swift Package Manager tools from your existing codebase

The Swift Package Manager (SPM) is perfect for writing quick tools and you can even bring along your existing code from your production apps. The trick is realising that you can symlink a folder into the SPM project, which means with some work you can create a command line tool that wraps parts of your production code.

Why would you want to do this?

It's very project dependent but a common use case would be for creating support/debugging/CI validation tools. For example a lot of apps work with remote data - in order to carry out its function the app will need to convert the remote data into custom types and use business rules to do useful things with this data. There are multiple failure points in this flow that will manifest as either an app crash or incorrect app behaviour. The way to debug this would be to fire up the app with the debugger attached and start exploring. This is where it would be nice to have tools to help explore problems and potentially prevent them.

Caveats

You cannot use code that imports UIKit, which means that this technique is only going to work for Foundation based code. This sounds limiting but ideally business logic and data manipulation code shouldn't know about UIKit.

Having dependencies makes this technique harder. You can still get this to work but it will require more configuration in Package.swift.

How do you do it?

This depends on how your project is structured. I've got an example project here. This project is a small iOS app that displays a list of blog posts (don't look at the iOS project itself, it's not really important for this). The blog posts come from a fake JSON feed that doesn't have a particularly nice structure, so the app needs to do custom decoding. In order to keep this light I'm going to build the simplest wrapper possible - it will:

  • Read from standard in
  • Use the production parsing code
  • Print the decoded results or an error

You can go wild and add a lot more features to this but this simple tool will give us really quick feedback on whether some JSON will be accepted by the production code or show any errors that might occur, all without firing up a simulator.

The basic structure of this example project looks like this:

.
└── SymlinkedSPMExample
    ├── AppDelegate.swift
    ├── Base.lproj
    │   └── LaunchScreen.storyboard
    ├── Info.plist
    ├── ViewController.swift
    └── WebService
        ├── Server.swift
        └── Types
            ├── BlogPost.swift
            └── BlogPostsRequest.swift

I have deliberately created a Types directory that contains only the code I want to reuse.

To create a command line tool that makes use of this production code I can perform the following:

mkdir -p tools/web-api
cd tools/web-api
swift package init --type executable

This has scaffolded a project that we can now manipulate. First let's get our production source symlinked:

cd Sources
ln -s ../../../SymlinkedSPMExample/WebService/Types WebService
cd ..

You'll want to use a relative path for the symlink or it will break when moving between machines.

The project structure now looks like this:

.
├── SymlinkedSPMExample
│   ├── AppDelegate.swift
│   ├── Base.lproj
│   │   └── LaunchScreen.storyboard
│   ├── Info.plist
│   ├── ViewController.swift
│   └── WebService
│       ├── Server.swift
│       └── Types
│           ├── BlogPost.swift
│           └── BlogPostsRequest.swift
└── tools
    └── web-api
        ├── Package.swift
        ├── README.md
        ├── Sources
        │   ├── WebService -> ../../../SymlinkedSPMExample/WebService/Types/
        │   └── web-api
        │       └── main.swift
        └── Tests

Now I need to update the Package.swift file to create a new target for this code and to add a dependency so that the web-api executable can utilise the production code.

Package.swift

// swift-tools-version:4.0

import PackageDescription

let package = Package(
    name: "web-api",
    targets: [
        .target(name: "web-api", dependencies: [ "WebService" ]),
        .target(name: "WebService"),
    ]
)

Now that SPM knows how to build the project let's write the code mentioned above to use the production parsing code.

main.swift

import Foundation
import WebService

do {
  print(try JSONDecoder().decode(BlogPostsRequest.self, from: FileHandle.standardInput.readDataToEndOfFile()).posts)
} catch {
  print(error)
}

With this in place we can now start to run JSON through the tool and see if the production code would handle it or not.

Here's what it looks like when we try and send valid JSON through the tool:

$ echo '{ "posts" : [] }' | swift run web-api
[]

$ echo '{ "posts" : [ { "title" : "Some post", "tags" : [] } ] }' | swift run web-api
[WebService.BlogPost(title: "Some post", tags: [])]

$ echo '{ "posts" : [ { "title" : "Some post", "tags" : [ { "value" : "cool" } ] } ] }' | swift run web-api
[WebService.BlogPost(title: "Some post", tags: ["cool"])]

Here's an example of the error messages we get with invalid JSON:

$ echo '{}' | swift run web-api
keyNotFound(CodingKeys(stringValue: "posts", intValue: nil), Swift.DecodingError.Context(codingPath: [], debugDescription: "No value associated with key CodingKeys(stringValue: \"posts\", intValue: nil) (\"posts\").", underlyingError: nil))

$ echo '{ "posts" : [ { } ] }' | swift run web-api
keyNotFound(CodingKeys(stringValue: "title", intValue: nil), Swift.DecodingError.Context(codingPath: [CodingKeys(stringValue: "posts", intValue: nil), _JSONKey(stringValue: "Index 0", intValue: 0)], debugDescription: "No value associated with key CodingKeys(stringValue: \"title\", intValue: nil) (\"title\").", underlyingError: nil))

$ echo '{ "posts" : [ { "title" : "Some post" } ] }' | swift run web-api
keyNotFound(CodingKeys(stringValue: "tags", intValue: nil), Swift.DecodingError.Context(codingPath: [CodingKeys(stringValue: "posts", intValue: nil), _JSONKey(stringValue: "Index 0", intValue: 0)], debugDescription: "No value associated with key CodingKeys(stringValue: \"tags\", intValue: nil) (\"tags\").", underlyingError: nil))

  • The first example fails because there is no posts key
  • The second example fails because a post does not have a title key
  • The third example fails because a post does not have a tags key
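The real BlogPost.swift and BlogPostsRequest.swift live in the example project and are not reproduced in this post. As a rough sketch only, Foundation-only types that would produce the output and errors above might look something like the following. The Tag helper and the CodingKeys are my assumptions, not the project's actual code. Note that the types and their members are marked public here, because the tool imports them from a separate WebService module.

```swift
import Foundation

// Hypothetical stand-ins for the production types (assumed, not the real files).
public struct BlogPost: Decodable {
    public let title: String
    public let tags: [String]

    private enum CodingKeys: String, CodingKey {
        case title, tags
    }

    // Assumption: each tag arrives wrapped in an object, e.g. { "value" : "cool" }.
    private struct Tag: Decodable {
        let value: String
    }

    public init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
        // Flatten the wrapped tags into plain strings.
        tags = try container.decode([Tag].self, forKey: .tags).map { $0.value }
    }
}

public struct BlogPostsRequest: Decodable {
    public let posts: [BlogPost]
}

let json = """
{ "posts" : [ { "title" : "Some post", "tags" : [ { "value" : "cool" } ] } ] }
"""
let request = try JSONDecoder().decode(BlogPostsRequest.self, from: Data(json.utf8))
print(request.posts.map { $0.title })
```

Keeping the custom decoding inside the type is what lets the command line tool exercise exactly the same parsing path as the app.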

In real life I would be piping the output of curling a live/staging endpoint, not hand crafting JSON.

This is really cool because I can see that the production code does not parse some of these examples and I get the error messages that explain why. If I didn't have this tool I would need to run the app manually and figure out a way to get the different JSON payloads to run through the parsing logic.

Conclusion

This post covers the basic technique of using SPM to create tools using your production code. You can really run with this and create some beautiful workflows like:

  • Add the tool as a step in the CI pipeline of the web-api to ensure no deployments could take place that break the mobile clients.
  • Expand the tool to also apply the business rules (from the production code) to see if errors are introduced at the level of the feed, parsing or business rules.
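For the CI idea to work the tool needs to signal failure through its exit status rather than just printing the error. A minimal sketch of how that could look is below. The exitStatus(forPayload:) function is a hypothetical name, and the small BlogPostsRequest type is only a stand-in so the listing is self-contained; in the real tool it would come from import WebService.

```swift
import Foundation

// Hypothetical stand-in for the production type (normally `import WebService`).
struct BlogPostsRequest: Decodable {
    struct Post: Decodable {
        let title: String
    }
    let posts: [Post]
}

// Returns a process exit status: success when the payload decodes, failure otherwise.
func exitStatus(forPayload data: Data) -> Int32 {
    do {
        _ = try JSONDecoder().decode(BlogPostsRequest.self, from: data)
        return EXIT_SUCCESS
    } catch {
        // Report the error on stderr so stdout stays clean for piping.
        FileHandle.standardError.write(Data("\(error)\n".utf8))
        return EXIT_FAILURE
    }
}

// In main.swift the final line would be something like:
// exit(exitStatus(forPayload: FileHandle.standardInput.readDataToEndOfFile()))
```

A non-zero exit status is what most CI systems use to mark a step as failed, so no extra output parsing should be needed.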

I've started using the idea in my own projects and I'm excited about how it's going to help me and potentially other members of my team.

Swift Iteration Showdown

I've had conversations where people don't see the benefit of using functions like map over hand rolling a for loop with an accumulator, with the main argument being that everyone understands a for loop so "why add complexity?". Whilst I agree that most developers could rattle off a for loop in their sleep it doesn't mean we should use the lower level constructs that have more risk. I compare it to a light switch in my house: I fully understand how to connect two wires to turn a light on but I'm not about to enter a dark room and start twiddling with wires when I can just use a switch.

The other trend I have picked up on during code reviews and discussions is that people are happy to use higher order functions like map, forEach and reduce but because they've not come from a functional programming background they can often misuse these functions. To make the most of these functions and to avoid surprises for future you and your team mates there are some rules that should be obeyed (some functional languages enforce these rules). The misuse I see is generally around putting side effects where they don't belong.

I think the best way to get people on board is to start off with examples of basic iteration and progressively change the iteration style whilst analysing the potential areas for improvement with each code listing. Hopefully by the end we'll have a shared understanding of what is good/bad and be able to identify the tradeoffs in each iteration style.


Basic iteration

The most basic iteration in most languages is some variation of for (int i = 0; i < end; i++) { ... }. This is not available in Swift but we can approximate it with the following

let collection = [ 1, 2, 3, nil, 5, 6 ]

var index = 0
while index < collection.count {
  print(collection[index])
  index += 1  
}

There are a lot of things about this loop that could be considered bad:

  • There is mutable state with the index variable.
    Mutable state is more difficult to reason about than immutable state and things become easier if we can remove it. I've found that in longer code listings mutable state can be especially tricky as you have to read more just to keep track of the mutation. Diff tools can add to the difficulty as they may decide to collapse a long listing, meaning that you need to unfold all the code before you can sanity check that mutation is happening where you expect.

  • We are in control of the iteration.
    There are plenty of opportunities to mess this up - here are some:

    • Forget to increment the index
    • Increment the index too much
    • Write the condition with <= and get an out of bounds
    • Mess the condition up completely by adding more complexity - e.g. index < collection.count || someThingThatEvaluatesToTrue
  • This only works for zero based indices.
    ArraySlice is not guaranteed to be zero based as it's a view on the original collection.

  • .count could change between loops.

  • It's less efficient than newer forms of iteration as items are fetched one at a time.

  • It's not very well scoped.
    The code block that is being evaluated during the loop causes side effects in the surrounding scope, in this case the side effect is the mutation of index.

Whilst this iteration works you can see from the list above that there is plenty of room for improvement for making this safer and reducing complexity.


Safer indexing

We can make an improvement on the above by getting the indices from the collection itself

let collection = [ 1, 2, 3, nil, 5, 6 ]

for index in collection.indices {
  print(collection[index])
}

The improvements here are quite nice:

  • We've removed the mutable index variable.
    Remember mutability is evil.

  • We've removed the side effect of mutating index.
    We still have the side effect caused by calling print but that's the core of the algorithm so that's fine.

  • We are no longer in control of the iteration.
    We are still in control of accessing elements at specific indices but this is an improvement.

  • We are no longer calculating indices as they are given to us by the collection, which means this will work with non zero indexed collections like ArraySlice instances.
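To make the ArraySlice point concrete, here is a small sketch using the same collection. The slice name is just for illustration.

```swift
let collection = [1, 2, 3, nil, 5, 6]

// Slicing keeps the indices of the original array, so the slice does not start at 0.
let slice = collection[2...]
print(slice.startIndex) // 2

// slice[0] would trap at runtime because index 0 is not part of the slice.
// Asking the slice for its own indices avoids the problem entirely.
for index in slice.indices {
    print(slice[index] as Any)
}
```

The hand rolled while loop from the first listing would crash on its first pass with this slice, whereas the indices based loop does not care where the collection starts.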


Remove indexing

Most of my complaints so far have been related to indexing - both calculating the correct index and then performing the access. We can improve this situation by using for/in syntax.

let collection = [ 1, 2, 3, nil, 5, 6 ]

for item in collection {
  print(item)
}

The improvements here are:

  • We've removed indexing/manual control of the iteration.

  • Iteration can be more efficient - for example in Objective-C this would use NSFastEnumeration under the hood, which requests batches of objects instead of fetching items one by one.

The previous disadvantages are now mostly gone. To continue evaluating the tradeoffs we need to step our thinking up a level to see the improvements that can be made. Some bad things about the above are:

  • There is no easy way to intuit the intent of the loop without studying the whole body.
    For example am I iterating to:

    • build up a new collection?
    • reduce a collection down to a single value?
    • purely cause side effects?
  • It's not composable.
    The iteration is using a code block not a closure. This means I can't reuse the block of code like I can with a closure.


forEach

To handle all the issues raised so far we can jump to using forEach

let collection = [ 1, 2, 3, nil, 5, 6 ]

collection.forEach { print($0) }

There are a few nice advantages with this implementation:

  • We are no longer concerned with the act of iterating.
    We simply provide a block and forEach will ensure that it is invoked for each element in the collection.

  • Improved composability.
    As forEach takes a block we can also reuse the same block in multiple places, which allows us to build our programs from smaller pieces.

  • We removed boilerplate.
    This is a big win as iteration can be considered as boiler plate, which we all know is tiresome and error prone to write. In addition it's not DRY to have the same structures all over the place so it makes sense to codify patterns.

  • The function signature hints at the iteration's intent.
    When you look at this function signature

    func forEach(_ body: (Element) throws -> Void) rethrows

    you can see that there is no return value - this is a big hint that this function is all about side effects. If a function does not return a result then for it to add any value it needs to cause side effects. For clarification in all the examples above the side effect has been printing to stdout.
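As a small sketch of the composability point, the body can be pulled out into a named function and handed to forEach wherever the element type matches. The describe and logItem names and the second collection are made up for this example.

```swift
// Hypothetical helper: turns an optional number into printable text.
func describe(_ item: Int?) -> String {
    return item.map { "\($0)" } ?? "nil"
}

// The side effect lives in one small, reusable function.
func logItem(_ item: Int?) {
    print(describe(item))
}

let collection = [1, 2, 3, nil, 5, 6]
let otherCollection: [Int?] = [nil, 42]

// The same function is reused without rewriting any iteration code.
collection.forEach(logItem)
otherCollection.forEach(logItem)
```

With a plain for/in loop the body would have to be copied, or wrapped in a function and called from inside each loop.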


map

We get the same benefits (listed above) when we use map. The difference here is that the function signature reveals a different intent:

func map<T>(_ transform: (Element) throws -> T) rethrows -> [T]

This function does have a return type, which suggests this function is all about creating something and not having side effects. The function signature also shows that we need to provide a function that transforms Element to T, which provides the full picture - this function will create a new collection by using the provided function to map values. In terms of inferring that this function should not have side effects you can derive this from a few different principles:

  • Single Responsibility Principle.
    When applied at the function level it really is about only doing one thing. In this case that one thing is transforming each element to build a new output.

  • Query/Command separation.
    In an ideal world a function should either be a query or a command. A query is what we have here - we invoke the function and expect a result back. A command would be where we don't want a result but we want to do something. (There are occasions where you will have both a query/command in one function but it's best to try and separate these things).

For comparison here's classic iteration vs map:

let collection = [ 1, 2, 3, nil, 5, 6 ]

// Classic
var results = [String]()
for item in collection {
  results.append("\(item)")
}
print(results)

// map
print(collection.map { "\($0)" })

You can see that classic iteration introduces a few issues again:

  • Mutable state.
    results is declared as var so that it can be mutated in each run of the loop. In this listing we can see all the mutation but in a longer code listing you might have to validate that mutation isn't happening in other places.
    results also remains mutable after the loop has concluded - so even if you do the correct mutation inside the loop it doesn't mean you are in the clear. You need to validate that no mutation is happening after the loop.

  • Efficiency.
    In the example above map is more efficient as it will likely preallocate a new array with the correct size to take all the transformed elements. My naive implementation of classic iteration is not that clever so the array could potentially need to allocate storage multiple times.

  • Side effects leaving the scope of the iteration block.
    In order to mutate results the block inside the loop has to mutate a value outside its own scope. Ideally we want to keep the scope of mutation as small as possible.

  • It's wordier.
    We read more than we write so it's important that our ideas are communicated succinctly.

  • I can't easily infer intent.
    With map I know the function is returning a new collection by applying some transform. With the classic loop I am relying on reading the whole listing and recognising the pattern of collecting transformed items. Recognising common patterns requires practice and experience so may be harder for newer developers.

  • It's nearly all boilerplate.
    In the classic iteration example nearly all of the code is boilerplate, which is obfuscating my really interesting algorithm that stringifies each item.

  • It's not an expression.
    I've deliberately made the map super short and inlined it inside the print. That's the beautiful thing about map compared to classic iteration - map is an expression whereas the for/in loop is control flow, which changes the places where they are allowed to be used.
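On the efficiency point, the classic loop can be brought closer to map by reserving storage up front. This is only a sketch of the idea; I'm using String(describing:) here instead of interpolation purely to avoid the compiler warning about interpolating an optional.

```swift
let collection = [1, 2, 3, nil, 5, 6]

// Classic loop, with the storage reserved once up front.
var results = [String]()
results.reserveCapacity(collection.count)
for item in collection {
  results.append(String(describing: item))
}

// The map version produces the same strings with none of the bookkeeping.
let mapped = collection.map { String(describing: $0) }
print(results == mapped) // true
```

It works, but it is yet another line of boilerplate to remember, which rather reinforces the point.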


Wrap up

The issues I have seen in code reviews are that people are happy to use forEach, map and other higher order functions but they then muddy the water by not following the intent of the functions: introducing side effects in functions that should be side effect free, or adding an accumulator to a function that should be purely about side effects. I've even seen multiple scenarios where people use a map and also have an accumulator for collecting different bits. There is almost always a more appropriate API to prevent this kind of misuse.

In order to take advantage of the higher level functions we should make sure that we follow the rules around side effects and single responsibility. The aim should be to use these functions to clarify intent. The intent becomes confusing if you use a map and also cause side effects, or if you use forEach whilst modifying an accumulator. If we stick to the rules then it makes the task of reading code simpler as the functions behave in known ways and we don't need to become experts at recognising patterns.
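Here is a made up example of the kind of misuse I mean, followed by one way it could be untangled. The prices data and the variable names are invented for illustration.

```swift
let prices = [3, 6, 7, 9]

// Misuse: map is meant to transform, yet this closure also mutates outer state.
var total = 0
let labels = prices.map { price -> String in
    total += price
    return "#\(price)"
}

// Clearer: one call per intent, and no mutable state left lying around.
let cleanLabels = prices.map { "#\($0)" }
let cleanTotal = prices.reduce(0, +)

print(cleanLabels)
print(cleanTotal)
```

Both versions compute the same values, but in the second each line can be read on its own and the function names tell you what kind of result to expect.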

This post is in no way advocating that you go and change every iteration to use higher order functions but I do hope that you'll be able to know the tradeoffs and decide when it is appropriate to use each tool. My real aim with this post is to allow people who have read it to have a deeper appreciation of the tradeoffs around different iteration techniques and to allow for a shared base knowledge to build ideas upon.

Swift Partially Applied Functions

TL;DR

Partially applied functions are really helpful and you've probably used them without thinking about it. If you've ever used a higher order function like map or forEach there is a good chance you have been using partially applied functions. This post is a little exploration into what partially applied functions are, how you may have been using them and some ways of using them going forwards.


What is partial application?

Wikipedia has this to say:

... partial application (or partial function application) refers to the process of fixing a number of arguments to a function, producing another function of smaller arity.

*arity is a fancy way of saying "number of arguments a function takes".


Example of partial application.

A small example might be helpful here. If I need to update multiple cells in a UICollectionView that are all within section 3 then I would create the IndexPaths with something like:

let idsToUpdate = [ 3, 6, 7, 9 ]
let indexPaths = idsToUpdate.map { IndexPath(item: $0, section: 3) }
// => [[3, 3], [3, 6], [3, 7], [3, 9]]

I expect most people have written code like the above and not realised that there is a partially applied function hiding in plain sight. Let's make the partially applied function a little more visible by pulling it up a level

let indexPathInSection3: (Int) -> IndexPath = { item in 
  IndexPath(item: item, section: 3) 
}

let idsToUpdate = [ 3, 6, 7, 9 ]
let indexPaths = idsToUpdate.map(indexPathInSection3)
// => [[3, 3], [3, 6], [3, 7], [3, 9]]

To frame this updated snippet in terms of the definition above - we are:

  • Taking the function IndexPath.init(item:section:)

    • It has a type of (Int, Int) -> IndexPath
    • Its arity is 2 (it takes 2 arguments)
  • We generate a new function indexPathInSection3 by fixing the argument section with the value 3

    • It has a type of (Int) -> IndexPath
    • Its arity is 1 (it takes 1 argument)

What does this mean?

The above proves that we've probably all been doing this without even realising it. The cool thing is now that we have thought about the concept and know what it's called we can build on this foundation.


Taking it further

It would be nice if there was a simple syntax to create partially applied functions. Ideally it would be a part of the Swift language, as other languages offer this facility and if we have things like @autoclosure I'm sure this could be built as well. With my imagination running I think the syntax wouldn't really vary much from what we use now to grab a reference to a function, with the only difference being that you supply some arguments e.g.

let indexPathInSection3 = IndexPath(item:, section: 3) // => (Int) -> IndexPath

This would allow my original example to end up being

let idsToUpdate = [ 3, 6, 7, 9 ]
let indexPaths = idsToUpdate.map(IndexPath(item:, section: 3))
// => [[3, 3], [3, 6], [3, 7], [3, 9]]

Wishing aside we can have some fun building out our own syntax using function overloading and a single cased enum. The basic idea here will be to create a higher order function that:

1) Takes the function we want to partially apply
2) Takes positional arguments
3) Returns a new function that takes the remaining arguments


To keep this small we'll look at the code required to partially apply a function that takes two arguments. We'll start with a function signature (formatted for easier reading)

func partiallyApply<Arg0, Arg1, Return>(
  _ function: (Arg0, Arg1) -> Return,
  ...
  ...
) -> ...

We start with a function called partiallyApply which has 3 placeholder types that represent the 2 input arguments and the return. The placeholder types allow us to work with a function that can have any combination of types for the arguments/return. The function to partially apply is the first argument and its signature is defined in terms of the placeholder types.


We now need to look at the positionally placed arguments but before we can do that we need to have some way to indicate which argument/s we are not fixing. A one cased enum should serve us well here

enum DeferredArgument {
    case `defer`
}

This enum is really just being used as a sentinel value as we can't use something like Optional because we won't be able to tell if the caller is fixing an argument to the value nil or if they are trying to omit the argument.


Now that we have a way to mark an argument as missing we can fill out the rest of the function signature

func partiallyApply<Arg0, Arg1, Return>(
  _ function: (Arg0, Arg1) -> Return,
  _ arg0: DeferredArgument,
  _ arg1: Arg1
) -> (Arg0) -> Return

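The function body we now need to write should return a new function that takes Arg0 as its input. Because the returned closure captures function and outlives the call to partiallyApply, the function parameter also has to be marked @escaping.

func partiallyApply<Arg0, Arg1, Return>(
  _ function: @escaping (Arg0, Arg1) -> Return,
  _ arg0: DeferredArgument,
  _ arg1: Arg1) -> (Arg0) -> Return {
    // arg1 is fixed now; the deferred first argument is supplied later as $0.
    return { function($0, arg1) }
}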

For this to be useful we also need to provide a function overload that makes it possible to mark arg1 as missing.

func partiallyApply<Arg0, Arg1, Return>(
  _ function: @escaping (Arg0, Arg1) -> Return,
  _ arg0: Arg0, 
  _ arg1: DeferredArgument) -> (Arg1) -> Return {
    return { function(arg0, $0) }
}

This gives us the ability to create partially applied functions and choose which argument is fixed - the callsites would look something like:

let square = partiallyApply(pow, .defer, 2)
square(2) // => 4
let raise2ToThePowerOf = partiallyApply(pow, Double(2), .defer)
raise2ToThePowerOf(16) // => 65536
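The sentinel enum also means that nil stays available as a genuine value to fix. A small sketch, where label is a hypothetical function invented for this example and only one of the two overloads is repeated so the listing stands alone:

```swift
enum DeferredArgument {
    case `defer`
}

// Same overload as above: fix the first argument, defer the second.
func partiallyApply<Arg0, Arg1, Return>(
  _ function: @escaping (Arg0, Arg1) -> Return,
  _ arg0: Arg0,
  _ arg1: DeferredArgument) -> (Arg1) -> Return {
    return { function(arg0, $0) }
}

// Hypothetical function whose first argument is legitimately optional.
func label(_ prefix: String?, _ value: Int) -> String {
    return (prefix ?? "") + "\(value)"
}

// nil is fixed as a real argument here; .defer marks the one left open.
let unprefixed = partiallyApply(label, nil, .defer)
let prefixed = partiallyApply(label, "id-", .defer)

print(unprefixed(7))
print(prefixed(7))
```

If Optional had been used to mean "not fixed yet" there would be no way to express the first of these two calls.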

Finally with these two functions written we can return to our first example and partially apply the creation of an IndexPath using our new helpers

let idsToUpdate = [ 3, 6, 7, 9 ]
let indexPaths = idsToUpdate.map(
    partiallyApply(IndexPath.init(item:section:), .defer, 3)
)
// => [[3, 3], [3, 6], [3, 7], [3, 9]]

Putting it all together in one listing we get

//: Playground - noun: a place where people can play

import UIKit

enum DeferredArgument {
    case `defer`
}

func partiallyApply<Arg0, Arg1, Return>(_ function: @escaping (Arg0, Arg1) -> Return, _ arg0: DeferredArgument, _ arg1: Arg1) -> (Arg0) -> Return {
    return { function($0, arg1) }
}

func partiallyApply<Arg0, Arg1, Return>(_ function: @escaping (Arg0, Arg1) -> Return, _ arg0: Arg0, _ arg1: DeferredArgument) -> (Arg1) -> Return {
    return { function(arg0, $0) }
}

let idsToUpdate = [ 3, 6, 7, 9 ]
let indexPaths = idsToUpdate.map(
    partiallyApply(IndexPath.init(item:section:), .defer, 3)
)

print(indexPaths)

Why bother with all this?

To be honest I'm not trying to sell this too hard because I am just showing something I found interesting. This would be much more useful/convenient for me if it was part of the language and the syntax was kept roughly the same as just grabbing a function reference (as I showed above).

I do believe that abstractions can be powerful but if they are too complex to comprehend then the tradeoffs will not be worth it. In this case the benefit of using a helper function would be that it's super explicit what is going on. In a normal code review I would need to really pay attention with the following code

let idsToUpdate = [ 3, 6, 7, 9 ]
let indexPaths = idsToUpdate.map { IndexPath(item: $0, section: 3) }
// => [[3, 3], [3, 6], [3, 7], [3, 9]]

I would need to make sure that the closure is correctly formed. If the closure is multiple lines long then I would need to mentally parse the whole thing to check for side effects or logic errors. Whereas with the helper function I can relax a little bit and know that there is no funny business going on in the closure and it would be easier to reason about the expected behaviour.

Conclusion

In the above I've just given a label to something that we probably all do without thinking about. If Swift had this as a language feature it would lower the barrier to entry and hopefully allow people to partially apply more often and in safer ways than hand rolling a closure every time when a simple fixing of arguments is all that is required.