Handling bad input with Decodable

When consuming data from an external service we can’t always guarantee that the source data will be structured how we expect. This makes our apps potentially brittle when using Decodable because it is so strict about structure. In general everything needs to decode successfully to get results back; if any object fails to decode then you get no results at all. This is not ideal for our end users: we should fail gracefully by showing the content that our app can parse rather than showing empty data screens.


Problem outline

In the following code we have a data type called Person that has two non-optional properties. We also have some input data in the form of a JSON array that contains two objects. The first object has both of the required properties whereas the second does not.

import Foundation

struct Person: Codable {
    let name: String
    let favouriteAnimal: String
}

let data = Data("""
[
    {
        "name" : "Paul Samuels",
        "favouriteAnimal" : "Dog"
    },
    {
        "name" : "Elliot Samuels"
    }
]
""".utf8)

do {
    _ = try JSONDecoder().decode([Person].self, from: data)
} catch {
    print(error)
}

When executing the above code we do not get an array with one valid Person instance. Instead we end up in the catch statement with a keyNotFound error, which is caused by the second object in the JSON array not having both of the required properties.


How do we fix this?

This is actually not as easy as you might think (at least I didn’t think it was).

The first problem to solve is that the default behaviour when decoding a collection is to throw if decoding any of the collection’s children throws. If we decode a collection manually we can avoid this behaviour, but then we encounter the issue that the unkeyed container’s currentIndex will only progress forwards when a decode of a child object is successful. This means that if any child fails to decode, the container stays stuck on that child and we can’t continue iterating through the rest of the collection.
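To make the stuck index concrete, here is a minimal sketch of decoding the array by hand. ManualPeople and its two index properties are hypothetical names used only for this illustration, and the behaviour shown is what we should expect from Foundation’s JSONDecoder rather than a documented guarantee.

```swift
import Foundation

struct Person: Decodable {
    let name: String
    let favouriteAnimal: String
}

// Hypothetical type, for illustration only: decodes the array by hand and
// records where the container's index sits before and after a failure.
struct ManualPeople: Decodable {
    var people = [Person]()
    var indexBeforeFailure: Int?
    var indexAfterFailure: Int?

    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()

        while !container.isAtEnd {
            let index = container.currentIndex
            do {
                people.append(try container.decode(Person.self))
            } catch {
                indexBeforeFailure = index
                indexAfterFailure = container.currentIndex
                // Without this break the loop would keep retrying the same element.
                break
            }
        }
    }
}

let manualData = Data("""
[
    { "name" : "Paul Samuels", "favouriteAnimal" : "Dog" },
    { "name" : "Elliot Samuels" }
]
""".utf8)

let manual = try JSONDecoder().decode(ManualPeople.self, from: manualData)
print(manual.people.count, manual.indexBeforeFailure as Any, manual.indexAfterFailure as Any)
```

The two recorded indices should be the same, which is why we have to break out of the loop and give up on everything after the first bad element.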

There are a few strategies that we could take:

1) Loosen the strictness of our type
This seems like a terrible idea to me. Our types should model our domain accurately, e.g. if our app can’t handle a property not being present then we shouldn’t model it as an optional. I try to get rid of optionality as soon as possible, otherwise optional handling code ends up leaking throughout the entire codebase.

2) Create a private type that is less strict
This is an improvement on the above because we are restricting the scope of the more permissive type. Internally we might parse a private _Person type that has the optionality and then optionally convert this back to our Person type.

3) Create a wrapper to swallow any errors thrown during parsing
This is fairly similar to option 2 in that we are creating a wrapper but it’s more generic as we won’t need to manually hand roll these private types.


Option 2

I’m not bothering with option 1 as it’s a weak option so I’m jumping straight to option 2.

Let’s start by creating the more permissive variant of Person where the properties are now optional:

struct _Person: Decodable {
    let name: String?
    let favouriteAnimal: String?
}

When using the [_Person].self type the decoding will no longer throw with our input data. Next we need to convert _Person instances into our stronger Person type - we’ll add a new init to do the optional conversion:

extension Person {
    init?(person: _Person) {
        guard
            let name = person.name,
            let favouriteAnimal = person.favouriteAnimal else {
                return nil
        }

        self.name            = name
        self.favouriteAnimal = favouriteAnimal
    }
}

With this scaffolding in place our call to decode the JSON now looks like this:

try JSONDecoder().decode([_Person].self, from: data).compactMap(Person.init(person:))

It’s not particularly pretty but it’s now gracefully parsing as much of the input data as possible and discarding the bits that are not usable.


Option 3

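This option comes from this answer on Stack Overflow. The high level idea is to wrap the type we want to decode with something that will successfully decode regardless of whether the wrapped type succeeds or not. Because the wrapper itself decodes successfully, the collection keeps advancing to the next element even when the wrapped decode fails.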

The code to do this looks like the following:

struct FailableDecodable<Base: Decodable>: Decodable {
    let base: Base?

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        self.base     = try? container.decode(Base.self)
    }
}

With this in place our decoding changes slightly to:

- try JSONDecoder().decode([_Person].self, from: data).compactMap(Person.init(person:))
+ try JSONDecoder().decode([FailableDecodable<Person>].self, from: data).compactMap { $0.base }

I think Option 3 is the stronger option here because it requires less code and less duplication of types.
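As a quick check that elements after a bad one survive, here is a self-contained sketch that repeats Person and FailableDecodable from above. The input is hypothetical: it puts the invalid object between two valid ones, and the third name is made up for the example.

```swift
import Foundation

struct Person: Decodable {
    let name: String
    let favouriteAnimal: String
}

struct FailableDecodable<Base: Decodable>: Decodable {
    let base: Base?

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // A failed decode becomes nil instead of aborting the whole array.
        self.base = try? container.decode(Base.self)
    }
}

// Hypothetical input: the second object is missing "favouriteAnimal".
let mixedData = Data("""
[
    { "name" : "Paul Samuels", "favouriteAnimal" : "Dog" },
    { "name" : "Elliot Samuels" },
    { "name" : "Example Person", "favouriteAnimal" : "Cat" }
]
""".utf8)

let people = try JSONDecoder()
    .decode([FailableDecodable<Person>].self, from: mixedData)
    .compactMap { $0.base }

print(people.map { $0.name })
```

Only the object in the middle is dropped, so the result keeps the order of the valid entries.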


How do we debug?

We’ve made our app more resilient to bad input but we’ve introduced an issue. How on earth do we debug this now when stuff goes wrong?

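In general if the backend feed is providing 10 objects then our app should be able to parse and handle all 10 of those objects. If the app is silently discarding data then this could be seen as both good and bad: good because the user still sees the content we could parse, bad because nobody finds out that the feed and the app disagree.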

The silencing of errors is occurring due to the try? being used in FailableDecodable. What we need to do is capture the errors rather than just discarding them.

Again this is another case where it’s not immediately obvious how we can resolve the problem. This is still thinking in progress, but here’s one potential way:

Let’s start by adding context to our Decoder. A Decoder has a userInfo property that seems like it will be ideal. We’ll add a class at a known key that can hold our errors collection (it needs to be a class so that the errors collection can be mutated).

The following code performs the above but in a slightly more involved way to remove the stringly typed aspect of storing things in a Dictionary:

class DecoderDebugContext {
    var errors = [Error]()
}

private let decoderDebugContextKey = CodingUserInfoKey(rawValue: "com.paul-samuels.decoder-debug-context")!

extension JSONDecoder {
    var debugContext: DecoderDebugContext? {
        return userInfo[decoderDebugContextKey] as? DecoderDebugContext
    }

    var debugContextEnabled: Bool {
        get { return debugContext != nil }
        set {
            if newValue {
                userInfo[decoderDebugContextKey] = debugContext ?? DecoderDebugContext()
            } else {
                userInfo[decoderDebugContextKey] = nil
            }
        }
    }
}

extension Decoder {
    var debugContext: DecoderDebugContext? {
        return userInfo[decoderDebugContextKey] as? DecoderDebugContext
    }

    var debugContextEnabled: Bool {
        return debugContext != nil
    }
}

Now that we have this scaffolding in place we need to teach FailableDecodable how to use this stuff. This is essentially removing the try? and expanding it into a full try/catch and using the error:

struct FailableDecodable<Base: Decodable>: Decodable {
    let base: Base?

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()

        do {
            self.base = try container.decode(Base.self)
        } catch {
            decoder.debugContext?.errors.append(error)
            self.base = nil
        }
    }
}

Pulling all of this together we can now parse our data in a fail-safe way and get both the results and errors (when we want them). To get the errors we need to configure the decoder before using it like this:

let decoder = JSONDecoder()
decoder.debugContextEnabled = true
let result = try decoder.decode([FailableDecodable<Person>].self, from: data).compactMap { $0.base }

print(result)
print(decoder.debugContext?.errors ?? [])
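Printing raw errors is fairly noisy, so it can help to turn each one into a short log line. The following is a rough sketch rather than a definitive implementation: describe(_:) is a hypothetical helper that is not part of Foundation, and the demo feeds it an error from a plain decode so that the snippet stands on its own.

```swift
import Foundation

// Hypothetical helper: turns a decoding error into a one-line log message.
func describe(_ error: Error) -> String {
    guard let decodingError = error as? DecodingError else {
        return "unexpected error: \(error)"
    }

    // Render the coding path as its keys joined with dots.
    func path(_ context: DecodingError.Context) -> String {
        let joined = context.codingPath.map { $0.stringValue }.joined(separator: ".")
        return joined.isEmpty ? "<root>" : joined
    }

    switch decodingError {
    case .keyNotFound(let key, let context):
        return "missing key '\(key.stringValue)' at \(path(context))"
    case .valueNotFound(let type, let context):
        return "missing value of type \(type) at \(path(context))"
    case .typeMismatch(let type, let context):
        return "expected \(type) at \(path(context))"
    case .dataCorrupted(let context):
        return "corrupted data at \(path(context)): \(context.debugDescription)"
    @unknown default:
        return "unknown decoding error: \(decodingError)"
    }
}

struct Person: Decodable {
    let name: String
    let favouriteAnimal: String
}

let badData = Data(#"[{ "name" : "Elliot Samuels" }]"#.utf8)
var messages = [String]()

do {
    _ = try JSONDecoder().decode([Person].self, from: badData)
} catch {
    messages.append(describe(error))
}

print(messages)
```

With the debug context in place the same helper could be applied to the captured errors, e.g. decoder.debugContext?.errors.map(describe).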

Conclusion

In this post we’ve looked at some motivations for wanting to parse data in a more permissive way and how you can achieve it using Decodable. Some of these things are not very obvious but once you get your head around how it works you can start to see further ways to improve debugging.


Bonus debugging fun

Getting both the results and any errors that were generated whilst creating the results is great, but wouldn’t it be even better if we could also capture the raw data? This sounds a little daft because we should have access to the raw data if we are decoding it. That is true, but in my experience people tend to chuck raw data into a decode call and leave it at that. By doing this we are losing the potential context of the raw data because we don’t parse/log it anywhere.

We can fix this problem by creating a new generic type that will decode what we want but also optionally grab the raw data as well. This example makes use of the AnyDecodable type from Flight School’s AnyCodable library.

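import AnyCodable // Flight School's package, provides AnyDecodable

enum DebugDecodable<T: Decodable>: Decodable {
    case debug(AnyDecodable, T)
    case simple(T)

    init(from decoder: Decoder) throws {
        // Decode the strongly typed value first so that failures still throw.
        let base = try decoder.singleValueContainer().decode(T.self)
        if decoder.debugContextEnabled {
            // Decode the same input a second time as an untyped value to keep the raw data.
            self = .debug(try decoder.singleValueContainer().decode(AnyDecodable.self), base)
        } else {
            self = .simple(base)
        }
    }
}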

This is now super powerful as we have access to the raw data, the parsed data and any errors that occurred when throwing data away.

let decoder = JSONDecoder()
decoder.debugContextEnabled = true
let result = try decoder.decode(DebugDecodable<[FailableDecodable<Person>]>.self, from: data)

print(result)
print(decoder.debugContext?.errors ?? [])