Encoding and decoding polymorphic objects in Swift

29 May 2018

Having produced large bits of complicated software, I’m a fan of the strictness that languages like Swift and Go enforce on you, but at the same time I do enjoy using the dynamic features of programming languages to let me write less code, and at times these two ideals rub up against each other. This post is mostly a way for me to brain dump some of the bits of code I’ve had to reason about recently as I reconcile these two conflicting ideals: hopefully this will save someone else some effort if they’re trying to do similar things, or people can perhaps point out alternative, easier ways. A large part of this was inspired by this Stack Overflow answer by Hamish Knight.

Prologue: Starting simple by dealing with variable decoding in Swift.

As a warm up to the main topic, let’s look at a simpler problem in the same domain: decoding API responses from a service where the JSON structure may change depending on what you asked and whether you get an error back etc. In a dynamic language like Objective-C or Python this is relatively easy, as you’ll decode the input to an array/dictionary of elemental types before then examining that value to work out what it conforms to. But in Swift and Go this approach won’t work, as both want to decode JSON to a specific type of structure (a struct in Go, a struct/class/enum in Swift) with specific properties known ahead of time. This is all good if your API only ever returns a single form of JSON, but I’ve yet to see an API that doesn’t somewhere have quite different structures of JSON at some point (usually the difference between a successful and an unsuccessful result).

So, how do you deal with this? You could define a type in your code that is the superset of all possible responses and just decode into that and then deal with any optional values that aren’t set, but I’d strongly recommend not doing that - we’re using strongly typed languages for a reason, and this is just trying to escape that. What you should do is define a structure for each possible response, and then try decoding into each one in turn every time you get a response. For example, in Swift we would do:

import Foundation

struct ValidResponse1: Decodable {
    let name: String
    let count: Int
}

struct ErrorResponse: Decodable {
    let message: String
}

let jsonString = """
{"message": "Things went wrong"}
"""
let jsonData = jsonString.data(using: .utf8)!
let decoder = JSONDecoder()
do {
    let resp = try decoder.decode(ValidResponse1.self, from: jsonData)
    // process resp
} catch {
    do {
        let resp = try decoder.decode(ErrorResponse.self, from: jsonData)
        // process resp
    } catch {
        // do any more in here or handle unknown response
    }
}

Actually, although that’s similar to what I’d write in Go, in Swift we can hide this mess by defining an enum and implementing a custom decoding initialiser on that. This then leaves your top level code a lot nicer:

enum APIResponse: Decodable {
    case ValidResponse(ValidResponse1)
    // insert other response structs here
    case Error(ErrorResponse)

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        do {
            let res = try container.decode(ValidResponse1.self)
            self = .ValidResponse(res)
            return
        } catch {}
        // If you had more API Response types add more do/catch blocks here as above.

        // Let the final decode attempt propagate its error
        let res = try container.decode(ErrorResponse.self)
        self = .Error(res)
    }
}

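do {
    let resp = try decoder.decode(APIResponse.self, from: jsonData)
    switch resp {
    case .ValidResponse(let valid):
        // process the successful response
        print("Got \(valid.count) of \(valid.name)")
    case .Error(let failure):
        // process the error
        print("API error: \(failure.message)")
    }
} catch {
    // handle unknown response here
}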

This is the solution I used to talk to the Docker API in my little Stevedore application. As you can see, hiding all the decoding code in the enum’s decodable initialiser makes for a nice and simple path in your main code logic. But the other thing to note here is the pattern whereby we coerced our many response types into a single container type and then used that in deserialisation. This is a pattern that we’ll find another use for in our main topic.
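As a self-contained illustration of the enum approach, here is a minimal sketch you can paste into a playground. The names ItemResponse, FailureResponse, ServiceResponse and the describe helper are made up for this example rather than taken from Stevedore, and I’ve used try? where the code above uses an empty catch block.

```swift
import Foundation

// Hypothetical payload types, mirroring the success and error structs above.
struct ItemResponse: Decodable {
    let name: String
    let count: Int
}

struct FailureResponse: Decodable {
    let message: String
}

enum ServiceResponse: Decodable {
    case item(ItemResponse)
    case failure(FailureResponse)

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // Try the success shape first; a failed attempt just yields nil.
        if let item = try? container.decode(ItemResponse.self) {
            self = .item(item)
            return
        }
        // Let the final attempt propagate its error to the caller.
        self = .failure(try container.decode(FailureResponse.self))
    }
}

// Hypothetical helper: turns a JSON string into a short description.
func describe(_ json: String) -> String {
    do {
        let response = try JSONDecoder().decode(ServiceResponse.self, from: Data(json.utf8))
        switch response {
        case .item(let item):
            return "item: \(item.name) x\(item.count)"
        case .failure(let failure):
            return "failure: \(failure.message)"
        }
    } catch {
        return "unknown"
    }
}
```

One thing to watch, given that the synthesised Decodable implementations ignore keys they don’t know about: order the attempts from most specific to least specific, as a payload with extra fields will still decode happily into a struct that only needs a subset of them.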

Chapter 1: Building our app model using polymorphism

Nothing scary in this section: I’m just going to set the scene for what is to come. I’ve been playing around with AudioKit, looking at how to build simple effects chains to let me build up more interesting audio effects by composition. The aim here is just to try and understand what some of the elemental audio effects do to a guitar signal, and how each one impacts the sound.

To implement this I use a normal polymorphism pattern where I define a base type of effect that all specific effect classes will inherit from. This being Swift, rather than use class inheritance I’m using protocols, but you could just use a superclass and subclasses to achieve a similar result if you needed to for some reason.

protocol Effect {
    var name: String { get set }
    var active: Bool { get set }

    func doEffect(_ s: SoundSample) -> Void
}

struct ReverbEffect: Effect {
    var name = "Reverb"
    var active = true
    var delay = 3.2
    var feedback = 40

    func doEffect(_ s: SoundSample) -> Void { /* some custom reverb code here */ }
}

struct DistortionEffect: Effect {
    var name = "Distortion"
    var active = true
    var gain = 1.5
    var tone = 6

    func doEffect(_ s: SoundSample) -> Void { /* some custom distortion code here */ }
}

// Add another dozen or so similar effect structs here…

Having built up my effect library I’m interested in building up chains of individual audio effects to make something interesting. A dumb version of this code will look like:

let effectsChain: [Effect] = [DistortionEffect(), ReverbEffect()]

let sample = GetSoundSample()
for effect in effectsChain {
    effect.doEffect(sample)
}
sample.play()

So we get some audio, run it through the sequence of effect processors, and play the sample out. This is why the polymorphic approach is appealing here: we don’t care which effects are in our chain, we just call the same protocol on them and we’re done.

In actuality, those structs have to be classes in my actual application, as internally each one has a reference to an AudioKit class object. A struct can technically store a reference to an object, but copies of the struct would then silently share that object, so in general if your type wraps a reference like this it makes sense to define it as a class rather than a struct.
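To make the struct versus class point concrete, here is a small sketch (Engine and StructEffect are made-up stand-ins, not real AudioKit types). A struct can hold a reference, but copying the struct copies only the reference, so both copies end up driving the same underlying object. That hidden sharing is a good reason to make the wrapper a class and be honest about its reference semantics.

```swift
// Hypothetical stand-in for an AudioKit node object.
final class Engine {
    var level = 0
}

// A struct that wraps a reference to the engine.
struct StructEffect {
    let engine = Engine()
}

let first = StructEffect()
let second = first        // copies the struct, but not the Engine it points at

second.engine.level = 5   // mutates the one shared Engine

// Both "copies" now see the change, because they share the same object.
let sharedLevel = first.engine.level
let sameObject = first.engine === second.engine
```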

Chapter 2: Trying to save our effects chain

Having built up a nice sounding chain, the next thing I want to do is save it so that the next time I load my application I can reload it. The instinctive thing to do is, similar to the example in our prologue, just slap Codable onto the protocol, let that get picked up by the structs, and away we go.

…
protocol Effect: Codable {
…

// Save our chain as JSON for saving and restoring
do {
    let jsonData = try JSONEncoder().encode(effectsChain)
} catch {
    // process encoding errors here
}

But if we try to compile that we get the following:

% swift main.swift
main.swift:45:36: error: generic parameter 'T' could not be inferred
    let jsonData = try jsonEncoder.encode(effectsChain)

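The problem here is that although effectsChain contains structs that each conform to Codable, the array’s element type is the protocol type Effect, and in Swift a protocol type does not itself conform to the protocols it inherits from. As far as the encoder is concerned you’ve passed it an array of different structs of different types with no common encodable ancestor, so it can’t work out the generic parameter T. You can get a similar error if you try the following: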

let v: [Any] = ["hello", 42]
let jsonData = try JSONEncoder().encode(v)

If we’d actually used class inheritance rather than a protocol here, encoding would have worked, as the encoder would have had enough information to work with. But that would be a false sense of achievement! Because if you use classes it will encode okay, but loading things back in will fail, as you just don’t have enough information here in the type you’re encoding to say which concrete classes you should decode to - they’ll all end up as the base class (you can see this here in this gist). If you run the gist you will get:

We restored the chain: [main.Effect, main.Effect]

This is not what we want at all! To solve both of these problems we’re going to have to use a similar pattern to the one we did in the prologue of tidying all our concrete instances under a single type that we can explicitly encode and decode and be aware of the differences between our individual effects.
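For anyone who wants to see the class-based failure without opening the gist, here is a cut-down sketch of the same effect. BaseEffect, EchoEffect and restoredTypeNames are names I’ve invented for this illustration; the gist itself may be structured differently.

```swift
import Foundation

// Hypothetical class hierarchy standing in for the effects.
class BaseEffect: Codable {
    var name: String
    init(name: String) { self.name = name }
}

final class EchoEffect: BaseEffect {
    init() { super.init(name: "Echo") }

    // A subclass that declares its own designated initialiser has to
    // restate the required decoding initialiser.
    required init(from decoder: Decoder) throws {
        try super.init(from: decoder)
    }
}

// Encodes a mixed chain, decodes it again, and reports the restored types.
func restoredTypeNames() throws -> [String] {
    let chain: [BaseEffect] = [EchoEffect(), BaseEffect(name: "Plain")]
    let data = try JSONEncoder().encode(chain)

    // The decoder is only told about BaseEffect, so that is all it can build.
    let restored = try JSONDecoder().decode([BaseEffect].self, from: data)
    return restored.map { String(describing: type(of: $0)) }
}
```

Encoding succeeds, but every restored element comes back as the base class, because nothing in the JSON or in the requested type says which subclass to build.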

Chapter 3: Creating a collective type for encoding and decoding

Whilst we could use an enum type to wrap our effect implementations as we did in the prologue to solve this, we’d then lose the advantage of being able to just call “doEffect” in the rest of our code (and all the other methods on the protocol that I’ve glossed over to keep the sample code short). Instead we’ll use an enum to help with the type mapping, but our top level wrapper will be a regular struct so we can keep our polymorphic behaviour in the rest of our code without adding switch statements everywhere.

The first step is to create a codable enum that enumerates, as strings, all the concrete types that we’ll want to encode/decode. This will also have a computed property on it that returns the actual concrete type for the value stored in the enum.

enum EffectType : String, Codable {
    case reverb, distortion

    var metatype: Effect.Type {
        switch self {
        case .reverb:
            return ReverbEffect.self
        case .distortion:
            return DistortionEffect.self
        }
    }
}

At some point in our code we were going to have this switch statement, and this is where we’ve hidden it, so that in the rest of our code we don’t need to see it. We can then add a type property to our protocol and have each struct include a suitably initialised value for it:

protocol Effect: Codable {
    var type: EffectType { get }
    …
}

struct ReverbEffect: Effect {
    let type = EffectType.reverb
    …
}

struct DistortionEffect: Effect {
    let type = EffectType.distortion
    …
}
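Because EffectType has String raw values, the compiler uses each case name as its raw value and the synthesised Codable conformance writes that string into the JSON. The following sketch uses a trimmed copy of the enum (without the metatype property) plus two helper functions I’ve made up, encodedTypes and decodeTypes, to show what ends up in the file and what happens when the file names a type we don’t know about.

```swift
import Foundation

// Trimmed copy of the enum: the metatype property is left out for brevity.
enum EffectType: String, Codable {
    case reverb, distortion
}

// Hypothetical helper: encodes a list of types and returns the JSON text.
func encodedTypes(_ types: [EffectType]) throws -> String {
    let data = try JSONEncoder().encode(types)
    return String(decoding: data, as: UTF8.self)
}

// Hypothetical helper: returns nil if any entry is not a known case.
func decodeTypes(_ json: String) -> [EffectType]? {
    return try? JSONDecoder().decode([EffectType].self, from: Data(json.utf8))
}
```

So a saved chain that mentions an effect this build doesn’t know about fails at decode time rather than quietly producing the wrong effect, which is the behaviour I’d want.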

Now we have everything we need to make a simple wrapper structure that will contain a single effect and then encode it along with the type information so that it can be uniquely decoded back to the correct type:

struct EffectWrapper {
    var effect: Effect
}

extension EffectWrapper: Codable {
    private enum CodingKeys: CodingKey {
        case type, effect
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(EffectType.self, forKey: .type)
        self.effect = try type.metatype.init(from: container.superDecoder(forKey: .effect))
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(effect.type, forKey: .type)
        try effect.encode(to: container.superEncoder(forKey: .effect))
    }
}

Note we use an extension here to add the codable functionality so that the init(from decoder: Decoder) method doesn’t replace the default memberwise initialiser, otherwise we’d have to define init(effect:) ourselves rather than just letting the compiler do that.
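The initialiser rule is easy to check in isolation. In this sketch (InBody and InExtension are throwaway names for illustration) the struct that declares an initialiser in its main body loses the compiler-generated memberwise one, while the struct that declares it in an extension keeps both.

```swift
// Declaring an initialiser in the struct body suppresses init(value:).
struct InBody {
    var value: Int

    init(doubling v: Int) {
        self.value = v * 2
    }
}

// Declaring the same initialiser in an extension leaves init(value:) intact.
struct InExtension {
    var value: Int
}

extension InExtension {
    init(doubling v: Int) {
        self.init(value: v * 2)
    }
}

let viaMemberwise = InExtension(value: 3)
let viaExtension = InExtension(doubling: 3)
let viaBody = InBody(doubling: 3)
// InBody(value: 3) would not compile: the memberwise initialiser is gone.
```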

With that done, I can now happily encode and decode my objects like so:

// Save our chain as JSON for saving and restoring
do {
    let wrappedChain: [EffectWrapper] = effectsChain.map{EffectWrapper(effect:$0)}
    let jsonData = try JSONEncoder().encode(wrappedChain)
    let jsonString = String(data: jsonData, encoding: .utf8)

    if let json = jsonString {
        print(json)
    }

    // now restore
    let newChain = try JSONDecoder().decode([EffectWrapper].self, from:jsonData)
    print("We restored the chain: \(newChain)")
} catch {
    // handle errors
}

The full code for this example is in this gist, and if you run it you’ll see something like:

% swift main.swift

[{"type":"reverb","effect":{"active":true,"delay":3.2000000000000002,"type":"reverb","name":"Reverb","feedback":40}},{"type":"distortion","effect":{"tone":6,"active":true,"type":"distortion","name":"Distortion","gain":1.5}}]

We restored the chain: [main.EffectWrapper(effect: main.ReverbEffect(type: main.EffectType.reverb, name: "Reverb", active: true, delay: 3.2000000000000002, feedback: 40)), main.EffectWrapper(effect: main.DistortionEffect(type: main.EffectType.distortion, name: "Distortion", active: true, gain: 1.5, tone: 6))]

Now we can happily save and restore our effects chain, and all the type strictness is handled away from the main application logic.
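The restored value above is still an array of wrappers, so the last step is to unwrap it back into a plain [Effect]. Here is a condensed, self-contained sketch of the whole round trip. The saveChain and loadChain helpers are my own naming for this example, the effect structs are cut down to a couple of properties, and I’ve declared type as a var so the synthesised decoder can assign it.

```swift
import Foundation

protocol Effect: Codable {
    var type: EffectType { get }
    var name: String { get set }
}

enum EffectType: String, Codable {
    case reverb, distortion

    var metatype: Effect.Type {
        switch self {
        case .reverb: return ReverbEffect.self
        case .distortion: return DistortionEffect.self
        }
    }
}

struct ReverbEffect: Effect {
    var type = EffectType.reverb
    var name = "Reverb"
    var delay = 3.2
}

struct DistortionEffect: Effect {
    var type = EffectType.distortion
    var name = "Distortion"
    var gain = 1.5
}

struct EffectWrapper {
    var effect: Effect
}

extension EffectWrapper: Codable {
    private enum CodingKeys: CodingKey {
        case type, effect
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // Read the type tag first, then ask that concrete type to decode itself.
        let type = try container.decode(EffectType.self, forKey: .type)
        self.effect = try type.metatype.init(from: container.superDecoder(forKey: .effect))
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(effect.type, forKey: .type)
        try effect.encode(to: container.superEncoder(forKey: .effect))
    }
}

// Hypothetical helpers that hide the wrapping and unwrapping from callers.
func saveChain(_ chain: [Effect]) throws -> Data {
    return try JSONEncoder().encode(chain.map { EffectWrapper(effect: $0) })
}

func loadChain(_ data: Data) throws -> [Effect] {
    return try JSONDecoder().decode([EffectWrapper].self, from: data).map { $0.effect }
}
```

With helpers along these lines, the rest of the application only ever deals in [Effect] and never needs to know the wrapper exists.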

Epilogue: Closing comments

For those of us used to Objective-C’s secure coding, this seems a lot more verbose, but that’s the flip side of having a stricter language with limited introspection capabilities. My main gripe about this approach is that I have to keep the EffectType enum up to date as I add new effects. Because the protocol forces me to define the type property in each new struct I’m unlikely to forget to do that, though it is susceptible to copy/paste errors: say I end up with a new Flanger effect struct that I leave with the type property set to reverb, and the encoded properties aren’t the same. That won’t get detected by the compiler and will instead blow up at run time, which is sad. But this is why you have unit tests, I guess.
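One possible shape for such a unit test, as a sketch rather than anything from my actual project: compare each effect’s dynamic type with the metatype its type property maps to. The hasConsistentType function and the deliberately broken FlangerEffect are hypothetical, and the effect structs are trimmed to the bare minimum.

```swift
import Foundation

protocol Effect: Codable {
    var type: EffectType { get }
}

enum EffectType: String, Codable {
    case reverb, distortion

    var metatype: Effect.Type {
        switch self {
        case .reverb: return ReverbEffect.self
        case .distortion: return DistortionEffect.self
        }
    }
}

struct ReverbEffect: Effect {
    var type = EffectType.reverb
    var delay = 3.2
}

struct DistortionEffect: Effect {
    var type = EffectType.distortion
    var gain = 1.5
}

// Hypothetical copy/paste mistake: a new effect still tagged as reverb.
struct FlangerEffect: Effect {
    var type = EffectType.reverb
    var depth = 0.5
}

// True when the effect's own concrete type matches what its tag decodes to.
func hasConsistentType(_ effect: Effect) -> Bool {
    return type(of: effect) == effect.type.metatype
}
```

Running a check like this over one instance of every effect in a test target would catch the mislabelled Flanger before it ever reached a saved file.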

Tech notes by Michael Winston Dales
