Airmail Journal #3

Sync Race Conditions

It's been a while since I've posted about Airmail, but I have been working on it whenever I find time outside of my usual contracting work. Since my early thoughts I've changed the design somewhat: I'm now using two NoSQL databases, MongoDB and CouchDB, instead of a traditional RDBMS. MongoDB handles authentication and payments, while CouchDB handles user data.

Why CouchDB?

CouchDB is something I've used before in my contracting work, and it is particularly useful when it comes to syncing data. It has built-in support for revision identifiers, change sequences, and persistent HTTP connections. The main advantage here is that my own API can just provide a lightweight wrapper around CouchDB's HTTP API for syncing changes.
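To make the change-sequence idea concrete, here is a minimal sketch of the kind of request such a wrapper could forward to CouchDB's `_changes` feed. The host, the `userdata` database name, and the `changes_url` helper are assumptions for illustration, not Airmail's actual API.

```python
from urllib.parse import urlencode


def changes_url(base_url, database, since="0", timeout_ms=30000):
    """Build a long-poll request URL for CouchDB's _changes feed."""
    query = urlencode({
        "feed": "longpoll",      # hold the connection open until something changes
        "since": since,          # last change sequence this device has seen
        "include_docs": "true",  # return document bodies with each change
        "timeout": timeout_ms,   # milliseconds before CouchDB gives up waiting
    })
    return f"{base_url}/{database}/_changes?{query}"


# Hypothetical host and database name.
print(changes_url("http://localhost:5984", "userdata", since="42"))
# http://localhost:5984/userdata/_changes?feed=longpoll&since=42&include_docs=true&timeout=30000
```

A device would store the last sequence it received and pass it back as `since` on the next sync, so only newer changes come down.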

Local Changes

The UI on the device only ever refers to its own persistent store, but there are classes that listen for changes to the data on the device and cache them in a separate file until the next sync operation takes place. I call this the Local Changes cache. I'm using Core Data, so the Local Changes cache listens for the NSManagedObjectContextDidSaveNotification and processes the objects that have been changed.

The Local Changes cache is intelligent enough to combine multiple changes to the same object that happen while the device is offline, and send them to the server as one change. It also means that I don't have to query the database for changes; instead, I can just keep this cache in memory.
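As a rough illustration of that combining step, here is a sketch in Python rather than the app's Core Data code. The `LocalChangesCache` class and its method names are hypothetical; the only idea taken from the description above is that repeated edits to the same attribute of the same object collapse into a single pending change.

```python
class LocalChangesCache:
    """Hypothetical in-memory cache keeping one pending change per attribute."""

    def __init__(self):
        # (entity, object id) -> attribute -> (value, timestamp)
        self._pending = {}

    def record(self, entity, object_id, attribute, value, timestamp):
        attributes = self._pending.setdefault((entity, object_id), {})
        current = attributes.get(attribute)
        # Keep only the newest edit for each attribute.
        if current is None or timestamp >= current[1]:
            attributes[attribute] = (value, timestamp)

    def changes_for(self, entity, object_id):
        attributes = self._pending.get((entity, object_id), {})
        return [
            {"attribute": a, "value": v, "timestamp": t}
            for a, (v, t) in attributes.items()
        ]


cache = LocalChangesCache()
envelope = "17bb2c1c-313d-41a6-ae31-c90785155687"
cache.record("envelopes", envelope, "name", "Holidays", 1423083400)
cache.record("envelopes", envelope, "name", "A New Name", 1423083473)
print(cache.changes_for("envelopes", envelope))
# [{'attribute': 'name', 'value': 'A New Name', 'timestamp': 1423083473}]
```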

In simplified form, the Local Changes cache might look like this:

{
    "envelopes": [{
        "id": "17bb2c1c-313d-41a6-ae31-c90785155687",
        "changes": [{
            "attribute": "name",
            "value": "A New Name",
            "timestamp": 1423083473
        }]
    }, {
        "id": "0cfd5804-d92d-47fa-b3c0-3845835dee25",
        "changes": [{
            "attribute": "budget",
            "value": "50.00",
            "timestamp": 1423083525
        }]
    }],
    "transfers": [{
        "id": "23590686-48a0-4f85-b0e7-3f0606cb0244",
        "changes": [{
            "attribute": ...,
            "value": ...,
            "timestamp": ...
        }, {
            "attribute": ...,
            "value": ...,
            "timestamp": ...
        }]
    }]
}

Note that each change has a timestamp associated with it, which is used for conflict resolution on the server. This allows the conflict resolution to be attribute-specific, rather than whoever made the last change to the object simply winning.
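A minimal sketch of what attribute-level resolution could look like, assuming the server keeps the timestamp of the last accepted change for every attribute. That storage detail, the `resolve` function, and the rule that ties favour the stored value are all assumptions for illustration.

```python
def resolve(stored, incoming):
    """Merge per-attribute changes, keeping the newest timestamp for each one.

    Both arguments map attribute name -> (value, timestamp).
    """
    merged = dict(stored)
    for attribute, (value, timestamp) in incoming.items():
        current = merged.get(attribute)
        # Assumption: on a tie, the value already stored wins.
        if current is None or timestamp > current[1]:
            merged[attribute] = (value, timestamp)
    return merged


server = {"name": ("Holidays", 1423083000), "budget": ("50.00", 1423083525)}
device = {"name": ("A New Name", 1423083473), "budget": ("40.00", 1423083100)}
print(resolve(server, device))
# {'name': ('A New Name', 1423083473), 'budget': ('50.00', 1423083525)}
```

The device's newer name is accepted while its older budget is rejected, which whole-object last-write-wins could not do.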

I plan to go into the device sync engine in more detail in another post, but this description serves to provide context for the race condition in the following thought experiment.

The Race Condition

Suppose there are two people sharing data, John and Marie. Both of their devices are currently in Airplane Mode, and they both change the name of the same envelope, Holidays. John changes it to Holiday 2015 and Marie changes it to Family Holidays (Vacations for my US friends 😝). John's change is made before Marie's, so once a full sync completes, Family Holidays should be the winning change.

John disables Airplane Mode and his device updates the server to Holiday 2015. Marie subsequently disables Airplane Mode, and here the problem arises: once Marie's device downloads the changes from the server, her database will be updated to Holiday 2015, despite the fact that she has the winning change queued up in Local Changes. The Local Changes cache and the database are now out of sync, and when Marie's device pushes its own changes up to the server, Family Holidays will become the value on the server and on John's device, but not on her own.
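The sequence can be traced with a small simulation. This is only a sketch: the timestamps are made up, plain dictionaries stand in for the server and both device databases, and the download step is the naive one that writes straight to the database.

```python
# Hypothetical timestamps: John's edit happens before Marie's.
john_change = {"attribute": "name", "value": "Holiday 2015", "timestamp": 1423083473}
marie_change = {"attribute": "name", "value": "Family Holidays", "timestamp": 1423083525}

server = {"name": "Holidays"}
server_timestamps = {"name": 0}
john_db = {"name": "Holiday 2015"}      # John's edit is already in his store
marie_db = {"name": "Family Holidays"}  # Marie's edit is already in hers
marie_local_changes = [marie_change]    # and queued for upload

# 1. John comes online and pushes his change.
server[john_change["attribute"]] = john_change["value"]
server_timestamps[john_change["attribute"]] = john_change["timestamp"]

# 2. Marie comes online and downloads first; the naive write ignores her queue.
marie_db.update(server)

# 3. Marie pushes her queued change; the server accepts it because it is newer.
for change in marie_local_changes:
    if change["timestamp"] > server_timestamps[change["attribute"]]:
        server[change["attribute"]] = change["value"]
        server_timestamps[change["attribute"]] = change["timestamp"]
marie_local_changes.clear()

# 4. John downloads the new server state.
john_db.update(server)

print(server["name"], "|", john_db["name"], "|", marie_db["name"])
# Family Holidays | Family Holidays | Holiday 2015
```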

The Way Around

The only solution I can see is to do some conflict resolution on the device, something I was hoping to avoid. When changes are downloaded from the server, the application will have to check its Local Changes for a newer pending change to the same attribute before writing to the database. Annoyingly.
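One way that check might look, sketched in Python under a few assumptions: the pending changes for an object are available as a mapping from attribute to (value, timestamp), the function name is hypothetical, and a queued edit that is older than the downloaded one is simply dropped because it would lose on the server anyway.

```python
def apply_remote_change(database, local_changes, remote):
    """Write a downloaded change unless a newer local change is still queued.

    `local_changes` maps attribute -> (value, timestamp) for one object.
    Returns True if the database was updated.
    """
    pending = local_changes.get(remote["attribute"])
    if pending is not None and pending[1] > remote["timestamp"]:
        # The queued local edit is newer, so leave the database alone.
        return False
    database[remote["attribute"]] = remote["value"]
    # Assumption: an older queued edit to this attribute is discarded.
    local_changes.pop(remote["attribute"], None)
    return True


marie_db = {"name": "Family Holidays"}
marie_local_changes = {"name": ("Family Holidays", 1423083525)}
remote = {"attribute": "name", "value": "Holiday 2015", "timestamp": 1423083473}

print(apply_remote_change(marie_db, marie_local_changes, remote), marie_db["name"])
# False Family Holidays
```

With this guard, Marie's database keeps Family Holidays through the download, and her queued change still goes up to the server afterwards.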