(Progressive Refinement) An improved solution
Refining the design
Okay, so the big problem seems to be that the database is written before the blob store. This allows records to be read before they are ready. It also allows the system to fail in a state in which records that should not be readable are readable.
Let’s update our design such that:
- The blob store is written before the database.
- If the metadata isn’t present in the database, the server returns an empty record.
This should fix both of the problems we saw before.
Write Profile
Read Profile
Updating our model
We can refine our existing model to implement this behavior. First we update the state machine:
Then we update the formal specification.
Note: Comments have changed to reflect the narrative. See the previous page for more comprehensive comments.
Checking our improved design
Starting small
Let’s start with our our single server case again.
Single server still errors. Invariant ConsistentReads is violated.
- 1. Initial predicate
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations :
- serverStates
- s1
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- blobStoreState
- 2. StartWrite
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s1
- state: started_write
- userId: u1
- metadata: m2
- image: i1
- s1
- blobStoreState
- 3. WriteBlob
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s1
- state: wrote_blob
- userId: u1
- metadata: m2
- image: i1
- s1
- blobStoreState
- 4. WriteMetadataAndReturn Successfully completed write: metadata m2, image i1
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s1
- state: waiting
- userId: u1
- metadata: m2
- image: i1
- s1
- blobStoreState
- 5. StartWrite
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: started_write
- userId: u1
- metadata: m1
- image: i2
- s1
- blobStoreState
- 6. WriteBlob
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s1
- blobStoreState
- 7. FailWrite Failed write after writing blob. Inconsistent data: metadata m2, image i2
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- blobStoreState
- 8. StartRead
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: started_read
- userId: u1
- metadata: UNSET
- image: UNSET
- s1
- blobStoreState
- 9. ReadMetadata
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: read_metadata
- userId: u1
- metadata: m2
- image: UNSET
- s1
- blobStoreState
- 10. ReadBlobAndReturn Return inconsistent data: metadata m2, image i2
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- userId : u1
- metadata : m2
- image : i2
- type : READ
-
- serverStates
- s1
- state: waiting
- userId: u1
- metadata: m2
- image: i2
- s1
- blobStoreState
We don’t have a perfectly working solution yet, but notice that the previous solution failed in 7 steps, whereas this solution failed in 10. Generally, the more steps that need to occur before failure, the more unlikely the failure—but not always. Let’s take a closer look at the steps. In this case, the single server needed to write successfully, then write unsuccessfully, then read. In the previous one, the server just needed to write unsuccessfully, then read. But once we spell it out, that doesn’t increase our confidence. Of course users are going to write multiple times, over days, so effectively all we need is a write failure for an error to occur.
Testing multiple servers
It may be informative to see how multiple servers fail.
Invariant ConsistentReads is violated.
- 1. Initial predicate
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations :
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s2
- blobStoreState
- 2. StartWrite
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: started_write
- userId: u1
- metadata: m2
- image: i1
- s2
- blobStoreState
- 3. WriteBlob
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m2
- image: i1
- s2
- blobStoreState
- 4. WriteMetadataAndReturn Successfully completed write: metadata m2, image i1
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: waiting
- userId: u1
- metadata: m2
- image: i1
- s2
- blobStoreState
- 5. StartWrite
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: started_write
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 6. WriteBlob Overwrote blob with image i2. Haven't wrote metadata
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 7. StartRead
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: started_read
- userId: u1
- metadata: UNSET
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 8. ReadMetadata
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: read_metadata
- userId: u1
- metadata: m2
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 9. ReadBlobAndReturn Return inconsistent data: metadata m2, image i2
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- userId : u1
- metadata : m2
- image : i2
- type : READ
-
- serverStates
- s2
- state: waiting
- userId: u1
- metadata: m2
- image: i2
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
Now it fails in 9 steps rather than 6. But again, all it needs is a successful write to happen before the simultaneous read and write. The error hasn’t fundamentally changed that much.
Summary
So we’ve eliminated one class of error: the kind in which the blob store is unset and returned in a read. The problem now is that the blob store and the metadata can get out of sync. This can happen because of either a failure while writing, or one server writing while another one is reading. We’re not done yet. But I have a hunch we can make it work.