(Progressive Refinement) An improved solution
Refining the design
Okay, so the big problem seems to be that the database is written before the blob store. This allows records to be read before they are ready. It also allows the system to fail in a state in which records that should not be readable are readable.
Let’s update our design such that:
- The blob store is written before the database.
- If the metadata isn’t present in the database, the server returns an empty record.
This should fix both of the problems we saw before.
Write Profile
Read Profile
Updating our model
We can refine our existing model to implement this behavior. First we update the state machine:
Then we update the formal specification.
Note: Comments have changed to reflect the narrative. See the previous page for more comprehensive comments.
---------------------------- MODULE improved ----------------------------
EXTENDS Naturals, Sequences
CONSTANTS
USERIDS,
SERVERS,
METADATAS,
IMAGES
VARIABLES
databaseState,
blobStoreState,
serverStates,
operations
vars == <<databaseState, blobStoreState, serverStates, operations>>
(***************************************************************************)
(* Strong Typing *)
(***************************************************************************)
UserIdVal == USERIDS \union {"UNSET"}
MetadataVal == METADATAS \union {"UNSET"}
ImageVal == IMAGES \union {"UNSET"}
(***************************************************************************)
(* Describes all possible states a server can be in. *)
(***************************************************************************)
ServerStateVal ==
[
state: {
\* current:
"waiting", \* next: StartWrite or StartRead
\* after: StartWrite
"started_write", \* next: WriteBlob or FailWrite
\* after: WriteBlob
"wrote_blob", \* next: WriteMetadataAndReturn or FailWrite
\* after: StartRead
"started_read", \* next: ReadMetadata
\* after: ReadMetadata, ReadMetadataAndReturnEmpty
"read_metadata" \* next: ReadBlobAndReturn
},
userId: UserIdVal,
metadata: MetadataVal,
image: ImageVal
]
OperationValue == [type: {"READ", "WRITE"},
userId: UserIdVal,
metadata: MetadataVal,
image:ImageVal]
TypeOk ==
/\ databaseState \in [USERIDS -> MetadataVal]
/\ blobStoreState \in [USERIDS -> ImageVal]
/\ serverStates \in [SERVERS -> ServerStateVal]
/\ operations \in Seq(OperationValue)
Init ==
/\ databaseState = [u \in USERIDS |-> "UNSET"]
/\ blobStoreState = [u \in USERIDS |-> "UNSET"]
/\ serverStates = [s \in SERVERS |-> [state |-> "waiting",
userId |-> "UNSET",
metadata |-> "UNSET",
image |-> "UNSET"
]]
/\ operations = <<>>
(***************************************************************************)
(* State Machine: All of the states are functions of s (server), because *)
(* the only actively modeled actors in this system are our servers, but *)
(* there can be multiple working simultaneously. *)
(***************************************************************************)
(***************************************************************************)
(* Writes *)
(***************************************************************************)
StartWrite(s) ==
/\ serverStates[s].state = "waiting"
/\ \E u \in USERIDS, m \in METADATAS, i \in IMAGES:
/\ serverStates' = [serverStates EXCEPT
![s].state ="started_write",
\* Set values for the upcoming write
![s].userId = u,
![s].metadata = m,
![s].image = i]
\* Record the write for observability
/\ operations' = Append(operations,
[
type |-> "WRITE",
userId |-> u,
metadata |-> m,
image |-> i
])
/\ UNCHANGED <<databaseState, blobStoreState>>
WriteBlob(s) ==
LET currentState == serverStates[s]
IN
/\ currentState.state = "started_write"
/\ blobStoreState' = [blobStoreState EXCEPT
![currentState.userId] = currentState.image ]
/\ serverStates' = [serverStates EXCEPT
![s].state ="wrote_blob"]
/\ UNCHANGED <<databaseState, operations>>
\* Writing the database is now the last part of a write operation
WriteMetadataAndReturn(s) ==
LET currentState == serverStates[s]
IN
/\ currentState.state = "wrote_blob"
/\ databaseState' = [databaseState EXCEPT
![currentState.userId] = currentState.metadata]
/\ serverStates' = [serverStates EXCEPT
![s].state ="waiting"]
/\ UNCHANGED <<blobStoreState, operations>>
FailWrite(s) ==
/\ serverStates[s].state \in {"started_write", "wrote_blob"}
/\ serverStates' = [serverStates EXCEPT
![s].state ="waiting",
![s].userId ="UNSET",
![s].metadata = "UNSET",
![s].image = "UNSET"]
/\ UNCHANGED <<databaseState, blobStoreState, operations>>
(***************************************************************************)
(* Reads *)
(***************************************************************************)
StartRead(s) ==
\* Reading only starts when a server is waiting
/\ serverStates[s].state = "waiting"
/\ \E u \in USERIDS:
serverStates' = [serverStates EXCEPT
![s].state ="started_read",
![s].userId =u]
/\ UNCHANGED <<databaseState, blobStoreState>>
/\ UNCHANGED operations
\* If database record is present
ReadMetadata(s) ==
LET currentState == serverStates[s]
IN
/\ currentState.state = "started_read"
\* Represents reading the metadata while the database record is set
/\ databaseState[currentState.userId] # "UNSET"
/\ serverStates' =
[serverStates EXCEPT
![s].state ="read_metadata",
![s].metadata = databaseState[currentState.userId]]
/\ UNCHANGED <<databaseState, blobStoreState>>
/\ UNCHANGED operations
\* If database record is not present
ReadMetadataAndReturnEmpty(s) ==
LET currentState == serverStates[s]
IN
/\ currentState.state = "started_read"
\* Represents reading the metadata while the database record is unset
/\ databaseState[currentState.userId] = "UNSET"
/\ serverStates' = [serverStates EXCEPT
![s].state ="waiting"]
/\ operations' = Append(operations,
(***********************************************)
(* Returns an empty record *)
(***********************************************)
[
type |-> "READ",
userId |-> currentState.userId,
metadata |-> "UNSET",
image |-> "UNSET"
])
/\ UNCHANGED <<databaseState, blobStoreState>>
ReadBlobAndReturn(s) ==
LET currentState == serverStates[s]
IN
/\ currentState.state = "read_metadata"
/\ serverStates' = [serverStates EXCEPT
![s].state ="waiting",
![s].image = blobStoreState[currentState.userId]]
/\ operations' = Append(operations,
[
type |-> "READ",
userId |-> currentState.userId,
metadata |-> currentState.metadata,
image |-> blobStoreState[currentState.userId]
])
/\ UNCHANGED <<databaseState, blobStoreState>>
(***************************************************************************)
(* Specification / Next *)
(***************************************************************************)
Next ==
\* For every step, pick a server and have it advance one state
\E s \in SERVERS:
\/ StartWrite(s)
\/ WriteBlob(s) \* New step
\/ WriteMetadataAndReturn(s) \* New step
\/ FailWrite(s)
\/ StartRead(s)
\/ ReadMetadata(s) \* New step
\/ ReadMetadataAndReturnEmpty(s) \* New step
\/ ReadBlobAndReturn(s)
Spec == Init /\ [][Next]_vars
(***************************************************************************)
(* Invariants *)
(***************************************************************************)
ConsistentReads ==
\* If there are no operations, they are consistent
\/ operations = <<>>
\/ \A i \in 1..Len(operations): \* For every read operation
LET readOp == operations[i] IN
\/ /\ readOp.type = "READ"
\* There must exist a write operation
/\ \/ \E j \in 1..i:
LET writeOp == operations[j] IN
/\ writeOp.type = "WRITE"
\* With the same data
/\ readOp.userId = writeOp.userId
/\ readOp.metadata = writeOp.metadata
/\ readOp.image = writeOp.image
\/ \* Ignore unset reads
/\ readOp.metadata = "UNSET"
/\ readOp.image = "UNSET"
\/ readOp.type = "WRITE" \* Ignore writes
(***************************************************************************)
(* This is used for model checker configuration so the simulation doesn't *)
(* go on forever. *)
(***************************************************************************)
StopAfter3Operations ==
Len(operations) <= 3
=============================================================================
Checking our improved design
Starting small
Let’s start with our our single server case again.
Single server still errors. Invariant ConsistentReads is violated.
- 1. Initial predicate
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations :
- serverStates
- s1
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- blobStoreState
- 2. StartWrite
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s1
- state: started_write
- userId: u1
- metadata: m2
- image: i1
- s1
- blobStoreState
- 3. WriteBlob
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s1
- state: wrote_blob
- userId: u1
- metadata: m2
- image: i1
- s1
- blobStoreState
- 4. WriteMetadataAndReturn Successfully completed write: metadata m2, image i1
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s1
- state: waiting
- userId: u1
- metadata: m2
- image: i1
- s1
- blobStoreState
- 5. StartWrite
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: started_write
- userId: u1
- metadata: m1
- image: i2
- s1
- blobStoreState
- 6. WriteBlob
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s1
- blobStoreState
- 7. FailWrite Failed write after writing blob. Inconsistent data: metadata m2, image i2
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- blobStoreState
- 8. StartRead
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: started_read
- userId: u1
- metadata: UNSET
- image: UNSET
- s1
- blobStoreState
- 9. ReadMetadata
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s1
- state: read_metadata
- userId: u1
- metadata: m2
- image: UNSET
- s1
- blobStoreState
- 10. ReadBlobAndReturn Return inconsistent data: metadata m2, image i2
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- userId : u1
- metadata : m2
- image : i2
- type : READ
-
- serverStates
- s1
- state: waiting
- userId: u1
- metadata: m2
- image: i2
- s1
- blobStoreState
We don’t have a perfectly working solution yet, but notice that the previous solution failed in 7 steps, whereas this solution failed in 10. Generally, the more steps that need to occur before failure, the more unlikely the failure—but not always. Let’s take a closer look at the steps. In this case, the single server needed to write successfully, then write unsuccessfully, then read. In the previous one, the server just needed to write unsuccessfully, then read. But once we spell it out, that doesn’t increase our confidence. Of course users are going to write multiple times, over days, so effectively all we need is a write failure for an error to occur.
Testing multiple servers
It may be informative to see how multiple servers fail.
Invariant ConsistentReads is violated.
- 1. Initial predicate
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations :
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s2
- blobStoreState
- 2. StartWrite
- blobStoreState
- u1
- : UNSET
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: started_write
- userId: u1
- metadata: m2
- image: i1
- s2
- blobStoreState
- 3. WriteBlob
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : UNSET
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m2
- image: i1
- s2
- blobStoreState
- 4. WriteMetadataAndReturn Successfully completed write: metadata m2, image i1
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: waiting
- userId: u1
- metadata: m2
- image: i1
- s2
- blobStoreState
- 5. StartWrite
- blobStoreState
- u1
- : i1
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: started_write
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 6. WriteBlob Overwrote blob with image i2. Haven't wrote metadata
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: waiting
- userId: UNSET
- metadata: UNSET
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 7. StartRead
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: started_read
- userId: u1
- metadata: UNSET
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 8. ReadMetadata
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- serverStates
- s2
- state: read_metadata
- userId: u1
- metadata: m2
- image: UNSET
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
- 9. ReadBlobAndReturn Return inconsistent data: metadata m2, image i2
- blobStoreState
- u1
- : i2
- u1
- databaseState
- u1
- : m2
- u1
- operations
-
- userId : u1
- metadata : m2
- image : i1
- type : WRITE
-
- userId : u1
- metadata : m1
- image : i2
- type : WRITE
-
- userId : u1
- metadata : m2
- image : i2
- type : READ
-
- serverStates
- s2
- state: waiting
- userId: u1
- metadata: m2
- image: i2
- s1
- state: wrote_blob
- userId: u1
- metadata: m1
- image: i2
- s2
- blobStoreState
Now it fails in 9 steps rather than 6. But again, all it needs is a successful write to happen before the simultaneous read and write. The error hasn’t fundamentally changed that much.
Summary
So we’ve eliminated one class of error: the kind in which the blob store is unset and returned in a read. The problem now is that the blob store and the metadata can get out of sync. This can happen because of either a failure while writing, or one server writing while another one is reading. We’re not done yet. But I have a hunch we can make it work.