Efficient Transactions for Geo-Distributed Datastores
In traditional transaction processing systems, transaction execution
occurs separately from the commit protocol. The transaction performs
read and write operations and completes any computation it needs
before initiating a commit protocol. In the case where the read
and write operations are dependent and cannot be parallelized
(i.e., the value of a key determines the subsequent key access), significant
latency would be incurred if the transaction coordinator was in a different
location from the data, as many cross-datacenter round trips would be required
to perform the read and write requests. For the same reason, the commit protocol
incurs additional latency when participating servers are located remotely from the
transaction coordinator. Consequently, CrossStitch's design enables
a transaction designer to group together operations that access data in the same
datacenter. Moreover, we present CrossStitch's pipelined atomic
commit protocol, which executes concurrently with the transaction.
Using Code Shipping and Group Operations Together
In the CrossStitch framework, transactions are structured as a series of
states, where each state is characterized by a key request and
some computation. A transaction transitions to a new state when a new key request is made.
Adjacent states whose keys reside in the same datacenter can be grouped
together, which minimizes the latency incurred from cross-datacenter
round trips.
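To make the grouping idea concrete, here is a small sketch in Python. The state representation and the datacenter, key, and computation names are our own illustrative assumptions, not CrossStitch's actual data model; the point is that adjacent states in the same datacenter collapse into one cross-datacenter hop.

```python
from itertools import groupby

# Hypothetical transaction: each state is a (datacenter, key, computation)
# step, and consecutive states may live in the same datacenter.
states = [
    ("us-east", "user:42", "read profile"),
    ("us-east", "cart:42", "read cart"),
    ("eu-west", "stock:item9", "check inventory"),
    ("eu-west", "stock:item9", "decrement inventory"),
]

# Group adjacent states by datacenter so that each group requires only a
# single cross-datacenter hop rather than one round trip per state.
groups = [
    (dc, [s[1:] for s in grp])
    for dc, grp in groupby(states, key=lambda s: s[0])
]

for dc, steps in groups:
    print(dc, len(steps))
```

With the four states above, the grouping yields two hops (one per datacenter) instead of four.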
Pipelined Atomic Commit Protocol
CrossStitch introduces its pipelined atomic commit protocol (PLC) that executes
concurrently with the transaction's execution. In PLC,
vote request,
vote commit, and
vote abort messages are sent between participating
servers to determine the commit status of the transaction. The role of coordinator shifts down the
transaction chain as each participating server enters the precommit stage (analogous to
vote commit). While this is happening, transaction execution progresses.
Consequently, transaction commit time is masked by transaction execution time, and
the overall completion time of a transaction is lower than protocols that separate
the commit protocol and transaction execution.
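The overlap of commit work with execution can be illustrated with a minimal trace. This is our own simplification, not CrossStitch's implementation: it only records the phase order down a chain of participating servers, showing the vote request traveling with the transaction rather than in a separate round after execution finishes.

```python
# Sketch of a pipelined commit trace: the vote request piggybacks on the
# transaction message, so upstream servers can precommit while downstream
# servers are still executing.

def pipelined_commit(chain):
    """Return the ordered phases for a chain of participating servers."""
    log = []
    for server in chain:
        # Each hop executes its state; the vote request arrives with the
        # transaction instead of in a separate post-execution round.
        log.append((server, "execute+vote_request"))
        # In the real protocol a server precommits (vote commit) once its
        # expected messages arrive; here we only record the phase order,
        # which shows commit work interleaved with execution.
        log.append((server, "precommit"))
    # The last server in the chain makes the final commit decision.
    log.append((chain[-1], "commit"))
    return log

for server, phase in pipelined_commit(["s1", "s2", "s3"]):
    print(server, phase)
```

In a two-phase-commit-style protocol, the vote round would begin only after the last hop finished executing; here the phases interleave, which is why commit time is masked by execution time.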
Replication Message Weaving
We can extend CrossStitch's PLC to support replication. On a write request, the
transaction is forwarded to both the key's primary and secondary servers. Both
servers execute the key request and the transaction, and the secondary server
sends a message to the next hop's primary server. Consequently, a participating
server does not enter the precommit (vote commit) stage until it has received
all of its expected messages: an acknowledgement, the precommit message from the
previous server, and its replica's message. By weaving replication messages into
the transaction chain, we reduce the overall completion time of a transaction by
hiding replication latency.
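The gate a participant applies before precommitting can be sketched as a simple message-set check. The message names below are our own labels for the three messages the text describes, not identifiers from CrossStitch itself.

```python
# Sketch of the precommit gate under message weaving: a participant enters
# precommit only after every expected message has arrived.

# Assumed names for the three expected messages described above.
EXPECTED = {"ack_from_next_hop", "precommit_from_prev", "replica_message"}

class Participant:
    def __init__(self):
        self.received = set()
        self.state = "executing"

    def on_message(self, msg):
        self.received.add(msg)
        # Precommit fires only once the full expected set is present,
        # regardless of the order in which the messages arrive.
        if self.state == "executing" and EXPECTED <= self.received:
            self.state = "precommit"

p = Participant()
for m in ["replica_message", "ack_from_next_hop", "precommit_from_prev"]:
    p.on_message(m)
print(p.state)
```

Because the replica's message is woven into the same pipeline as the precommit messages, waiting on it adds no extra round after execution, which is how replication latency is hidden.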