I published the post "A practical approach to read-write quorum systems" a few months ago. The post refers to the paper Read-Write Quorum Systems Made Practical - Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica. It illustrates the implementation of "Quoracle". Quoracle provides the optimal quorums in respect of either the `load`

, `network`

or `latency`

.

I have decided to rewrite the tool in Golang to explore the ecosystem and the tooling of the language. This article goes through the Golang implementation of the original Python library.

## Quorum expressions definition

First of all, let's discuss the implementation of the expressions. The original paper uses expressions and nodes to describe quorums:

```
a, b, c = Node("a"), Node("b"), Node("c")
majority = QuorumSystem(reads=a*b + b*c + a*c)
```

The example above builds the majority quorums using the following pairs: `[a,b], [b,c], [a,c]`

. Python can represent the expression above by overloading the `*`

and the `+`

operations. The original quoracle library uses the operations overload approach.

Go advocates simplicity, and it does not embrace operator overloading. So, it is necessary to proceed with a different approach.

In Golang, these methods describe the operations:

The `ExprOperator`

interface defines the operations between two logical expressions. A logical expression between nodes represents many quorums.

Thus, it is possible to describe quorum as follows:

```
a, b, c :=
NewNode("a"), NewNode("b"), NewNode("c")
// (a * b) + (b * c) + (a * c)
majority := NewQuorumSystemWithReads(a.Multiply(b).Add(b.Multiply(c)).Add(a.Multiply(c)))
```

The `ExprOperator`

interface provides the same functionalities as the original library. In the example above, the pairs: `[a,b], [b,c], [a,c]`

are majority quorums. The next section goes through the Golang definition of a `QuorumSystem`

and how to use it.

## Quorum system overview definition

Now that we know how to define an `Expr`

of nodes, we can declare a read-write quorum system. The below implementation describes the `QuorumSystem`

struct used in the library.

The `reads`

and `writes`

fields represent the quorums. The `nameToNode`

field keeps track of the different nodes in the quorum system. The `QuorumSystem`

struct has the methods for calculating the quorum system's capacity, latency, load, and network load.

The `StrategyOptions`

parameter struct represents the configurations for the strategy optimisation:

The `Optimize`

property points to the optimisation target. The `LoadLimit`

, `NetworkLimit`

, `LatencyLimit`

define an optional limit on the load, the network, and latency. The `ReadFraction`

and the `WriteFraction`

determine the workload distribution of the read and write operations.

The `F`

field represents the resilience of the quorum. A quorum is F-resilient for some integer *f*, despite removing any *f* nodes from *r*, *r* is still a read/write quorum[1].

The library defines the initialization functions for a new `QuorumSystem`

:

The above code omits some methods implementations for brevity. If the caller provides either the read or writes quorum, the constructor computes the logical dual operation of the other quorum and initialise a new `QuorumSystem`

struct. If the caller provides both read and write quorums, the constructor checks the validity of the quorums and returns a new `QuorumSystem`

struct with the corresponding quorums.

The following section shows how to translate the optimal strategy problem into a linear programming problem.

## Optimal strategy problem definition

The original python implementation of quoracle uses the PuLP library and coin-or. The previous blog post looked at how to use PuLP for optimisation problems in a python runtime.

The Golang implementation uses a library called lanl/clp. Also lanl/clp relies on coin-or for solving linear programming optimization problems.

The codebase defines a helper struct to build a linear programming problem:

A `lpDefinition`

struct contains the variables needed to describe a linear programming problem.

Let's suppose that we have three six-faced dices. Two dice are not allowed to have the same value. The goal is to find a difference between the 1st and 2nd-largest dice smaller than the 2nd and 3rd dice. The following `lpDefinition`

represents the problem:

The `lpDefinition`

above stores a value in the `Vars`

array for each dice. The `Constraints`

matrix represents the dice range constraint, from 1 to 6. The `Objective`

matrix contains the two goals of the problem:

- Each dice must be different from the others (line 10 and 11);
- The difference between the 1st and 2nd largest dice must be smaller than the one between the 2nd and the 3rd dice.

The example above shows how to use the helper struct in the codebase to build a minimisation problem. Next, we will take a detailed look at the implementation for optimising the metrics. Depending on the optimisation target, the problem definition builds a different `lpDefinition`

struct.

### Load optimization definition

Let's start by describing how the load optimization problem is implemented. To recap, the formula for the load defined in the paper[1] is:

$$ \frac{f_r}{cap_R(x)} \sum_{{r \in R | x \in r }} p_r + \frac{1 - f_r}{cap_W(x)} \sum_{{w \in W | x \in w }} p_w \leq L_{f_r} $$

The following snippet of code implements the above formula, and it builds the LP problem:

The snippet omits some code for brevity. The `buildLoadDef`

local function encapsulates the logic for building the load optimisation problem.

The function initialises the `Vars`

and the `Constraints`

from the `lpVariable`

. Next, it adds the `l`

variable and constraint that indicates the load(\(L_{fr}\)) for the specific read fraction. It builds the load formula for every `lpVariable`

in the problem. For each `Node`

in the quorum system, it applies the following expression in case of a read quorum:

```
tmp[v.Index] += fr * v.Value / float64(*qs.GetNodeByName(n.Name).ReadCapacity)
```

otherwise, in the case of a write quorum, it proceeds by using:

```
tmp[v.Index] += (1 - fr) * v.Value / float64(*qs.GetNodeByName(n.Name).WriteCapacity)
```

The `ReadCapacity`

and the `WriteCapacity`

are configurable for each `Node`

.

The code needs to maintain the same order in the `Vars`

and the `Objectives`

arrays. Thus, each `lpVariable`

uses an `Index`

to refer to the exact position of each element in the arrays.

### Network load optimization definition

This section describes the implementation of the network load. Let's start by refreshing the formula[1]:

$$ f_r ( \sum_{r \in R} p_r \cdot |r|) + (1 - f_r) ( \sum_{w \in W} p_w \cdot |w|) $$

\(|r|\) and \(|w|\) are the length of the read and write quorums sets. The library builds the network load minimization problem as follow:

The above code initialises a new `Vars`

and the `Constraints`

fields for each quorum. Then, it applies the network load formula by multiplying the length of the quorum with `fr`

. Also, the implementation adds a row in the `Objectives`

matrix in case we specify a network limit.

### Latency optimization definition

The last optimization target is the latency. The formula described in the paper defines the latency as:

$$ f_r ( \sum_{r \in R} p_r \cdot latency(r)) + (1 - f_r) ( \sum_{w \in W} p_w \cdot latency(w)) $$

Below is the optimization definition of the latency:

The implementation creates a new `lpDefinition`

populating the `Vars`

and the `Constraints`

. Then, it retrieves the latency for each `readQuourmVars`

and `writeQuorumVars`

. The latency of a quorum is the shortest time required to form a quorum after contacting the nodes in that quorum. The code calculates `l`

using the `readQuorumLatency`

and the `writeQuorumLatency`

methods.

Next, it continues by applying the formula of the latency for the read quorums:

```
obj[v.Index] = fr * v.Value * float64(l)
```

the code takes the same approach for the write quorums using the opposite workload:

```
obj[v.Index] = (1 - fr) * v.Value * float64(l)
```

The following section describes how to translate the optimisation result into a new `Strategy`

. Also, it shows how to execute the LP optimisation using the definitions seen in this section.

## Strategy initialisation

The previous section described how to build the problem definition. Now we can proceed by executing the optimisation. The snippet of code below describes the optimisation execution and the initialisation of the strategy:

As a first step, the code initialises a `NewSimplex`

. The simplex algorithm is a popular linear programming algorithm. We want to find the least objective of our problem. Thus the code sets the optimisation direction as `clp. Minimise`

.

The code proceeds by creating a new `lpDefinition`

based on the optimisation definitions seen in the previous section. For example, if the optimisation target is `Network`

, the code calls the `buildNetworkDef`

function.

On top of the optimisation target, the code needs to add another objective: the total sum of the read and write probabilities must be 1. The `getTotalProbabilityObjectives`

method takes care of that. It returns a new objective array where the read and write probabilities targets `1`

.

The code executes the optimisation and checks that the resulting `status`

is optimal. If the operation succeeds, the code gets back the optimal solution. Then, each quorum initialises a new `SigmaRecord`

with the quorum and its probability of being selected.

Finally, it creates a new `Strategy`

with the `SigmaR`

(array of `SigmaRecord`

for the read quorums) and the `SigmaW`

(array of `SigmaRecord`

for the write quorums).

Let's refresh the definition of strategy as mentioned in the paper:

$$ \sigma = (\sigma_R, \sigma_W) $$

\(\sigma_R\) and \(\sigma_W\) are probabilities distributions: \(\sigma_R(r)\) and \(\sigma_W(w)\) are respectively the probabilities that the strategy chooses a *read quorum* \(r\) and a *write quorum* \(w\).

quoracle-go represents a `Strategy`

in a similar way using the following structs:

The `SigmaR`

and `SigmaW`

corresponds to the strategy's probability of choosing a specific quorum. Also, the `Strategy`

struct maintains a hashmap storing the node and its likelihood of being selected.

The `Strategy`

struct exposes some methods that retrieve some metrics and information. Below is the list of methods provided with the `Strategy`

struct:

The code above omits the implementation of the functions for brevity. The `GetReadQuorum`

, `GetWriteQuorum`

uses a probability distribution to return the quorums. The `Load`

, `Capacity`

, `NetworkLoad`

and `Latency`

methods return the respective metrics for a given read or write workload. The `NodeLoad`

, `NodeUtilization`

and `NodeThroughput`

methods target specific `Node`

in the quorum system. The implementations use the probability of a node getting selected to calculate the respective node metric.

## Searching for the optimal quorum strategy

We have seen how the codebase leverages linear programming to find the optimal strategy.

Let's reiterate one of the primary purposes of quoracle. Given the node's details, an optimisation target and workload distribution, it returns the optimised strategy.

The `Search`

method implements the rule mentioned above. It calculates the optimal strategy by trying all the combinations of quorums. Whenever the optimal strategy for a given valid quorum returns a better target metric, the `Search`

function saves that.

Below is the code implementation of the `Search`

function:

The `dupFreeExprs`

and the `doSearch`

functions encapsulate the core logic.

The `dupFreeExprs`

returns all the possible combinations of quorums composed using a list of nodes.

The `doSearch`

uses the `dupFreeExprs`

outcome to initialise a new quorum system and find an optimal strategy. The implementation keeps track of the `Strategy`

and `QuorumSystem`

with the most optimised metric.

The search process is time measured. When the operation reaches a specified timeout, then the search stops. The timeout-controlled function prevents long-hanging processes.

## Wrap up

This post went through the Golang port of quoracle, describing the main components implemented in the codebase. The code is available at samueleresca/quoracle-go.

The project had two primary purposes. First, to put into practice the core concepts described in the paper[1]. Secondly, to explore the Golang ecosystem and tooling.

Some of the concepts might seem very theoretical, but it is essential to know the basics. Quorums are the foundation of distributed systems topics such as replication and consensus.

## References

[1]Read-Write Quorum Systems Made Practical - Michael Whittaker, Aleksey Charapko, Joseph M. Hellerstein, Heidi Howard, Ion Stoica

[2]lanl/clp provides a Go interface to the COIN-OR Linear Programming (CLP) library

[3]github.com/mwhittaker/quoracle