[Noe 86] describes a simulation study comparing available copies, quorum consensus,
and regeneration, carried out to determine which replication protocol was the most
efficient for a given distributed system configuration and a given set of failure
characteristics.
The model was programmed in SIMULA [Birtwistle 73], and assumed a local area
network consisting of a number of separate computers interconnected by a
communications medium such as an Ethernet, with no communications failures. The
parameters used in the simulation, such as crash rates and node load, were obtained from
studies of existing distributed systems and from mathematical models, and all parameters
were the same for each replication protocol simulated. The mean time between crashes
varied between 100 and 300 days, with a mean repair time of 7 days. The number of copies
of each replicated resource ranged from one to three, and the ratio of read requests to
write requests, expressed as a probability, varied from 0.3 to 0.7, with request
frequencies varying between 50 and 400 requests per day. The number of nodes in the
system varied from 10 to 30. All measured results were taken over a simulated time of
2 years of operation.
Simulation Results
The quantities calculated from the results were the read and write availability of the
replicated service. The read availability was defined and calculated as the total number of successful read requests divided by the total number of read requests. Write availability
was similarly defined in terms of write requests.
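The availability measure defined above can be illustrated with a minimal sketch. The class name `RequestTally` and the sample outcomes are hypothetical and are not taken from [Noe 86]; the sketch only shows the ratio of successful requests to attempted requests.

```python
from dataclasses import dataclass


@dataclass
class RequestTally:
    """Counts attempted and successful requests of one kind (read or write)."""

    attempted: int = 0
    succeeded: int = 0

    def record(self, ok: bool) -> None:
        # Every request counts as attempted; only successes raise the numerator.
        self.attempted += 1
        if ok:
            self.succeeded += 1

    @property
    def availability(self) -> float:
        # Availability = successful requests / total requests.
        if self.attempted == 0:
            raise ValueError("no requests recorded")
        return self.succeeded / self.attempted


# Hypothetical outcomes: three successful reads and one failed read.
reads = RequestTally()
for ok in (True, True, False, True):
    reads.record(ok)
```

A second tally of the same kind would be kept for write requests, giving the write availability in the same way.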
The Voting (quorum consensus) protocol provided less protection than either of the other
protocols and was not worth considering until a maximum of 3 copies was used. In such a
case the optimal size for both the read and the write quorum is 2; with a write quorum of
3 the replicated resource performed worse than in the non-replicated case, because the
loss of any one of the three copies destroys the write quorum.
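The quorum argument can be checked with a rough calculation. The sketch below is not the simulation model of [Noe 86]: it assumes that copies fail independently, and that each copy is up with the steady-state probability "mean time between crashes / (mean time between crashes + mean repair time)", here using the 100-day and 7-day figures quoted earlier. The function name `quorum_availability` is hypothetical.

```python
from math import comb


def quorum_availability(p: float, copies: int, quorum: int) -> float:
    """Probability that at least `quorum` of `copies` independent copies are up."""
    return sum(
        comb(copies, k) * p**k * (1 - p) ** (copies - k)
        for k in range(quorum, copies + 1)
    )


# Assumption: steady-state per-copy availability from the quoted parameters.
p = 100 / (100 + 7)

single_copy = quorum_availability(p, copies=1, quorum=1)  # non-replicated case
quorum_of_2 = quorum_availability(p, copies=3, quorum=2)  # any two of three copies up
quorum_of_3 = quorum_availability(p, copies=3, quorum=3)  # all three copies must be up
```

Under these assumptions a quorum of 3 requires all three copies to be up at once, so its availability is `p**3`, which is lower than the single-copy value `p`, whereas a quorum of 2 tolerates the loss of any one copy.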
The results showed that replication provides a significant increase in availability.
However, there is little point in going beyond a maximum of two copies. Both the
Available Copies and Regeneration techniques provide a substantial increase in
availability, raising read and write availability very close to 1.0; that is, almost
every request performed upon a replicated resource is carried out successfully. There is
very little additional gain with either of these protocols in having a maximum of 3
copies of each resource.
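The diminishing return from a third copy can be illustrated with a back-of-envelope estimate. This is a simplification rather than the model used in [Noe 86]: it assumes independent failures, no communication failures, and an idealised Available Copies scheme in which a request succeeds whenever at least one copy is up, ignoring recovery and request load. The function names are hypothetical.

```python
def steady_state_availability(mean_up_days: float, mean_repair_days: float) -> float:
    """Long-run fraction of time a single node is up (assumed steady state)."""
    return mean_up_days / (mean_up_days + mean_repair_days)


def any_copy_up(p: float, copies: int) -> float:
    """Probability that at least one of `copies` independent copies is up."""
    return 1 - (1 - p) ** copies


# Estimated availability for 1, 2 and 3 copies at both ends of the quoted
# range of mean time between crashes (100 and 300 days, 7-day mean repair).
estimates = {
    mtbf: [any_copy_up(steady_state_availability(mtbf, 7), n) for n in (1, 2, 3)]
    for mtbf in (100, 300)
}
```

In this estimate the step from one copy to two removes most of the unavailability, and the step from two copies to three adds comparatively little, which is consistent with the trend reported above.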
Both Available Copies and Regeneration are preferable to Voting if network
partitions are rare, or if measures are added to prevent or reconcile independent updates
during partition rejoining. The read and write availabilities of the Available Copies
technique are the same, and remain relatively constant despite changes in the request
rate and the number of nodes. Regeneration can be preferable to Available Copies in an
unstable environment that suffers from high crash frequencies, a high number of
updates, and frequent reconfiguration of the network. However, Regeneration can equal or
surpass the performance of the Available Copies technique only if enough additional
storage is supplied to allow regeneration.