Sunday, December 31, 2017
Effectiveness of Replication Strategies
In [Noe 86] a simulation study for the comparison of available copies, quorum consensus, and regeneration was carried out to determine which replication protocol was the most efficient given a specific configuration of distributed system, and a certain set of failure characteristics.
The model was programmed in SIMULA [Birwhistle 73], and assumed a local area network consisting of a number of separate computers interconnected by a communications medium such as an Ethernet, with no communications failures. The parameters used in the simulation, such as crash rates and node load, were obtained from studies of existing distributed systems and from mathematical models, and all parameters were the same for each replication protocol simulated. Crash frequency varied between 100 and 300 days, with repair times having a mean of 7 days. The number of replicated resources ranged from only one copy to having three copies, and the ratio of read requests to write requests varied from a probability of 0.3 up to 0.7, with request frequencies varying from between 50 and 400 requests per day. The number of nodes in the system also varied from 10 to 30. All measured results were taken over a simulated time of 2 years of operation.
The quantities calculated from the results were the read and write availability of the replicated service. The read availability was defined and calculated as the total number of successful read requests divided by the total number of read requests. Write availability was similarly defined in terms of write requests.
The Voting protocol provided less protection than either of the other protocols and would not even be considered until a maximum of 3 copies were used. In such a case the optimal size for a read and write quorum is 2; with a write quorum of 3 the replicated resource performed worse than in the non-replicated case because there are three ways to lose a single copy and destroy the write quorum.
What was found from the results was that replication provides a significant increase in availability. However, there is little point in going beyond a maximum of two copies. Both the Available Copies and Regeneration techniques provide a substantial increase in availability, raising the value of read and write availability very close to 1.0 i.e., whenever a request is performed upon a replicated resource it will be carried out successfully. There is very little additional gain with either of these protocols in having a maximum of 3 copies of each resource.
Both Available Copies and Regeneration are preferable to Voting if network partitions are rare, or if measures are added to prevent or reconcile independent updates during partition rejoining. The read and write availability of the Available Copies technique are the same, and remain relatively constant despite changes in the request rate and the number of nodes. Regeneration can be preferable to Available Copies in an unstable environment that suffers from high crash frequencies, with a high number of updates and frequent reconfiguration of the network. Further, Regeneration can equal or surpass the performance of the Available Copies technique only if enough additional storage is supplied to allow regeneration.