While reading Baron’s take on 5.1, I saw Mark’s comment and part of it stuck with me:
And this is a huge problem when you run replication over a flaky network.
When every machine has some probability of error, there is a number of machines beyond which a failure somewhere in the fleet is all but certain, so in practice you *always* have one. That number is much smaller than you think.
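To put a rough number on that, here is a minimal sketch in Python. It assumes something the comment does not state: that each machine fails independently with the same probability `p` over whatever period you care about. Under that assumption the chance of at least one failure among `n` machines is `1 - (1 - p)**n`. The function names are made up for illustration.

```python
import math


def prob_any_failure(p: float, n: int) -> float:
    """Chance that at least one of n machines fails.

    Assumes independent failures with identical probability p per machine
    (an assumption of this sketch, not something measured).
    """
    return 1.0 - (1.0 - p) ** n


def machines_for_confidence(p: float, target: float) -> int:
    """Smallest n for which prob_any_failure(p, n) reaches the target.

    Solves 1 - (1 - p)**n >= target for n. Note that the probability only
    approaches 1, so target must be strictly below 1.
    """
    if not (0.0 < p < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p and target must both be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    # Hypothetical 1% per-machine error probability, 99% fleet-wide target.
    print(machines_for_confidence(0.01, 0.99))
```

With a 1% per-machine error probability, this gives 459 machines for a 99% chance that something is broken, and the chance already passes 50% at 69 machines. Real failures are rarely independent (a flaky network tends to hit many replicas at once), so treat this as intuition rather than a capacity plan.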