Many things in software are impossible magic, until they are not. His argument b...

blantonl · on April 24, 2011

Regardless, I do agree that building your application today like it is a solved problem is the wrong way to do it.

That presumption assumes that the application is being used as the right tool to resolve the problem. And it also assumes that "the problem" is a finite and solvable item.

sigil · on April 25, 2011

> And it also assumes that "the problem" is a finite and solvable item.

Yes. To make this a bit more concrete, if "the problem" is making distributed storage look and behave exactly like local storage, the CAP Theorem has something to say about its solvability.

jamesaguilar · on April 25, 2011

Depends. Local storage is also not perfectly available. If the network is reliable, you can probably get availability high enough that the system feels "close enough" to how local storage feels. Today's networks aren't that reliable, but someday there may be enough redundancy and bandwidth for this to happen.

sigil · on April 25, 2011

> Local storage is also not perfectly available.

Technically true, although you don't have to contend with the consistency or partitioning factors in the local disk case -- there's only one copy of the state. This means you can focus on making the availability factor as close to 1.0 as possible.

This may not be the case when you're forced to balance all three CAP factors. I sometimes wonder if a follow on result to CAP will be a "practical" (physical or information theoretic) limit like C x A x P <= 1-h for some constant h, and we'll just have to come to terms with that as computer scientists, as physics had to with dx x dp >= h. This is of course wildly unsubstantiated pessimism.

Also, I would gladly entertain any argument demolishing the "local disks are not subject to CAP" claim I made above by talking about read / write caches as separate copies of the local disk state.

jamesaguilar · on April 25, 2011

I doubt it. Suppose that there is a network that is never partitioned, and machines connected to that network that never fail. In that case consistency and availability should be perfect. Although networks will never be perfectly reliable, nor machines, they seem to be getting more reliable. Perhaps someday we may be able to say that the odds of enough partitions or machine failures to make the system unavailable are lower than the odds of you getting struck by lightning, at which point you will have for practical purposes defeated the constraints of the CAP theorem.

sigil · on April 25, 2011

> Suppose that there is a network that is never partitioned, and machines connected to that network that never fail. In that case consistency and availability should be perfect.

You mean, in that case tolerance to partition and availability should be perfect.

> Perhaps someday we may be able to say that the odds of enough partitions or machine failures to make the system unavailable are lower than the odds of you getting struck by lightning, at which point you will have for practical purposes defeated the constraints of the CAP theorem.

So this is the really interesting question. All the CAP theorem says is that (C,A,P) != (1.0,1.0,1.0). How close to (1.0,1.0,1.0) could we make (C,A,P)? If infinitely close, then we have achieved perfection by the limit, and the CAP theorem is rather pointless. If not, then what is the numeric limit?

As you speculate, maybe the numeric limit on C x A x P is so close to 1.0 that the odds of seeing a consistency, availability, or partitioning problem are much smaller than getting hit by lightning.

Then again, maybe not. Who knows? ;)

To avoid sounding like a total crackpot, here is an interesting paper that explores the physical limits of computation:

http://arxiv.org/pdf/quant-ph/9908043v3

jamesaguilar · on April 25, 2011

> You mean, in that case tolerance to partition and availability should be perfect.

No. If a network is never partitioned, you don't need to write algorithms that can tolerate partitions. Therefore consistency and availability are possible.

> So this is the really interesting question. All the CAP theorem says is that (C,A,P) != (1.0,1.0,1.0). How close to (1.0,1.0,1.0) could we make (C,A,P)? If infinitely close, then we have achieved perfection by the limit, and the CAP theorem is rather pointless. If not, then what is the numeric limit?

I think you have misunderstood the theorem (at least, if my bachelor-degree-level understanding is correct). C, A, and P are not variables you can multiply together or perform mathematical operations on. They are more like booleans. "Is the web service consistent (are requests made against it atomically successful or unsuccessful)?" "Is the web service available (will all requests to it terminate)?" "Is the web service partition-tolerant (will the other properties still hold if some nodes in the system cannot communicate with others)?" These questions cannot be "0.5 yes". They are either all-the-way-yes or all-the-way-no.

> . . . and the CAP theorem is rather pointless

Not really. It is pointful for networks that experience partitions. It just doesn't apply to reliable networks. It also sort-of doesn't apply when an unreliable network is acting reliably, with the caveat that since it is not possible to tell in advance when a network will stop behaving reliably, you still have to choose between these three properties when writing your algorithms for when the network behaves badly.

sigil · on April 25, 2011

> C, A, and P are not variables you can multiply together or perform mathematical operations on. They are more like booleans.

Right, but I wasn't restating CAP, just wondering about a follow on to CAP that considers the probability of remaining consistent, the probability of remaining available, and the probability of no failures due to network partitions in physical terms.

Is this not an interesting thing to consider? What if someone proves a hard limit on the product of these probabilities in some physical computation context? The CAP theorem is absolutely fascinating to me, especially if it has something real to say about the systems we can build in the future. The future looks even more distributed.

> It is pointful for networks that experience partitions. It just doesn't apply to reliable networks.

Is there such a thing as a "reliable" network when thousands or millions of computational nodes are involved? Are the routers and switches which connect such a network 100% available? If an amplification attack saturates some network segment with noise, what then?

As programmers, we desperately want things to work, and it's easy to greet something like CAP with flat out denial. I know I'm always fighting it. "It will never fail." No, it can and will fail.

jamesaguilar · on April 25, 2011

I still don't understand what you mean when you say, "probability of remaining consistent," etc. Either you wrote the service so it system would always remain consistent or you didn't. Similarly with availability. Either the system will always return a result, or it may sometimes hang.

Maybe what you mean is the probability of whichever of C, A, or P you gave up actually becoming a problem? But I cannot imagine a physical law of the form you are referring to applying uniformly to these disparate properties. I wouldn't even know how to formulate it for consistency. For availability and partition tolerance it would just be, "Requests to this service will (availability: hang forever/partition-tolerance: return with errors) at a rate exactly equal to the probability of network failures."

With regards to your last point, there are no reliable networks, at least where I work. That doesn't mean there won't be.

CPlatypus · on April 25, 2011

Actually, C/A/P are more like variables you can multiply together. Even more accurately, they define a three-dimensional space with the CAP theorem (especially in Gilbert and Lynch's formal proof) only saying that you cannot be at the point that represents the maximum of all three. There are definitely tradeoffs or "mode switches" that can be made between C and A, and some even believe that you can give up some P to get more C/A. Personally I believe that the probability of partition is always non-zero in even the best designed and fully provisioned networks, even including those that are econonomically infeasible, and that those who choose to treat a small probability as zero to capture a performance advantage or position themselves as "more consistent" than competitors are being a bit dishonest, but those tradeoffs are being made.

jamesaguilar · on April 26, 2011

> they define a three-dimensional space with the CAP theorem (especially in Gilbert and Lynch's formal proof) . . .

That's weird because I've read the proof and they speak only of boolean instances of C, A, and P in the proof. They give no examples of systems where any of the three variables have values other than zero or one.

arethuza · on April 25, 2011

Out of interest, where do you draw the line between local storage and distributed storage? By local storage do you mean directly a attached storage?

What about FC SANs or iSCSI over a WAN? Are they local or distributed?