I read some scribbling by some nerd working on distributed systems. The problem he mentioned is that when you take a task, parallelize it, and hand off the pieces to a bunch of workers, you aren't done until the last worker finishes. In that case long-tail latencies can bite you rather hard. If 99 out of a hundred workers finish their bit in 50-100us and one of them stalls out for 10ms, the whole job takes 10ms. A single worker grinding through all hundred pieces at 50-100us apiece would need 5-10ms, so you gained nothing over a single worker.
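To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It is not from the original write-up: the function names are made up for illustration, 75us is an assumed midpoint of the 50-100us range, and it assumes the lone worker never hits the stall itself.

```python
def fork_join_latency_us(piece_times_us):
    """Parallel job: it is done only when the slowest worker is done."""
    return max(piece_times_us)


def single_worker_latency_us(piece_times_us):
    """One worker runs every piece back to back."""
    return sum(piece_times_us)


# Assumed figures for illustration only.
NORMAL_US = 75      # assumed midpoint of the 50-100us range
STALL_US = 10_000   # the one 10ms straggler, in microseconds
PIECES = 100

healthy = [NORMAL_US] * PIECES
with_straggler = [NORMAL_US] * (PIECES - 1) + [STALL_US]

# Baseline: a single worker that never stalls.
serial_us = single_worker_latency_us(healthy)

for label, times in (("no straggler", healthy), ("one straggler", with_straggler)):
    parallel_us = fork_join_latency_us(times)
    speedup = serial_us / parallel_us
    print(f"{label}: parallel={parallel_us}us serial={serial_us}us speedup={speedup:.2f}x")
```

With these assumed figures the healthy fan-out finishes in 75us against 7500us for the single worker, a 100x speedup. One 10ms straggler drags the parallel job out to 10000us, which is slower than doing it all on one worker.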