Sunday, December 30, 2007

A response to Greg

I started to write this as a comment to Greg's posting, but it got too long.

I think Greg still misunderstood me, though looking back at my posting I can understand why: just enough detail to confuse and not enough to clarify. Oh well, I was rushed.

First, the notion of a root coordinator isn't present in the WS-BP model at all (most certainly NOT OASIS BTP). The WS-BP approach leverages some of the JFDI (REST-based) transaction work we were doing in HP where once again there wasn't a global coordinator. It was much more akin to the weakly consistent replication models that use a "gossip" approach and no single (centralized) consistency manager, rather than the strongly consistent replication protocols that do use algorithms based on a single coordinator. Same reasons: it doesn't scale (number of participants as well as physical locality), it doesn't perform and it isn't working with the application/user (sometimes taking advantage of the application semantics can make it more efficient to implement a good replication protocol, particularly when you look at recovery). That's why I hinted that the transactions crowd can learn from the replication crowd.

As I think I said in the original (original) post (and during my keynote at DOA 2007): there's not necessarily a single coordinator; there will be "domains" that may have coordinators that drive participants within them (but that'll be implementation specific and hidden behind the "service" endpoint), and how these domains are pulled together into a global "transaction" will not necessarily be through a single coordinator at all. There may be a single coordinator to kick start any interactions, but that role could even be taken by the application. Semantic information about the application/service/specific interaction needs to be "injected" into this model.

Global coordination is definitely out. But that doesn't mean that at some point the state of the system will not be such that an external observer could not tell the difference between when one was used and when one was not used (ignoring timing constraints). As I said in the DOA keynote, it's a bit like Heisenberg's Uncertainty Principle at work: you can tell the state of the participants in the business "transaction" (interaction) but not when that state will appear, or you can look at the participant states at the exact same time but not see the same "values". Yes, the analogy breaks down under closer scrutiny, but it's a nice way to try to illustrate the differences and begin the discussion proper ;-)

If we ever get round to updating our book I can write an entire chapter around this and explain it oh so much better with diagrams. Oh and as usual: one size doesn't fit all (which makes this discussion harder to have in a blog!)


Anonymous said...

Hi Mark,

>If we ever get round to updating our book..
Can we expect the updated ed. sometime soon ?

Mark Little said...

We're discussing it. Every time we do, we seem to get closer.