Thursday, August 30, 2007

Heuristics, one-phase commit and compensations

It's a little-known fact that as well as being the world's first Web Services transactions product, HP-WST also had some pretty neat non-Web Services capabilities that we're only now starting to revisit. I've been in the process of writing a paper on one of them for what seems like an age, so I decided to give a brief outline here. But first a little background to put the rest into context.

One of the nice things we did with HP-WST from the start was keep the Web Services aspects separate from the core transaction engine. This is something we continued with XTS (now the Web Services transactions component of JBossTS). At the time the reason was that Jim and I needed to make parallel progress, with him concentrating on the SOAP stack (and doing some great work with the HP Web Services team) and me on the protocol engine. Another reason for the separation was to try to make debugging of problems a little easier. One thing you'll know if you've either developed a distributed system or used one is that distributed debugging can be a PITA. It was bad enough with CORBA, but Web Services take it to another level. So we had this nice clean separation that meant you could actually configure the system (dynamically) to appear to be running the whole Web Services stack when in fact it wasn't going anywhere near the network. If you knew what you were doing (read: undocumented feature) you could configure this "loop-back" to happen either before or after the SOAP messages were created.
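To make the separation concrete, here is a minimal sketch of the general idea in Java. It is not the HP-WST or XTS code: Transport, LoopbackTransport and ProtocolEngine are names invented for illustration, and the messages are plain strings rather than SOAP.

```java
import java.util.function.Function;

/** Hypothetical transport abstraction; the names are not from HP-WST or XTS. */
interface Transport {
    String send(String request);
}

/** Hands the request straight to an in-process handler, never touching the network. */
final class LoopbackTransport implements Transport {
    private final Function<String, String> handler;

    LoopbackTransport(Function<String, String> handler) {
        this.handler = handler;
    }

    @Override
    public String send(String request) {
        return handler.apply(request);
    }
}

/** The engine only sees Transport, so it cannot tell which implementation is plugged in. */
final class ProtocolEngine {
    private final Transport transport;

    ProtocolEngine(Transport transport) {
        this.transport = transport;
    }

    String prepare(String participant) {
        return transport.send("prepare:" + participant);
    }
}
```

Because the engine depends only on the interface, swapping the loop-back for a real network transport is a configuration choice rather than a code change, which is what makes local debugging practical.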

Now an important part of HP-WST was the compensation transaction model it supported. This was based on BTP at the time, but the idea still translates to WS-TX: instead of doing the work in the scope of a single transaction that holds on to locks and other resources for a potentially long time, you do the work in a series of smaller transactions that can each be compensated by some other transaction later. The coordinator (Atom or Cohesion in the case of BTP) remembers the list of participants and drives recovery in the event of failures, so even if your application crashes everything should be resolved.
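As a rough illustration of the model (and nothing more), the sketch below shows a coordinator that remembers each completed step and, on failure, runs the compensations in reverse order. CompensatingCoordinator, Step and the step names are all hypothetical, not the BTP or WS-TX API, and the participant list lives in memory only. A real coordinator would log it durably so that recovery could continue after a crash.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a compensation-based coordinator (in-memory, no durable log). */
final class CompensatingCoordinator {
    /** One small unit of work plus the action that undoes it. */
    interface Step {
        void perform() throws Exception;
        void compensate();
    }

    private final Deque<Step> completed = new ArrayDeque<>();

    /** Runs a step; only steps that finish successfully are remembered. */
    void run(Step step) throws Exception {
        step.perform();
        completed.push(step);
    }

    /** Compensates every remembered step, most recent first. */
    void compensateAll() {
        while (!completed.isEmpty()) {
            completed.pop().compensate();
        }
    }
}

final class CompensationDemo {
    /** Builds a step that records what it does in the given log; 'fail' makes perform() throw. */
    static CompensatingCoordinator.Step step(String name, List<String> log, boolean fail) {
        return new CompensatingCoordinator.Step() {
            @Override
            public void perform() throws Exception {
                if (fail) {
                    throw new Exception(name + " failed");
                }
                log.add("do " + name);
            }

            @Override
            public void compensate() {
                log.add("undo " + name);
            }
        };
    }

    /** Two steps succeed, the third fails, so the first two are compensated in reverse. */
    static List<String> runDemo() {
        List<String> log = new ArrayList<>();
        CompensatingCoordinator coordinator = new CompensatingCoordinator();
        try {
            coordinator.run(step("reserve-seat", log, false));
            coordinator.run(step("charge-card", log, false));
            coordinator.run(step("send-ticket", log, true));
        } catch (Exception e) {
            coordinator.compensateAll();
        }
        return log;
    }
}
```

Each step here stands in for one of the smaller transactions: it commits on its own and releases its resources straight away, and the compensation is what restores consistency if a later step fails.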

Because of the "local transport" aspect of HP-WST, people were able to write compensations for local applications, completely ignoring the Web Services stack. Some lighthouse customers found that an interesting prospect. In particular, when I was giving a presentation to one group in Madrid, we got on to something I'd been prototyping that offered a nice solution to the old problems of heuristics (how do I resolve a non-atomic transaction?) and having multiple one-phase commit participants in the same transaction (how do I resolve a non-atomic transaction?)

In both of these problem scenarios what typically happens is that someone (e.g., a system administrator) has to get to grips with the inconsistent data and figure out what was going on in the rest of the application in order to try to impose consistency. One of the important reasons this can't really happen automatically (at the TM level) is that it requires semantic information about the application that simply isn't available to the transaction system. So they compensate manually.

Until then. What we were proposing was allowing developers to register compensation transactions with the coordinator that would be triggered upon certain events, such as heuristic outcomes or one-phase errors. And to do it opaquely as far as the application developer was concerned. Because these compensations are part of the transaction, they'd get logged so that they would be available during recovery. Plus, a developer could also define whether presumed abort, presumed commit or presumed nothing was the best approach for the individual transaction to use (it has an effect on recovery and failure scenarios).

Nothing really earth shattering. We'd been offering this kind of thing for a long time through nested top-level transactions, for example. But HP-WST pushed it into a wider arena. With this approach you could write your compensations to try to undo the commit of the one-phase resource, for example, or if it can't be undone then write sufficient information to help the administrator resolve it. Likewise if triggered by a specific heuristic: try to compensate directly at the time the error occurs. Obviously nothing is ever guaranteed, but sometimes being able to try to compensate at the moment the problem happens can save you time and money later.
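To give a flavour of what registering such compensations might look like, here is a small Java sketch. Everything in it is an assumption made for illustration: TxEvent, Compensation and EventCoordinator are invented names, not the HP-WST interfaces, and the durable logging described above is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical events that can trigger a compensation. */
enum TxEvent { HEURISTIC_MIXED, HEURISTIC_HAZARD, ONE_PHASE_ERROR }

/** Hypothetical callback; returns true if the work was undone, false if it could not be. */
interface Compensation {
    boolean compensate(TxEvent event);
}

/** Sketch of a coordinator that fires registered compensations when an event occurs. */
final class EventCoordinator {
    private final Map<TxEvent, List<Compensation>> handlers = new EnumMap<>(TxEvent.class);
    private final List<String> adminNotes = new ArrayList<>();

    /** Registers a compensation for one kind of event. */
    void register(TxEvent event, Compensation compensation) {
        handlers.computeIfAbsent(event, e -> new ArrayList<>()).add(compensation);
    }

    /** Runs the compensations for this event; failures become notes for the administrator. */
    void raise(TxEvent event) {
        List<Compensation> registered =
            handlers.getOrDefault(event, Collections.<Compensation>emptyList());
        for (Compensation compensation : registered) {
            if (!compensation.compensate(event)) {
                adminNotes.add("unresolved: " + event);
            }
        }
    }

    List<String> adminNotes() {
        return adminNotes;
    }
}
```

A caller might register `event -> undoPayment()` against ONE_PHASE_ERROR, where undoPayment is again hypothetical. If the undo fails, the note left behind is the "sufficient information to help the administrator" case.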

Now where this becomes more interesting is when you consider annotations. Back in 2000 they didn't exist and we were playing with raw XML or explicit declarative approaches (the latter was a problem because we wanted to be able to apply this to existing deployments without requiring them to be re-coded). But annotations, and the work that Maciej has been doing, mean that revisiting this could result in something more powerful and certainly more opaque.

And on that note, back to work (and maybe the paper). Hopefully this has been enough to whet your appetite.

Sunday, August 26, 2007

XA versus WS-TX?

I'm really not quite sure what to say about this article. While the author is right that XA is more mature than WS-TX and that transactions are an important tool in an architect's tool-belt, saying that XA is a replacement for Web Services transactions is a bit like saying that because IIOP is more mature than SOAP we should all be using it. It's true, but it's never going to happen and overlooks what Web Services bring to distributed transactions: interoperability. I've written about that many times, so won't go over that again.

It's nice to hear that Oracle have identified problems with WS-TX. We all have throughout the evolution of the specifications/standard. WS-CAF offered a better solution overall, but didn't get the backing of IBM and MSFT, which is unfortunate: I still think that from an enterprise perspective all of the specifications within WS-CAF have technical advantages over WS-TX.

However, who hasn't identified problems in the way different XA implementations interpret the XA specification? Last time I looked, we had several workarounds for the differences between Oracle 9i and 10g, let alone how they differ between DB2 and SQLServer. Of course many of these are down to bugs in the respective XA implementations or wrong interpretations of the specification, but just saying something is XA compliant doesn't mean it immediately has a level of maturity.

WS-AT (or WS-ACID in WS-CAF) was developed to allow arbitrary two-phase commit participants to be enrolled in a transaction. Quite similar to OTS in that regard. Obviously XA is important, so it should be considered when providing any new transaction standard, but 2PC existed before XA, so it makes sense to not limit yourself if you don't have to. On that note, I hope I'm not alone in remembering the original XAML?!

Edwin's back

Via Greg, I see that Edwin is getting back in the game. I met Edwin a couple of times when we were working with Collaxa on integrating our XTS product with their BPEL product. Unfortunately some database vendor came along and messed that one up ;-)

Friday, August 24, 2007

OpenCSA Plenary is coming up

As I mentioned on InfoQ, the OpenCSA Plenary is coming in the next few weeks. This will be the first time that people from outside the original authors will be able to give their input on SCA directly to the authors. One way or another it will definitely be interesting. I'd love to be able to go, but it clashes with other things I've had planned for a long time. If you're at all interested in SCA and/or want to give feedback, go along and/or sign up to the various technical committees.

Synchronous versus Asynchronous

Pat makes a very good point. Something that also drives me nuts. This has actually gotten worse in the Web Services world, where people continually talk about asynchronous invocations when they're really talking about synchronous one-way invocations. Believe it or not, there is a significant difference!

Thursday, August 09, 2007

You know you're getting on when ...

1) You go on vacation and visit a toy museum only to find that many of the items on show are things you had when you were a kid.

2) You take your 13 year old son to the airport for his first unaccompanied flight to see his grandparents in Canada.