I work for Red Hat, where I lead JBoss technical direction and research/development. Prior to this I was SOA Technical Development Manager and Director of Standards. I was Chief Architect and co-founder at Arjuna Technologies, an HP spin-off (where I was a Distinguished Engineer). I've been working in the area of reliable distributed systems since the mid-80's. My PhD was on fault-tolerant distributed systems, replication and transactions. I'm also a Professor at Newcastle University and Lyon.
Friday, April 11, 2008
Another blast from the past
I was in Neuchatel this week for some meetings, and one of our conversations moved on to failure detection versus failure suspicion: the fact that you cannot reliably detect failures until (and unless) those failures are eventually recovered from, because from the outside a crashed machine looks exactly like a slow or unreachable one. Typical "detection" uses timeouts, and if you pick the wrong value you can end up in a world of pain: too short and you declare live nodes dead, too long and you sit waiting on dead ones. That's where failure suspectors come in. The idea is that if you think something has failed, you make sure everyone else agrees with you, so that even if you are wrong you don't end up with split-brain syndrome. This reminded me of some work I did back in the 90's around quantum mechanics and failure detectors.
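To make the difference between a local timeout and an agreed suspicion concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not the design of any particular system: the `Suspector` class, the `agreed_failed` function, the timeout values and the rule that every other member must share the suspicion are all hypothetical, and a real implementation would exchange messages through an agreement protocol rather than inspect each member's state in one process.

```python
from dataclasses import dataclass, field


@dataclass
class Suspector:
    """One member's local, timeout-based view of its peers (hypothetical)."""
    name: str
    timeout: float
    last_heard: dict = field(default_factory=dict)

    def heartbeat(self, peer: str, now: float) -> None:
        # Record the time at which we last heard from this peer.
        self.last_heard[peer] = now

    def suspects(self, peer: str, now: float) -> bool:
        # Silence longer than the timeout is only a suspicion: the peer
        # may be slow or partitioned rather than crashed.
        last = self.last_heard.get(peer)
        return last is None or now - last > self.timeout


def agreed_failed(peer: str, members: list, now: float) -> bool:
    """Treat `peer` as failed only if every other member also suspects it."""
    others = [m for m in members if m.name != peer]
    return bool(others) and all(m.suspects(peer, now) for m in others)


# Assumed scenario: a and b both last heard from c at time 0,
# but a is configured with a much shorter timeout than b.
a = Suspector("a", timeout=2.0)
b = Suspector("b", timeout=5.0)
c = Suspector("c", timeout=2.0)
group = [a, b, c]
for m in (a, b):
    m.heartbeat("c", now=0.0)

local_suspicion = a.suspects("c", now=3.0)                # a's timeout has fired
group_verdict_early = agreed_failed("c", group, now=3.0)  # b is still waiting
group_verdict_late = agreed_failed("c", group, now=6.0)   # a and b now agree
```

At time 3 only `a` suspects `c`, so nothing is excluded and `a` cannot split off on its own; by time 6 both `a` and `b` suspect `c`, and only then is it treated as failed. Whether `c` has really crashed is still unknown at that point, which is the whole point: the group acts consistently on a suspicion instead of pretending to have detected a failure.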