Mark Little's WebLog

Sunday, March 11, 2012

Big Data

Data is important in everything we do in life. Whether it's a recipe for your favourite dinner or details of your bank account, we are all data driven. This is nothing new either: humanity, and in fact all life, is critically dependent on data (information) and has been for millions, or billions, of years. Therefore, maintaining that data is also extremely important; it needs to be available (in a reasonable period of time), often shareable, secure and consistent (more or less).

If we cannot get to our data then we could be in trouble, possibly catastrophically so. If it takes too long to get at then there may be no real difference between it being available or not. If someone else can get to our data without our permission and possibly modify it without our knowledge, then that could be even worse than not being able to access the information. And of course if the information is maintained in multiple places, perhaps so that we have backups if one copy is lost, then updates to one copy must eventually be made to the others.
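To make that last point concrete, here is a minimal sketch of lazily propagating updates between copies. Everything in it is hypothetical: the class and method names are mine, and it assumes a single global version counter with a "newest version wins" rule, which is only one of many ways to reconcile copies.

```python
class ReplicatedValue:
    """Toy model: one value held in several copies, updated lazily."""

    def __init__(self, copies: int, initial=None):
        # Each copy stores (version, value); version 0 is the initial state.
        self.copies = [(0, initial)] * copies
        self.pending = []      # (target index, version, value) not yet applied
        self._version = 0      # assumed global counter that orders the writes

    def write(self, copy_index: int, value) -> None:
        """Update one copy now and queue the update for all the others."""
        self._version += 1
        self.copies[copy_index] = (self._version, value)
        for i in range(len(self.copies)):
            if i != copy_index:
                self.pending.append((i, self._version, value))

    def propagate(self) -> None:
        """Apply queued updates; a copy ignores anything older than it holds."""
        for i, version, value in self.pending:
            if version > self.copies[i][0]:
                self.copies[i] = (version, value)
        self.pending.clear()

    def consistent(self) -> bool:
        return all(c == self.copies[0] for c in self.copies)
```

Between `write` and `propagate` the copies disagree, which is the "more or less" consistent window; once propagation has run they all hold the newest value.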

Over the centuries individuals and companies have grown successful and controlling from maintaining data securely for others or even managing it so that everyone has to go to them to obtain it. In our industry several large companies grew primarily because of this model. Other vendors became successful through other aspects related to data management or integration. Data is an enabler for everything in the software industry, whether it's middleware or operating system related.

So data is King. That is the way it has been and always will be. Everything we do uses, manipulates or otherwise controls data in some way, shape or form. The arrival of Cloud, mobile and ubiquitous computing does not change it. In fact ubiquitous computing increases the amount of data we have by several orders of magnitude. The bulk of the world's computing power is in embedded systems, i.e., systems designed to do a few dedicated functions, in real-time, using sensors for data I/O. Technically all smart-phones and tablets are embedded systems, not PCs. Major drivers for data in the coming years are smart-phones, tablets, sensors, green technology, eHealth/medical, industrial applications and system "health" monitoring.

Much of the new data coming on stream today contains a location, timestamp or both. There has been a tenfold increase in electronically generated data in 5 years. It is predicted that very soon there will be over a zettabyte of data (1 billion terabytes). That's the equivalent of a stack of DVDs halfway from here to Mars! Maintaining that data is important. But being able to use it is more important.
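As a rough sanity check on those figures, here is a small sketch. It assumes decimal (SI) prefixes and a constant compound growth rate; both are assumptions of the sketch rather than anything stated in the original numbers.

```python
# Assumed decimal (SI) prefixes: 1 TB = 10**12 bytes, 1 ZB = 10**21 bytes.
TERABYTE = 10**12
ZETTABYTE = 10**21


def terabytes_per_zettabyte() -> int:
    """How many terabytes fit in one zettabyte."""
    return ZETTABYTE // TERABYTE


def annual_growth_factor(total_factor: float, years: int) -> float:
    """Constant yearly multiplier that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)


if __name__ == "__main__":
    print(terabytes_per_zettabyte())              # 1000000000
    print(round(annual_growth_factor(10, 5), 3))  # 1.585
```

So a zettabyte is indeed a billion terabytes, and a tenfold increase over 5 years works out at roughly 58% growth per year if the growth is steady.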

It is now well known that issues with traditional RDBMS implementations, their architectures and assumptions, mean that they are insufficient to manage and maintain this kind of data, at least not by themselves. This has led to the evolution of BigData, with NoSQL and NewSQL implementations. There is a range of different approaches, including tuple space and graph-based databases, because one size does not fit all. Unlike the RDBMS, which is a good generic workhorse but not optimisable for specific use cases, these new implementations are targeted specifically at those use cases and conversely would make poor generic solutions.

It is extremely unlikely that a single standard, such as SQL, will evolve that works across all NoSQL implementations. However, there may be a set of standards, one for each category of approach. But for many, SQL will still remain a necessity, even if it means they cannot benefit from all of the advantages NoSQL offers: more people understand SQL and more applications are based on it, than anything else. So bridging these worlds is important. Finally it is worth noting that business analytics and sensor analytics will play a crucial role here.

In our industry we are now seeing an explosion in new database vendors looking to become the standard for this next generation. The current generation of developers is more heavily influenced by mobile, social and ubiquitous computing than by mainframes, so the RDBMS is not their natural first thought when considering how to manage data. However, most of these new companies are small open source startups and recognise the problems inherent in both: customers not trusting their important data to small companies that could go under or be acquired by a competitor; other vendors taking their code and creating competing products from it.

Furthermore, some large vendors who have failed to make an impact in the data space or inroads in enterprise middleware see this new area and these new companies as opportunities. As a result, symbiotic relationships are being created between these two categories of companies. Many of these NoSQL/NewSQL companies are going to merge, be acquired or fail. In the meantime new approaches and companies will be created. Over the next 5 years this new data space, which will integrate with the current RDBMS area, will coalesce and solidify. There are no obvious winners at this stage, but what is clear is that open source will play a critical role.

Tuesday, February 21, 2012

Clouds for Enterprises (C4E) 2012

Call for Papers: The 2nd International Workshop on

Clouds for Enterprises (C4E) 2012

http://nicta.com.au/people/tosicv/c4e2012/

held at the 16th IEEE International EDOC Conference (EDOC 2012) "The Enterprise Computing Conference"

(http://www.edocconference.org), Beijing, China, 10-14 September 2012

Important dates:

Paper submission: Sunday, 1 April 2012

Notification of acceptance: Monday, 28 May 2012

Camera-ready version: Friday, 15 June 2012

Description:

Cloud computing is an increasingly popular computing paradigm that aims to streamline the on-demand provisioning of software (SaaS), platform (PaaS), infrastructure (IaaS), and data (DaaS) as services. Deploying applications on a cloud can help to achieve scalability, improve flexibility of computing infrastructure, and reduce total cost of ownership. However, a variety of challenges arise when deploying and operating applications and services in complex and dynamic cloud-based environments, which are frequent in enterprises and governments.

Due to the security and privacy concerns with public cloud offerings (which first attracted widespread attention), it seems likely that many enterprises and governments will choose hybrid cloud, community cloud, and (particularly in the near future) private cloud solutions. Multi-tier infrastructures like these not only promise vast opportunities for future business models and new types of integrated business services, but also pose severe technical and organizational problems.

The goal of this one-day workshop is to bring together academic, industrial, and government researchers (from different disciplines), developers, and IT managers interested in cloud computing technologies and/or their consumer-side/provider-side use in enterprises and governments. Through paper presentations and discussions, this workshop will contribute to the inter-disciplinary and multi-perspective exchange of knowledge and ideas, dissemination of results about completed and on-going research projects, as well as identification and analysis of open cloud research and adoption/exploitation issues.

This is the second Clouds for Enterprises (C4E) workshop - the first was held in 2011 at the 13th IEEE Conference on Commerce and Enterprise Computing (CEC'11) on Monday, 5 September 2011 in Luxembourg, Luxembourg. The C4E 2011 workshop program, posted on the workshop Web page http://nicta.com.au/people/tosicv/clouds4enterprises2011/, included the keynote "Blueprinting the Cloud" by Prof. Willem-Jan van den Heuvel, presentations of 3 full and 5 short peer-reviewed workshop papers, and the discussion session "Migrating Enterprise/Government Applications to Clouds: Experiences and Challenges". The workshop proceedings were published by the IEEE and included in the IEEEXplore digital library, together with the proceedings of the main CEC'11 conference and the other co-located workshops. The Clouds for Enterprises (C4E) 2012 workshop will be held at another prestigious IEEE conference - the 16th IEEE International EDOC Conference (EDOC 2012) "The Enterprise Computing Conference" in Beijing, China, 10-14 September 2012. The main theme of the IEEE EDOC 2012 conference is "When Services in Cloud Meet Enterprises", so the C4E 2012 workshop is an excellent fit for, and addition to, the IEEE EDOC 2012 conference.

This Clouds for Enterprises 2012 workshop invites contributions from both technical (e.g., architecture-related) and business perspectives (with governance issues spanning both perspectives). The topics of interest include, but are not limited to:

Technical Perspective:

- Patterns and best practices in development for cloud-based applications

- Deployment and configuration of cloud services

- Migration of legacy applications to clouds

- Hybrid and multi-tier cloud architectures

- Architectural support for enhancing cloud computing interoperability and portability

- Architectural principles and approaches to cloud computing

- Cloud architectures for adaptivity or robustness

- Evaluation methods for cloud architectures

- Architectural support for dynamic resource management to support computing needs of cloud services

- Cloud architectures of emerging applications, such as mashup of enterprise/government services

- Impact of cloud computing on architecture of software and, more generally, IT systems

Enterprise/Government Application Perspective:

- Case studies and experience reports in development of cloud-based systems in enterprises and governments

- Analyses of cloud initiatives of different governments

- Business aspects of cloud service markets

- Technical and business support for various cloud service market roles, such as brokers, integrators, and certification authorities

- New applications and business models for enterprises/governments leveraging cloud computing

- Economic evaluation of cloud-based enterprises

Governance Perspective:

- Service lifecycle models

- Architectural support for security and privacy

- Architectural support for trust in/by cloud services

- Capacity planning of services running in a cloud

- Architectural support for quality of service (QoS) and service level agreement (SLA) management

- Accountability of cloud services, including mechanisms, algorithms and methods for monitoring, analyzing and reporting service status and usage profiles

- IT Governance and compliance, particularly in hybrid and multi-tier clouds

Review and publication process:

Authors are invited to submit previously unpublished, high-quality papers before

***1 April 2012***.

Papers published or submitted elsewhere will be automatically rejected. All submissions should be made using the EasyChair Web site http://www.easychair.org/conferences/?conf=c4e2012.

Two types of submissions are solicited:

* Full papers - describing mature research or industrial case studies, up to 8 pages long

* Short papers - describing work in progress or position statements, up to 4 pages long

Papers presenting and analyzing completed projects are particularly welcome. Papers about on-going research projects are also welcome, especially if they contain critical, qualitative and quantitative analysis of already achieved results and remaining open research issues. In addition, papers about experiences and comparative analysis of using cloud computing in enterprises and governments are also welcome. Submissions from industry and government are particularly encouraged. In addition to presentation of peer-reviewed papers this one-day workshop will contain a keynote from an industry expert and an open discussion session on practical issues of using clouds in enterprise/government environments.

Paper submissions should be in the IEEE Computer Society Conference Proceedings paper format. Templates (with guidelines) for this format are available at: http://www.computer.org/portal/web/cscps/formatting (see the blue box on the left-hand side). All submissions should include the author's name, affiliation and contact details. The preferred format is Adobe Portable Document Format (PDF), but Postscript (PS) and Microsoft Word (DOC) will be accepted in exceptional cases.

Inquiries about paper submission should be e-mailed to Dr. Vladimir Tosic (vladat at server: computer.org) and include "Clouds for Enterprises 2012 Inquiry" in the Subject line.

All submissions will be formally peer-reviewed by at least 3 Program Committee members. The authors will be notified of acceptance around

***28 May 2012***.

At least one author of every accepted paper MUST register for the IEEE EDOC 2012 conference and present the paper.

All accepted papers (both full and short) will be published by the IEEE and included in the IEEE Digital Library, together with the proceedings of the other IEEE EDOC 2012 workshops. A follow-up journal issue with improved and extended versions of the best workshop papers is also planned.

Workshop Chairs:

Dr. Vladimir Tosic, NICTA and University of New South Wales and University of Sydney, Australia; E-mail: vladat (at: computer.org) - primary workshop contact

Dr. Andrew Farrell, University of Auckland, New Zealand; E-mail: ahfarrell (at: gmail.com)

Dr. Karl Michael Göschka, Vienna University of Technology, Austria; E-mail: Karl.Goeschka (at: tuwien.ac.at)

Dr. Sebastian Hudert, TWT, Germany; E-mail: sebastian.hudert (at: twt-gmbh.de)

Prof. Dr. Hanan Lutfiyya, University of Western Ontario, Canada; E-mail: hanan (at: csd.uwo.ca)

Dr. Michael Parkin, Tilburg University, The Netherlands; E-mail: m.s.parkin (at: uvt.nl)

Workshop Program Committee:

The final workshop Program Committee will be listed soon at the workshop Web site: http://nicta.com.au/people/tosicv/c4e2012.

Sunday, February 19, 2012

HyperCard

A long time ago, and in what may seem to some like a galaxy far, far away, there was no web and no way of traversing resources via hyperlinks. In that time the PC was just taking off and most of us were lucky if we shared a computer with fewer than 5 people at a time! Back then I shared one of the original classic Macs and came across this wonderful piece of software that was to change the way I thought about the world. HyperCard was something I started to play with just because it was there and really for no other reason, but it quickly became apparent that its core approach of hypermedia was different and compelling. These days I can't recall all of the ways in which I used HyperCard, but I do remember that a few of them helped me in my roleplaying endeavours at the time (ok not exactly work related but sometimes you learn by doing, no matter what it is that you are doing!)

When the Web came along, the way that it worked seemed so obvious. Hyperlinks between resources, whether they're database records (cards) or servers, make a lot of sense for certain types of application. But extending it to a world wide mesh of disparate resources was a brilliant leap. I'm sure that HyperCard influenced the Web as it influenced several generations of developers. But I'm surprised at myself that I'd forgotten about it over the years. In fact it wasn't until the other day, when I was passing a shop window that happened to have an old Mac in it running HyperCard, that I remembered. It's over 20 years since those days, but we're all living under its influence.

Tuesday, February 14, 2012

Is Java the platform of the future?

I've mentioned this before, but I think we are living in a period where a bigger explosion of programming languages is occurring than at any time in the past four decades. Having lived through a number of the classic languages such as BASIC, Simula, Pascal, Lisp, Prolog, C, C++ and Java, I can understand why people are fascinated with developing new ones: whether it's compiled versus interpreted, procedural versus functional, languages optimised for web development or embedded devices, I don't believe we'll ever have a single language that's right for all developer requirements.

This Polyglot movement is a reality and it's unlikely to go away any time soon. Fast forward a few years and we may see far fewer languages around than today, but they will have been influenced strongly by their predecessors. I do believe that we need to make a distinction between the languages and the platforms that they inevitably spawn. And in this regard I think we need to learn from history now and quickly: unlike in the past we really don't need to reimplement the entire stack in the next cool language. I keep saying that there are core services and capabilities that transcend middleware standards and implementations such as CORBA or Java Enterprise Edition. Well guess what? That also means they transcend the languages in which they were written originally.

This is something that we realised well in the CORBA days, even if there were problems with the architecture itself. The fact that IDL was language neutral obviously meant your application could be constructed from components written in Java, COBOL and C++ without you either having to know or really having to care. Java broke that mould to a degree, and although Web Services are language independent, there's been so much backlash over SOAP, WSDL and friends that we forget this aspect at times. Of course it's an inherent part of REST.

However, if you look at what some are doing with these relatively new languages, there is a push to implement the stack in them from scratch. Now whilst it may make perfect sense to reimplement some components or approaches to take best advantage of some language capabilities, e.g., nginx, I don't think it's the norm. I think the kind of approach we're seeing with, say, TorqueBox or Immutant, where services implemented in one language are exposed to another in a way that makes them appear as if they were implemented natively, makes far more sense. Let's not waste time rehashing things like transactions, messaging and security, but instead concentrate on how best to offer these capabilities to the new polyglot movement in a way that makes them fit in as first class citizens.

And to do this successfully is much more than just a technical issue; it requires an understanding of what the language offers, what the communities expect and working with both to fit in seamlessly. Being a Java programmer trying to push Java services into, say, Ruby, with a Java programmer's approaches and understanding, will not guarantee success. You have to understand your users and let them guide you as much as you guide them.

So I still believe that in the future Java will, should and must play an important part in Cloud, mobile, ubiquitous computing etc. It may not be obvious to developers in these languages that they're using Java, but then it doesn't need to be. As long as they have access to all of the services and capabilities they need, in a way that feels entirely natural to them, why should it matter if some of those bits are hosted on or by a Java application server, for instance? The answer is that it shouldn't. And done right it means that these developers benefit from the maturity and reliability of these systems, built up over many years of real world deployments. Far better than the alternative.

Thursday, February 09, 2012

The future of Java

Just a couple of cross posts that are worth giving a wider distribution. First on whether this new polyglot movement is the death of Java, and second how the JCP process has been changing for the better over the years.

Tuesday, January 31, 2012

Blogging versus tweeting?

A few years ago when I was thinking about creating a twitter account I pondered about whether it was worth doing when I was blogging. I didn't think I'd use it much! Since creating the account I've been drawn into twitter more and more, so that today I'm finding the roles reversed: blogging is becoming less frequent whilst tweeting is increasing for me.

I think the reason why is pretty obvious: it is so much easier and quicker to tweet than to write a blog. But there are obvious limits in what you can say with 140 characters, so it's not an either/or situation for me. And yet as a result of using twitter I'm finding myself thinking less and less about blogging. That bit I don't quite understand. Now maybe it has nothing to do with my use of twitter; maybe I'd be blogging less regardless of it because of work, family life etc. Who knows? But I do know I find it interesting how twitter has insinuated itself into my life so quickly and seamlessly.

Sunday, January 01, 2012

Transactions on Android

Every year I try to make time for pet projects, be they learning new languages such as Erlang (one of my 2007 efforts), writing a discrete event simulation package in C++, or one of my best which was writing the world's first pure Java transaction service over Christmas 1996. Most of the time I don't manage to make much progress throughout the year, leaving the bulk of the effort for over the Christmas break.

This year was no different, with "port Arjuna (aka JBossTS) to Android" on my to-do list for months. I've been playing around with Android for quite a while, even collaborating with some friends on writing a game (iPhone too). I know that although it's Java-based, there are enough differences to make porting certain Java applications tricky. But over the years I have found porting transactions to different languages and environments a pretty good way to learn about the language or environment in question.

So as well as doing my usual catch-up on reading material, breaking the back of the Android port was top of my list. Now in the past I'd have higher expectations of what I could accomplish in this time period, but these days I have a family and some things take priority (well, most of the time). But once everyone had opened their presents, let the turkey settle in the stomach and sat down to watch The Great Escape (hey, it's Christmas!) I found time to kick it off.

I started simple in order to remove as many variables from the problem as possible. So I went back to JavaArjuna, the ancestor of JBossTS and all that predated it. It has none of the enhancements that we've added over the years, but places fewer requirements on the infrastructure. For instance, it was JavaArjuna that I ported to the HP Jornada back in 2001 because it also worked with earlier versions of Java.

As in 2001 it went well and it wasn't long before I had transactions running on my Android device. It was nice to see one of the basic tests running and displaying the typical transaction UIDs, statuses, rolling back, committing, suspending etc. Then I moved on to JBossTS. It wasn't quite as straightforward and there are a few hacks or workarounds in there while I figure out the best way to fix things, but it's done too! I'm pretty pleased by the results and will spend whatever time I have in the coming weeks to address the open issues. And I've definitely learned a lot more about Android.
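For anyone who hasn't seen what such a basic test exercises, here is a toy sketch of a transaction with a UID, a status, commit and rollback. It is purely illustrative: the names are hypothetical, it is not the JavaArjuna or JBossTS API, and it leaves out suspension, recovery and everything else that makes a real transaction service hard.

```python
import uuid
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled back"


class ToyTransaction:
    """Buffers writes to a dict and applies them only on commit."""

    def __init__(self, store: dict):
        self.uid = uuid.uuid4()      # unique identifier for this transaction
        self.status = Status.ACTIVE
        self._store = store
        self._writes = {}            # changes not yet visible in the store

    def _require_active(self) -> None:
        if self.status is not Status.ACTIVE:
            raise RuntimeError(f"transaction {self.uid} is {self.status.value}")

    def put(self, key, value) -> None:
        self._require_active()
        self._writes[key] = value

    def commit(self) -> None:
        self._require_active()
        self._store.update(self._writes)   # make all buffered writes visible
        self.status = Status.COMMITTED

    def rollback(self) -> None:
        self._require_active()
        self._writes.clear()               # discard the buffered writes
        self.status = Status.ROLLED_BACK
```

A rolled-back transaction leaves the store untouched, a committed one applies all of its writes, and any further use of a finished transaction raises an error.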

So overall I think it's been a good pet project for 2011. It also showed me yet again that the architecture and code behind JBossTS that the team's been working on for years is still a highly portable solution. It doesn't matter whether you want transactions on a mainframe, in the cloud, or on a constrained device, JBossTS can do them all!

Saturday, December 31, 2011

PaaS 2.0?

A while ago I had some things to say about people trying to add a version number to SOA. At the time it was 2.0 and I like to think I had a little to do with the fact that it died almost as quickly as it was created. I won't go into details, but the interested reader can catch up on it all later.

Now a friend who got caught in the SOA 2.0 crossfire came to me recently and pointed out that some people are now trying to coin the term 'PaaS 2.0' and asked my opinion. At first I didn't know what to think because the original reasons I was against SOA 2.0 didn't seem to apply here because PaaS is so poorly understood. There are no fundamental architectural principles around which it has grown. There are very few examples that everyone agrees upon. There's not even an accepted definition!

But that's when it hit me! How can you assign a version to something that is so ill defined? It's not like the Web, for instance, where it made sense to have a 2.0. Ok there's some good stuff from the likes of NIST, but there's no agreed reference architecture for PaaS, so how precisely can you say something is PaaS 2.0? The answer is that you can't. Now that doesn't mean you won't be able to do so eventually, but there are quite a few prerequisites that have to be satisfied before that can occur.

So what does this mean? Am I against PaaS 2.0 as I was with its SOA cousin? Yes I am, but for different reasons. As I outlined above, I think it's wrong to try to version something that is so ill defined. Let's figure out what PaaS 1.0 is first!

Friday, December 23, 2011

Future of Middleware

I think it's fair to say that despite my years in industry I'm still an academic at heart. I like the ability to spend time working on a problem without the usual product deadlines. Of course there's the potential that you come up with something that has little relevance to the real world, but that can be mitigated by staying close to industry through funding, sponsorship or other relationships. Often in industry we don't have the luxury of spending years coming up with the perfect solution and whilst it's for very good reasons, it can be frustrating at times for those involved.

But we all make the best of what we have to work with and I love my current position, despite the fact I get to spend less time researching than I would like. In fact in some ways I now understand what Santosh has been doing for years in directing and pushing others in the right directions, whilst at the same time wanting to get more involved himself but not quite having enough time to do it all.

Therefore, I take any opportunity I can find to dive back into research, write papers, code etc., and to attend, and hopefully present at, conferences and workshops that are often dominated by the research community, though obviously with practical overtones. The Middleware conference is one such event that I love to take part in, one way or another. Over the years I've had papers there and been on the program committee, and not once have I been disappointed by the quality of submissions.

So it was great to be asked to write a paper with Santosh and Stuart on the future of middleware for FOME. Truth be told, Santosh did the bulk of the writing and his co-authors provided the disparate data and input that he's excellent at being able to form into a coherent whole. The result is a great paper that I presented in Portugal earlier this month. It went down well and I got a lot of good feedback, both from the academics present as well as industrial participants.

But the real high for me was just being at the workshop and listening to all of the other presentations. I had a wonderful time meeting with others there and getting as immersed in the research atmosphere as it's possible to do in 48 hours. I could cast my mind back many years to when I was in full-time research and compare and contrast with today. I got a lot out of the many conversations I had with researchers, both old and new to the field. I hope I had a positive impact on them too, because I came away invigorated and my mind full of new possibilities.

Sunday, November 20, 2011

Wave sick?

What with HPTS, JUDCon, JAX, QCon, JavaOne and various business meetings, I've been doing a lot of traveling recently. Time spent on a plane usually means my mind wanders a bit, covering various topics some of them unrelated. One of the things I got thinking about though, was definitely influenced by a series of talks I've been giving for a while, including at my JBossWorld keynote: the history of distributed systems.

I covered it at Santosh's retirement event too, but from a very personal perspective, i.e., how Arjuna related to it, and relate it did, often in a big way. So this got me to thinking about the various technology waves I've lived through and helped influence in one way or another. And it was quite "chilling" for me to realise how much I'd forgotten about what's happened over the past third of a century or more! (And that made me feel old too!)

I often take for granted that I lived through the start of the PC era: there were no PCs when I first started to code. In fact I'd been developing applications on a range of devices before IBM came out with the first PC or before Microsoft came out with the first version of Word. I moved through the era of the BBC, ZX80, Commodores, Ataris, etc. into the first Sun machines, Apples, PCs, laptops, desktops, PDAs, smartphones, pads and much much more. A huge change in the way we interact with computers and importantly the data they maintain. Many different paradigm shifts!

Looking at the middleware shifts that accompanied some of these hardware changes and in fact were often driven by them, I've ridden a number of equally important waves: RPC, distributed objects, bespoke enterprise middleware architectures and implementations, standards-based middleware, several explosions of languages (including functional and object-oriented ones), Java, open source, Web Services, REST, mobile, ubiquitous computing, and of course fault tolerance running throughout with transactions and replication. And I'm probably forgetting other things in between these waves.

It's been a bumpy ride at times. The move from CORBA to J2EE wasn't necessarily as good as it could have been. Web Services were often vilified far more than made sense if you were objective. But overall I've enjoyed the ride so far, more or less. And now with Cloud, mobile and beyond it's looking like the next decade will be at least as interesting as the last three. I'm looking forward to playing my part in it!

Thursday, November 03, 2011

The future PC

I've been thinking a lot about what personal compute devices might look like in the future given the amount of time I've been looking at how things have evolved over the past 30 years. Not so much about what a mainframe or workstation computer might look like (assuming they even exist!) but what replaces your laptop, pad, phone etc. Now of course much of what I will suggest is influenced by how I would do it if I could. However, there's also a smattering of technical advancements in there for good measure.

So my biggest bugbear with my current situation is that I have a laptop (two if I include my personal machine), an iPad and a smartphone (two if I include the one I use for international travel). Each of them holds some subset of my data, with only one (laptop) holding it all. Plus some of that data is held in the cloud so I can share it with people or between devices. This is manageable at the moment, but it's frustrating when I need something on my iPad that's on my laptop or something on my phone that's on the iPad (you get the picture).

What I want is the same information available on all of these devices. In fact, what I want is one device that does it all. I rarely use my phone and pad concurrently, or my pad and laptop. There are exceptions of course, but bear with me. (I may be unique in this and some people might want multiple concurrent devices. But that's still possible in this environment.) What would typically satisfy me would be a way to modify the form factor of my device dynamically over time, taking a touchscreen smartphone through a pad and then to a laptop with large screen, keyboard and trackpad. At each stage I'd like the best performance, graphically and compute, and the most storage.

Is this possible? Well if you look at how hardware has evolved over the past decades it's not that far off. ARM dominates the smartphone arena and although Intel/AMD will eventually find a way into the market my money is on ARM to get to the laptop and workstation performance before they get to the low power consumption sector in any significant manner. So ARM powered laptops that perform equally with their Intel/AMD cousins aren't far off.

What about main memory? Well you only have to look at how things have evolved recently from 512MB through to 8GB and beyond. It's going to be possible to have 8GB smartphones and tablets soon. And SSDs are getting cheaper and cheaper by the month. Capacity-wise it may take them longer to get to the sizes of spinning disks, but once most laptop manufacturers include SSDs by default, the cost per gigabyte will plummet as their physical size continues to shrink. Putting multiple instances in the same device will be possible to fill the size gap too.

Now you could assume that what I'm outlining is a portable disk drive, but it really isn't. I'm assuming it has storage, of course, but I'm also assuming it has a CPU and probably a GPU. Think plug computer, but much smaller and with much more power: certainly the processing power to rival a laptop and probably the graphical power too. I say 'probably' only because I can see situations where the GPU could be part of the form factor you plug the device in to so that you can do work, e.g., the phone housing or the keyboard/screen.

Ok so there we are: my ideal device is the size of a gum packet (much smaller and you'll lose it) and can be plugged into a range of different deployment chassis. Now all I have to do is wait!

Friday, October 28, 2011

HPTS 2011

I'm back from HPTS and as usual it was a great snapshot of the major things happening or about to happen in our industry. In past years we've had Java transactions, ubiquitous computing, transactions for the Internet and the impacts of large scale on data consistency. We've also had discussions on the possible impact of future (back then) technologies such as SSD and flash on databases and transactions.

This year was a mix too, with the main theme of cloud. (Though it turned out that cloud wasn't that prevalent.) I think the best way to summarise the workshop would be concentrating on high performance and large scale (up as well as out). With talks from the likes of Amazon, Facebook and Microsoft we covered the gamut from incredibly large numbers of users (700 million) through to eventual consistency (which we learnt may eventually not be enough!).

Even 25 years after it started it's good to see many of the original people behind transaction processing and databases still turning up. (I've only been going since the 90s.) This includes Stonebraker (aka Mr Alphabet Soup), who gave his usual divisive and entertaining talks on the subject of SQL and how to do it right this time, and of course Pat, who instructed us never to believe a #%^#%-ing word he says as he's now back on the side of ACID and 2PC! (All in good fun, of course.)

Now it's important to realise that you never go to HPTS just for the talks. The people are more than 50% of the equation for this event and it was good to see a lot of mixing and mingling. We had a lot of students here this time, so if past events are anything to go by I am sure we will see the influence of HPTS on their future work. And I suppose that just leaves me to upload the various presentations to the web site!

Thursday, October 13, 2011

Where have all the postings gone?

I know I've been blogging a lot this year, yet when I look at the entry count for this blog it's not as high as I had expected. Then I realised that most of my attention has been directed at JBoss. So if you're wondering where I am from time to time, it's over here.

RIP Dennis Ritchie

It's safe to say that no programming language has had as big an impact on my career as C. It's also safe to say that no operating system has had as big an impact on my career as Unix. So for these reasons and many others it is sad to hear about the passing of Dennis Ritchie. I met him once, many years ago, when he visited the University and spoke about a range of things, including C and Plan 9. He was a great speaker, someone who helped shape the world we live in, and a nice man. Yet another sad day.

Sunday, October 09, 2011

Bluestone and Steve Jobs

Amongst all of the various articles and tributes to Steve Jobs I came across this from Bob, who is one of the people that has influenced my career significantly (and positively!) over the years. So it's interesting to read the influence Steve had on Bob, Bluestone, HP and hence Arjuna, JBoss and now Red Hat (not forgetting the other companies with which Bob's been involved over the years)! Thanks Bob and thanks, indirectly, Steve.

Thursday, October 06, 2011

A sad day for Apple and the world

I'm an Apple user and have been for many years (since the 1990s). I've had desktops, laptops, iPods, iPhones and of course various software. I've been an admirer of Apple and Steve Jobs for just as long, so it's really sad to hear that he has passed away today. The world is a little bit darker now, but I hope his legacy lives on. My thoughts go out to his family.

Sunday, September 11, 2011

September 11th

It's that time of year again but of course this time it's 10 years on. Time to reminisce and remember those who weren't so lucky. I've been thinking about this day for a while and wondering about all of those things that I managed to do in the last decade that I wouldn't have been able to if I'd made a slightly different choice back then. They include many of the things I mentioned previously, such as HP, Arjuna Technologies, JBoss, Red Hat, standards involvement etc.

But they all pale into insignificance when I look at my 9 year old son! And then there's nothing more I can really say except thanks.

Sunday, September 04, 2011

The impact of Arjuna

I've mentioned before that I have the privilege of speaking at Santosh's retirement ceremony. I've also said on several occasions how much Santosh and the Arjuna project have influenced my life over the years. So I decided to speak about the transition of Arjuna from a research project that was originally just the vehicle for several of us to get our PhDs, through to today when it's at the heart of the most downloaded application server in the world.

Fairly obviously I have lived through the transition over the past 25 years. And despite having parted ways with my company in 2005, I've been able to continue to work with them, as well as obviously shepherding the transaction system through JBoss and Red Hat. However, it wasn't until I started to write my presentation that everything we've done over the years came back to me. (I suppose that being so close to things sometimes makes you forget.) I found it really hard to cram 25 years into a 60 minute session, so many things had to be left out or confined to a single bullet. For a start, when Arjuna was still a research effort it managed to help at least a dozen of us get PhDs, was the basis for over 50 papers and technical reports, and influenced distributed systems research and companies from IBM to Sun Microsystems.

But it's when you look beyond the research that the real impact becomes apparent. For a start, in 1994 we used it to implement a distributed student registration system that is still not matched by the one now provided by a certain large business management software purveyor. In 1995 the OTS was being developed and that was already influenced by Arjuna, since Graeme was now at Transarc. It wasn't too long before we began to implement an OTS compliant transaction system using Arjuna and this was my first dealing with standards. We also got involved with IBM, Alcatel and others in defining standards for extended transactions through the CORBA Activity Service (which would later be the basis for the various Web Services transactions efforts.) At about the same time Stuart was driving the workflow submission with Nortel and working on OpenFlow.

Then in 1996 Sun released Oak, later to become Java. We all started to use it in a number of areas, including games, a browser (great way to learn HTML) and a web server. I looked at end-to-end transactions and then decided that an even better way to learn the language would be to implement Arjuna in Java. Over two weeks at Christmas 1996 JavaArjuna was born (later to become JTSArjuna when I ported the OTS.) This was before J2EE, before JTA and before JTS. So not only was this the world's first 100% pure Java JTS, it was the world's first 100% pure Java transaction service.

It was around then that we created a company to market the Java and C++ implementations. We were acquired by Bluestone, which was subsequently acquired by HP, and Arjuna went into their product suites to compete against BEA and IBM (there was no sign of Oracle middleware in those days!) While our time at HP was limited, we still managed to work on two Web Service transactions standards efforts as well as produce the world's first such product. We also branched out into high performance messaging and building an ORB.

When HP decided it couldn't make a go of software, we created another startup to concentrate on transactions and messaging. We had several successful years, making sales to the likes of TIBCO and WebMethods, creating two new Web Service standards committees in OASIS and finalising two of them (BTP and WS-TX). We also found a market by replacing the transactions and messaging components in JBoss 3 with our own. And within all this, there was still time to write many papers, give many presentations and achieve more world firsts, such as XTS.

As I said earlier, in 2005 we sold transactions to JBoss and I bid farewell to Arjuna the company, though obviously Arjuna the technology stayed pretty close! Over the intervening 6 years and an acquisition by Red Hat, we've seen Arjuna (aka JBossTS) incorporated into every version of AS as well as all of our platforms and many projects, even if they're not written in Java. The teams have branched out into REST as well as Blacktie, to offer XATMI support. There's also work on software transactional memory using JBossTS and now, with the move of Red Hat into the cloud, it's available in OpenShift and beyond.

Even this blog is way too short to cover everything that has happened on this 25 year long journey. I haven't been able to cover other aspects such as OpenFlow and messaging, or the impact of the people who have passed through the Arjuna project and Arjuna companies. I've also only hinted at how all of the research we did at the University or in industry has influenced others over the years. I think in order to really do the Arjuna story justice I need to write a book!

Monday, August 29, 2011

Enterprise middleware and PaaS

I wanted to say more about why existing enterprise middleware stacks can be (should be) the basis for realistic PaaS implementations. If I get time, I may write a paper and submit it to a journal or conference but until then, this will have to do. I'm talking about this at JavaOne this year too, so a presentation may well come out soon.

Sunday, August 21, 2011

Fault tolerance

There was a time when people in our industry were very careful about using terms such as fault tolerance, transactions and high availability, to name just three. Back before the Internet really kicked off (really when the web came along), if you were emailing someone then they tended to be either in academia, in which case they'd be summarily shot for misusing a term, or in the DoD, in which case they'd probably be shot too! If you were publishing papers or your thoughts for wider review, you tended to have to wait for a year to see publication, and that was if reviewers didn't shoot you down for misusing terms, in which case you had to start all over again. So it paid to think long and hard before you did the equivalent of hitting submit.

Today we live in a world of instant publishing and less and less peer review. It's also unfortunate that despite the fact more and more papers, articles and journals are online, it seems that fewer and fewer people are spending the time to research things and read up on state of the art, even if that art was produced decades earlier. I'm not sure if this is because people simply don't have time, simply don't care, don't understand what others have written, or something else entirely.

You might ask what it is that has prompted me to write this entry? Well on this particular occasion it's people using the term 'fault tolerance' in places where it may be accurate when considering the meaning of the words in the English language, but not when looking at the scientific meaning, which is often very different. For instance, let's look at one scientific definition of the term (software) 'fault tolerance'.

"Fault tolerance is intended to preserve the delivery of correct service in the presence of active faults. It is generally implemented by error detection and subsequent system recovery.
Error detection originates an error signal or message within the system. An error that is present but not detected is a latent error. There exist two classes of error detection techniques: (a) concurrent error detection, which takes place during service delivery; and (b) preemptive error detection, which takes place while service delivery is suspended; it checks the system for latent errors and dormant faults.
Recovery transforms a system state that contains one or more errors and (possibly) faults into a state without detected errors and faults that can be activated again. Recovery consists of error handling and fault handling. Error handling eliminates errors from the system state. It may take two forms: (a) rollback, where the state transformation consists of returning the system back to a saved state that existed prior to error detection; that saved state is a checkpoint, (b) rollforward, where the state without detected errors is a new state."

There's a lot in this relatively simple definition. For a start, it's clear that recovery is an inherent part, and that includes error handling as well as fault handling, neither of which is trivial to accomplish, especially when you are dealing with state. Even error detection can seem easy to solve if you don't understand the concepts. Over the past 4+ decades all of this and more has driven the development of protocols behind transaction processing, failure suspectors, strong and weak replication protocols, etc.
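To make the terms in that definition a little more concrete, here is a deliberately tiny sketch of concurrent error detection followed by rollback to a checkpoint. Everything in it (the CheckpointedAccount class, its method names and the "balance must not go negative" invariant) is a hypothetical illustration of my own and not taken from any real transaction system. It also assumes a single process with in-memory state, and so it sidesteps fault handling, durability and distribution, which is where the real difficulty lies.

```python
import copy


class CheckpointedAccount:
    """Toy state holder (hypothetical) showing detection plus rollback."""

    def __init__(self, balance):
        self.state = {"balance": balance}
        self._checkpoint = None

    def checkpoint(self):
        # Save a copy of the state as it exists before the next operation.
        self._checkpoint = copy.deepcopy(self.state)

    def detect_error(self):
        # Concurrent error detection: check an invariant during service delivery.
        # The invariant (no negative balance) is an assumption for this sketch.
        return self.state["balance"] < 0

    def rollback(self):
        # Error handling by rollback: return to the saved (checkpointed) state.
        self.state = copy.deepcopy(self._checkpoint)

    def withdraw(self, amount):
        self.checkpoint()
        self.state["balance"] -= amount
        if self.detect_error():
            self.rollback()
            return False
        return True


account = CheckpointedAccount(100)
ok_first = account.withdraw(30)    # leaves the invariant intact
ok_second = account.withdraw(500)  # violates the invariant, so it is rolled back
print(ok_first, ok_second, account.state["balance"])
```

Even in this toy the checkpoint lives in the same memory as the state it protects, so a crash of the process loses both. Dealing with that properly is exactly what the protocols mentioned above were developed for.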

So it's both annoying and frustrating to see people talking about fault tolerance as if it's as easy to accomplish as, say, throwing a few extra servers at the problem or restarting a process if it fails. Annoying in that there are sufficient freely available texts out there to cover all of the details. Frustrating in that the users of implementations based on these assumptions are not aware of the problems that will occur when failures happen. As with those situations I've come across over the years where people don't believe they need transactions, the fact that failures are not frequent tends to lull you into a false sense of security!

Now before anyone suggests that this is me being a luddite, I should point out that I'm a scientist and I recognise fully that theories and practices in many areas of science, e.g., physics, are developed based on observations and can change when they prove to not be sufficient to describe the things you see. So for instance, unlike those who in Galileo's time continued to believe the Earth was the centre of the Universe despite a lot of data to the contrary, I accept that theories, rules and laws laid down decades ago may have to be changed today. The problem I have in this case though, is that nothing I have seen or heard in the area of 'fault tolerance' gives me an indication that this is the situation currently!