Monday, December 31, 2012

Adventures in Pi Land Part Two

In the first part of this blog we looked at the initial setup of the Raspberry Pi. In this one we'll look first at building and running Fuse Fabric, followed by vert.x. So first Fabric, and before I go on it's worth mentioning that this is still not complete: I need to check with the team in the new year to get more details on the problems, which could very well be due to my setup on the Pi.

Initially we had 256 Meg of swap and maven2 installed, for which the Fabric build docs tell us to use the m2 profile option (and the right MAVEN_OPTS). Unfortunately the initial build attempt failed with maven2, so I installed maven 3.0.4, which gets us a little further:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12:test (default-test) on project archetype-builder: Error while executing forked tests.; nested exception is Cannot run program "/bin/sh" (in directory "/home/pi/fusesource/fuse/tooling/archetype-builder"): error=12, Cannot allocate memory -> [Help 1]

So we increase swap to 1024 Meg by again editing the /etc/dphys-swapfile as mentioned before. This helped, but even with these options, the Pi crashed after about 6 hours of building. (Yes, the Pi crashed!) And as you can see from the screen shot, prior to this Java was taking up a lot of the processor time.

And if not Java, then writing to swap:

After playing with more configuration parameters (with failures 6+ hours later) and ordering a faster SD card (which has not turned up at the time of writing), I decided to try the latest official binary of Fabric from the download site (fuse-fabric-7.0.1.fuse-084) - no point struggling to get it to build if it won't run anyway. Working through the Getting Started tutorial did not produce the expected output, which needs investigation once I get back to work and can talk with the team. For instance, when trying to use camel ...

And over 7 minutes later the same command gave:

And running top showed the processes were fairly idle:

I spent a few more hours trying other options, but nothing really seemed to make a difference or indicate whether the issues were with the Pi or elsewhere. But I still consider this a success, since I learnt a few more things about the capabilities and limitations of the Pi that I hadn't known before, as well as about Fabric (which was the original project I had set myself). And I will return to this and see it through to completion soon.

So onwards to vert.x. Initially I decided not to build the latest source, but to just see if the official binary would run. Now vert.x requires JDK 7, so we need to download and install that from Oracle. Next, downloading and installing vert.x and following the installation guide brings us to:

Success! It took over a minute for the server.js example to initialise (since I'm connecting remotely to the Pi, I changed localhost to the actual IP address). But HelloWorld eventually worked.

We're on a roll here! Next we moved on to the JavaScript Web Application Tutorial - I want to test as much of vert.x as possible and implicitly the Raspberry Pi's boundaries. This tutorial needs MongoDB. Unfortunately you can't just install the Linux distribution version as it's not suitable for the Pi. I looked around to see if there was a build I could get elsewhere and found one, but unfortunately that needs the hardware floating point version of Wheezy, which we can't use if we are using Java 6 or 7.

Someone said that you can build an older version of MongoDB from source which I tried. To do this I made sure we were still working with 1024 Meg of swap. But unfortunately after about 10 hours, the build failed:

Fortunately a little digging around found a couple of other options and I went with the first, i.e., Still with 1024 Meg of swap, after 12 hours of building and similarly for installation, I ended up with a running version of MongoDB on the Raspberry Pi!

I was then able to progress through the example, but still had a few failures due to incorrect module versions mentioned in the walkthrough. Thanks to Tim for giving the right numbers and he's going to update the pages asap. There are still a few niggles with the example failing during authentication, but I can live with them for now and will get to the bottom of them eventually.

But what about building vert.x from scratch on the Pi? Well, that was a lot easier than I had thought given everything else that's happened so far. Building vert.x head from source and getting Jython 2.5.2, we initially get an error:

But if you use the nightly build of gradle:

Which eventually results in ...

Success again!

So what does all of this mean? Well apart from the obvious areas where I still need to see success, I think it's been a great exercise in learning more about the Pi. I've also learnt a lot more about the various projects I wanted to play with this festive season. I'm happy, even if it did take quite a few more days to do than I expected originally!

Adventures in Pi Land Part One

Over this Christmas vacation I set myself a number of pet projects that I'd normally not have time to do during the rest of the year. Over the last 6 months or so I've been playing with the Raspberry Pi, but not really pushing it a lot - more a case of playing around with it. So I decided that I'd try and make all of my projects over Christmas relate to the Pi in one way or another. This blog, and the follow-up, will relate what happened.

OK, so before we really get going it's worth looking at the Pi setup. In keeping with its background, setting up the Pi is pretty simple and you can find details in a number of places including the official Pi site. But I'll include my configuration here for completeness. First, I've been using one of the original Model B instances, i.e., one with 256 Meg of memory and not the newly updated version with 512 Meg. As a result, if you've got a newer version then you may be able to tweak a few settings, such as the swap space.

Because I'm playing with JDK 6 and 7, I used the soft-float variant of Wheezy. After burning that to an SD card, remember to use raspi-config to get back the entire disk space, or you'll find an 8 Gig SD card only appears to have a few hundred Meg free! And don't forget to use the right kind of SD card - faster is better. I run my Pi headless (no free monitor or keyboard these days), so initially I had it connected to my router via an ethernet cable and then immediately configured wifi. How you do this will depend upon the wifi adapter you use, but I'm happy with the Edimax EW-7811Un and you can get information about how to update the kernel with the right driver from a number of places.

Once wifi was up and going, I changed the swap size for the Pi. In the past this wasn't an issue, but then I hadn't been about to build Arjuna, Fuse, vert.x and MongoDB! You can modify swap by editing /etc/dphys-swapfile and then running /etc/init.d/dphys-swapfile stop followed by /etc/init.d/dphys-swapfile start. Initially I started off with 256 Meg of swap, but as you'll see later, this wasn't always sufficient! Finally let's start by adding openjdk 6 (sudo apt-get install openjdk-6-jre openjdk-6-jdk) followed by git and maven (sudo apt-get install maven2 git).
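For anyone who would rather script the swap change than edit the file by hand, here is a minimal sketch in Python. It assumes the swap size is controlled by the CONF_SWAPSIZE entry (in megabytes) that dphys-swapfile reads from /etc/dphys-swapfile; the function name set_swap_size is made up for illustration, and it only transforms the text of the file. Writing the result back needs root, and the stop/start commands above still have to be run afterwards.

```python
import re


def set_swap_size(config_text, megabytes):
    """Return config_text with CONF_SWAPSIZE set to the given size in MB.

    Hypothetical helper: it works on the file contents as a string and
    does not touch /etc/dphys-swapfile itself.
    """
    if megabytes <= 0:
        raise ValueError("swap size must be positive")
    new_line = "CONF_SWAPSIZE={}".format(megabytes)
    # Match an existing setting, including one that is commented out.
    pattern = re.compile(r"^[ \t]*#?[ \t]*CONF_SWAPSIZE[ \t]*=.*$", re.MULTILINE)
    if pattern.search(config_text):
        # Replace only the first occurrence and leave the rest untouched.
        return pattern.sub(new_line, config_text, count=1)
    # No setting present at all: append one on its own line.
    separator = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return config_text + separator + new_line + "\n"


sample = "# swap size in MBytes\nCONF_SWAPSIZE=256\n"
updated = set_swap_size(sample, 1024)
print(updated)
```

Keeping the function free of file access makes it easy to try out on a copy of the configuration before touching the real thing.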

So this brings us to a base from which we can proceed with the real projects. The first one, which was building Arjuna/JBossTS/Narayana, was pretty straightforward compared to the others and has been documented elsewhere. Which means in the next instalment we'll look at Fuse Fabric, vert.x and, because of that project, MongoDB.

Wednesday, December 26, 2012

Thunderbirds are go!

Over the year a number of people who were influential in my life have died. So it's with great sadness that I just heard that Gerry Anderson passed away today! Growing up in the late 60's and early 70's, programs such as Space 1999, Thunderbirds, Joe 90, Stingray and Captain Scarlet were the height of TV watching for kids. Each episode was an event, probably on the same scale as blockbuster films are today for kids. Having two children growing up over the past 2 decades, I can say that I've had to watch quite a lot of children's TV in the 90's and beyond, and yet none of it seems to have been as influential on my kids or their friends as Gerry Anderson's various efforts were on us.

Some kids watching them today may laugh at the special effects or the "wooden" acting, but I think that's due to the expectations that films like Lord of the Rings or Avatar have set. But relatively speaking, Gerry was the king of his era and those programs will live on in the memories of many people, myself included. It's truly a sad day. And I can't think of a more fitting way to say thank you and honour that memory than to watch my boxset of Stingray!

Monday, December 24, 2012

A busy busy year

I'm on holiday and for the first time in about 6 months I have some time to reflect on the past year. And one word sums it up: busy. One way or another work has been tying me up, flying me around, or just consuming my energy. Fortunately I love my work, or I'm sure I'd have suffered a lot more stress than I have. Most of the time work presents problems that I have to solve, and throughout my life I've loved problem solving! And the people I work with are great and friendly too, which helps a lot.

Now what got me reflecting on the past year was simply taking a look at what I've blogged about here over the last 12 months. It's not a lot compared to previous years, and yet I had thought it was much more. However, when you take into account the blogs I write on and other articles I write, such as for InfoQ, it starts to make sense. And of course there's twitter: despite my initial reservations about using it, I think this year has been the one where I've really put a lot more effort into being there and tracking what others are saying and doing. I believe there's a correlation between the amount I've tweeted and the reduction in the blogging I've done, at least here.

So what about 2013? Well I've deliberately tried not to think that far ahead, but I already know it promises to be as busy as 2012. Bring it on!

Sunday, December 09, 2012

Farewell Kohei

I was shocked earlier this week when I found out that Kohei Honda passed away suddenly. I've known Kohei personally for a few years but longer in terms of his influence on the field of distributed systems. The work that we've been doing around Savara, WS-Choreography, Pi Calculus and beyond, has been better for his involvement. We were talking about new bodies of collaborative work only in the last few weeks, which makes his death even more shocking and personal. He will be missed and my best wishes go out to his family and his other friends and colleagues.

Farewell Sir Patrick Moore

I haven't blogged for a while because I haven't really had time or the inclination to do so. However, when I saw that Sir Patrick Moore has just died I couldn't help but put fingers to keyboard! As a child growing up in the 70's, The Sky At Night was a wonderful program to follow. Not only was this the Space Age, but science fiction was rich and influential as a result, as well as the fact that in the UK we had just 2 TV channels to choose from (both of which started in the morning and ended in the late evenings). Way before Star Wars, Close Encounters etc. this small program hidden away at night on the BBC was my view screen into the much larger universe beyond my four walled world. And Patrick Moore was the epitome of the scientists who were pushing back the curtain to reveal the secrets beyond.

To say that The Sky At Night and its presenter influenced me would be a bit like saying that the air around us influences us: both were pivotal when I was in my most formative years and I know many people who were similarly impacted by them. So it is with great sadness that I heard of his death; my condolences go out to his family and friends. And maybe tonight I'll get my telescope out and think of him and that time over 35 years ago when he captured my attention and helped shape me in some small way. Thank you Sir Patrick, you will be sorely missed!

Sunday, October 28, 2012

Cloud and Shannon's Limit

I've been on the road (or air) so much over the past few months that some things I had thought I'd blogged about turn out to be either dreams or only to have hit twitter. One of them is Shannon's Limit and its impact on the Cloud, which I've been discussing in presentations for about 18 months or so. There's a lot of information out there on Shannon's Limit, but it's something I came across in the mid 1980's as part of my physics undergraduate degree. Unfortunately the book I learned from is no longer published so apart from a couple of texts that are accessible via Google I can't really recommend all of them (they may be good, but I simply don't have the context to say that with certainty). However, if you're looking for a very simple, yet accurate, discussion of what Shannon's Limit says, it can be found here.

So what has this got to do with the Cloud? Put simply, Shannon's Limit shows that the Cloud (public or private) only really works well today because not everyone is using it. Bandwidth and capacity are limited by the properties of the media we use to communicate between clients and services, no matter where those services reside. But for cloud, the limitation is the physical interconnects over which we try to route our interactions and data. Unfortunately no matter how quickly your cloud provider can improve their back end equipment, the network to and from those cloud servers will rarely change or improve, and if it does it will happen at comparatively glacial speeds.
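To put some rough numbers on this, here is a small sketch in Python of the limit in its usual Shannon-Hartley form, C = B log2(1 + S/N), where B is the bandwidth in hertz and S/N is the signal-to-noise ratio as a plain ratio. The function names and the example figures (a 20 MHz link at 20 dB) are my own illustrative assumptions, and splitting the capacity evenly between users is a deliberate simplification that ignores protocol overheads and contention.

```python
import math


def shannon_capacity(bandwidth_hz, snr_db):
    """Upper bound on error-free bit rate (bits/s) for a noisy channel."""
    # Convert the signal-to-noise ratio from decibels to a plain ratio.
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


def per_user_share(capacity_bps, users):
    """Best-case share of a link if capacity were split evenly (an assumption)."""
    if users < 1:
        raise ValueError("need at least one user")
    return capacity_bps / users


# Illustrative figures only: a 20 MHz link with a 20 dB signal-to-noise ratio.
capacity = shannon_capacity(20e6, 20)
for users in (1, 10, 100, 1000):
    share_mbps = per_user_share(capacity, users) / 1e6
    print("{:>5} users: {:.2f} Mbit/s each".format(users, share_mbps))
```

However fast the servers at the far end become, the ceiling computed here belongs to the link itself, so every extra user sharing that link gets a smaller slice of a fixed amount.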

What this means is that for the cloud to continue to work and grow with the increasing number of people who want to use it, we need to have more intelligence in the intervening connections between (and including) the client and service (or peers). This includes not just gateways and routers, but probably more importantly mobile devices. Many people are now using mobile hardware (phones, pads etc.) to connect to cloud services so adding intelligence there makes a lot of sense.

Mobile also has another role to play in the evolution of the cloud. As I've said before, and presented elsewhere, ubiquitous computing is a reality today. I remember back in 2000 when we (HP) and IBM were talking about it, but back then we were too early. Today there are billions of processors, hundreds of millions of pads, 6 billion phones etc. Most of these devices are networked. Most of them are more powerful than machines we used a decade ago for developing software or running critical services. And many of them are idle most of the time! It is this group of processors that is the true cloud and needs to be encompassed within anything we do in the future around "cloud".

Friday, October 26, 2012

NoSQL and transactions

I've been thinking about ACID and non-ACID transactions for a number of years. I've spent almost as long working in the industry and standards trying to evolve them to cater for environments where strict ACID transactions are too much. Throughout all of this I've been convinced that transactions are the right abstraction for many of the fault tolerance, reliability and consistency requirements. Over the years transactions have received bad press in some quarters, sometimes from people who don't understand them, over use them, or don't really want to have to implement them. At times various waves of technology have either helped or hindered the adoption of transactions outside of the traditional database; for instance some NoSQL efforts eschew transactions entirely (ACID and extended) citing CAP when it's not always right to do so.

I think a good transactions implementation should be at the core of all middleware platforms and databases, because if it's well thought out then it won't add overhead when it's not needed and yet provides obvious benefits when it is. It should be able to offer a wide range of transaction models (well at least more than one) and a model that makes it easier to reason about the correctness and consistency of applications and services developed with it.

At the moment most NoSQL or BigData solutions either ignore transactions or support ACID or limited ACID (only in the scope of a single instance). But it's nice to see a change occurring, such as seen with Google's Spanner work. And as they say in the paper: "We believe it  is better to have application programmers deal with performance problems due to over use of transactions as bottlenecks arise, rather than always coding around the lack of transactions."

And whilst I agree with my long time friend, colleague and co-author on RDBMS versus the efficacy of new approaches, I don't think transactions are to be confined to the history books or traditional back-end data stores. There's more research and development that needs to happen, but transactions (ACID and extended) should form a core component within this new infrastructure. Preconceived notions based on overuse or misunderstanding of transactions shouldn't dissuade their use in the future if it really makes sense - which I obviously think it does.

Wednesday, September 19, 2012

Travel woes

I've been doing a lot of international travel in the last few weeks, with more to come. It can be annoying at times on flights, what with the people who knock you with their bags if you're sat on the aisle; passengers who put so much stuff under their seats that it encroaches on your leg space; those who recline their seats when you're trying to eat; then there are the passengers who bring suitcases on board big enough to live in (people, it's not carry-on if you have to wheel it in or need help picking it up!); or those kids who cry all flight and kick the back of your seat.

But the passengers who annoy me the most are those idiots who throw bags into the overhead lockers and rely on the door to keep things in! Then when someone else opens it, guess who the bags land on?! And when the guilty party simply states "Oh, I didn't realise", it really doesn't help! Look, if you didn't realise then you really should go back to school and learn about gravity! The next person who does that is likely to get more than harsh words from me.

Tuesday, September 04, 2012

Coming or going?

I've been dreading September because it represents the busiest travel schedule that I can recall having for many years. After this week it seems that I am away from the country every week for the next 7 weeks, stopping back in the UK to make sure my family remember what I look like! I'm at JavaOne, StrangeLoop, a JBUG, a Cloud-TM meeting, new hire orientation for our Fuse acquisition, a Customer Advisory Board meeting and a Quarterly Business Review. On average I think I'll have 2 days a week at home and will see the inside of planes more than the inside of my home. Ouch!

Sunday, August 26, 2012

Farewell Neil Armstrong

As soon as I heard about the death of Neil Armstrong I felt like I had to say something:

But I wanted to say a bit more. I was only 3 when we first landed on the moon. I'm told you shouldn't really be able to remember things that far back, or when you're that young, but I do: we had a black-and-white TV and I recall sitting on the floor of the living room watching the landing. Whether it would have happened with or without that moment, from then on I always had science, astronomy and space flight in my mind. Whether it was reading about black holes, rockets, time dilation or science fiction, or going to university and studying physics and astrophysics, they all pushed me in the same direction.

Landing on the moon was a pivotal event for the world and also for me personally. And Neil Armstrong was the focus of that event. I never met him, but for the past 40+ years I've felt his influence on many of the things I've done in my life. Thanks Neil!

Sunday, August 12, 2012

JavaOne 2012

I just got my schedule for JavaOne and was also informed that I'll be on their "Featured Speaker" carousel at the top of the JavaOne 2012 home page. Here's my schedule in case anyone wants to meet up or listen to a session:

Session ID: CON4385
Session Title: Dependability Challenges for Java Middleware
Venue / Room: Parc 55 - Cyril Magnin II/III
Date and Time: 10/1/12, 15:00 - 16:00

Session ID: CON10656
Session Title: JavaEE.Next(): Java EE 7, 8, and Beyond
Venue / Room: Parc 55 - Cyril Magnin II/III
Date and Time: 10/3/12, 16:30 - 17:30

Session ID: CON4367
Session Title: Java Everywhere: Ready for Mobile and Cloud
Venue / Room: Parc 55 - Market Street
Date and Time: 10/3/12, 11:30 - 12:30

Monday, August 06, 2012

Tower of Babel

I've been spending the weekend stripping my garage ready for it to be demolished: we're having an extension built which means a new garage. During this I came across some bags that had been stored for over 10 years and was pleasantly surprised when I looked within: lots of old school and university books on physics, chemistry, maths and computer science. The latter contained a number of language books that I'd used since I started with computers way back in the late 1970s and it got me thinking about what languages I've learnt and used over the years. So at least for my own edification, here they are in roughly chronological order (ignoring domain specific languages, such as SQL):

Basic - various dialects such as Commodore, ZX80, BBC.
6502 machine code.
Lisp, Forth, Prolog, Logo.
68000 machine code and others ...
Pascal-w, Concurrent Euclid, Occam, Ada, Smalltalk-80.
C++, Simula.
Java, Python.
D, Erlang.
Io, Ruby, Ceylon (still a work in progress), Scala, Clojure.

There are probably others I've forgotten about. Truth be told, over the years I've forgotten much of several of the ones above as well! But now I've found the books again, I'm going to refresh my memory.

Thursday, August 02, 2012

Gossip and Twitter

I had to explain gossip protocols to a group of students the other day. Now in the past when I've done this I've gone through some worked examples, using people and rumour mongering, followed by a more formal analysis of the algorithms underlying the various protocols. However, this time I decided to use something that most people have direct experience with: Twitter. Using some well documented examples of how gossip and rumours spread via Twitter, as well as a real-time experiment with the students, it seemed to go down much easier. So I think I'll stick with Twitter as a good example of how gossip protocols can work. Of course it's not the full picture, but it's a good way of broaching the subject.
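For readers who want something more concrete than the Twitter analogy, here is a minimal sketch in Python of one simple flavour of gossip: push-style rumour spreading, where in each round every node that knows the rumour tells a few peers picked at random. The function name, the fanout parameter and the fixed seed are all assumptions made for the illustration; real protocols add things like pull, anti-entropy and rules for when to stop spreading.

```python
import random


def simulate_push_gossip(n_nodes, fanout=1, seed=0):
    """Return how many nodes know the rumour after each round.

    Round 0 starts with a single informed node; the simulation stops
    once every node has heard the rumour.
    """
    if n_nodes < 1 or fanout < 1:
        raise ValueError("n_nodes and fanout must be at least 1")
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    informed = {0}
    history = [len(informed)]
    while len(informed) < n_nodes:
        newly_informed = set()
        for node in informed:
            for _ in range(fanout):
                peer = rng.randrange(n_nodes)
                if peer != node:
                    newly_informed.add(peer)
        # Nodes told this round only start gossiping in the next round.
        informed |= newly_informed
        history.append(len(informed))
    return history


history = simulate_push_gossip(64)
print("rounds needed:", len(history) - 1)
print("informed per round:", history)
```

With a fanout of 1 the informed set can at most double each round, which is why the early rounds look exponential and the last few stragglers take a while to reach, much like a rumour on Twitter.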

HP missed the Android boat

I'm just back from my annual vacation to visit the in-laws in Canada. Apart from the usual things I do there, such as fishing, diving and relaxing by the pool under 30 Centigrade temperatures with not a single cloud in the sky, I usually end up spending some time at technical support for the extended family. This time one of the things I ended up doing was something I wanted to do for myself earlier this year: install Android on an HP TouchPad. When HP ditched the TouchPad I tried to get hold of one of them when they were cheap (about $100); not for WebOS but because the hardware was pretty good. Unfortunately I couldn't get hold of one, but my mother-in-law did and she's suffered under the lack of capabilities and apps ever since.

So I installed ICS on the TouchPad relatively easily and the rest, as they say, is history. Apart from the camera not working (hopefully there'll be a patch eventually), the conclusion from my in-law is that it's a completely new device. And after having used it myself for a few days, I have to agree. Even 8+ months after it was released, the TouchPad ran Android as smoothly as some of the newer devices I've experienced. I think it's a real shame that HP decided to get out of the tablet business (at least for now) with an attitude that it either had to be WebOS or nothing. I can also understand the business reasons why they wanted to get value out of the Palm acquisition. But I do think they missed a great opportunity to create a wonderful Android tablet.

Monday, June 18, 2012

Worried about Big Data

I've been spending quite a lot of time thinking about Big Data over the past year or two and I'm seeing a worrying trend. I understand the arguments made against traditional databases and I won't reiterate them here. Suffice it to say that I understand the issues behind transactions, persistence, scalability etc. I know all about ACID, BASE and CAP. I've spent over two decades looking at extended transactions, weak consistency, replication etc. So I'm pretty sure I can say that I understand the problems with large scale data (size and physical locality). I know that one size doesn't fit all, having spent years arguing that point.

As an industry, we've been working with big data for years. A bit like time, it's all relative. Ten years ago, a terabyte would've been considered big. Ten years before that it was a handful of gigabytes. At each point over the years we've struggled with existing data solutions and made compromises or rearchitected them. New approaches, such as weak consistency were developed. Large scale replication protocols, once the domain of research, became the industrial reality.

However, throughout this period there were constants in terms of transactions, fault tolerance and reliability. For example, whatever you can say against a traditional database, if it's been around for long enough then it'll represent one of the most reliable and performant bits of software you'll use. Put your data in one and it'll remain consistent across failures and concurrent access with a high degree of probability. And several implementations can cope with several terabytes of information.

We often take these things for granted and forget that they are central to the way in which our systems work (ok you could argue chicken-and-egg). They make it extremely simple to develop complex applications. They typically optimise for the failure case, though, adding some overhead to enable recovery. There are approaches which optimise for the failure free environment, but they impose an overhead on the user, who typically has a lot more work to do in the hopefully rare case of failures.

So what's this trend I mentioned at the start around big data? Well it's the fact that some of the more popular implementations haven't even thought about fault tolerance, let alone transactions of whatever flavour. Yes they can have screaming fast performance, but what happens when there's a crash or something goes wrong? Of course transactions, for example, aren't the solution to every problem, but if you understand what they're trying to achieve then at some point somewhere in your big data solution you'd better have an answer. And "roll your own" or "DIY" isn't sufficient.

This lack of automatic or assistive fault tolerance is worrying. I've seen it before in other areas of our industry or research and it rarely ends well! And the argument about it not being possible to provide consistency (whatever flavour) and fault tolerance at the same time as performance doesn't really cut it in my book. As a developer I'd rather trade a bit of performance, especially these days when cores, network, memory and disk speed are all increasing. And again, these are all things we learnt through 40 years of maintaining data in various storage implementations, albeit mostly SQL in recent times. I really hope we don't ignore this experience in the rush towards the next evolution.

Sunday, June 17, 2012

Software engineering and passion

I was speaking with some 16 year old students from my old school recently and one of them told me that he wanted to go to university to become a software engineer. He's acing all of his exams, especially maths and sciences as well as those topics that aren't really of interest. So definitely a good candidate. However, when I asked what he had done in the area of computing so far, particularly programming, the answer was nothing.

This got me thinking. By the time I was his age, I'd been programming for almost four years, written games, a basic word processor and even a login password grabbing "utility". And that's not even touching on the electronics work I'd done. Now you could argue that teaching today is very different than it was 30 years ago, but very little of what I did was under the direction of a teacher. Much of it was extracurricular and I did it because I loved it and was passionate enough to make time for it.

Now maybe I've been lucky, but when thinking about all of the people I've worked with over the years and work with today, I'd say that they all share that passion for software engineering. Whether they've only been in the industry for a few years or for several decades, the passion is there for all to see. Therefore, I wonder if this student has what it takes to be a good engineer. But as I said, maybe I'm just lucky in the people with whom I've been able to work, as I'm sure there are those software engineers for whom it really is just a day job and they are still good at that job. But I'd still hate to not have the passion and enthusiasm for this work!

Sunday, June 10, 2012

When did we stop remembering?

Over the past year or so I've been reading articles and papers, or watching recorded presentations, on fault tolerance and distributed systems, produced over the last couple of years. And whilst some of it has been good, a common theme throughout has been the lack of reflection on the large body of work that has been done in this area for the past four decades or more! I've mentioned this issue in the past and had hoped that it was a passing trend. Unfortunately I just finished watching a video from earlier this year by someone at the "cutting edge" of this space who described all of the issues with distributed systems, fault tolerance and reliability; not once did he mention Lamport's Time, Clocks, and the Ordering of Events in a Distributed System (yet he discussed the same issues as if they were "new"), failure suspectors, or the work of Gray, Bernstein and others. The list goes on! If this had been a presentation in the 1970's or 80's then it would have been OK. But in the 2nd decade of the 21st century, where most work in the software arena has been digitised and is searchable, there is no excuse!

Monday, May 21, 2012

Jim Gray

It's been over 5 years since I first blogged about Jim going missing at sea. At each subsequent HPTS we've had a few things to say to remember him and come together to hope that we'd see him again. Even at the remembrance event, where people spoke eloquently about Jim, his family and his work, there was always the thought that he would turn up again. However, with this recent news, it seems inevitable that although hope is never completely lost, it is less and less likely. I first met Jim in the late 1990s, though I'd known about him since the mid 1980s when I started my PhD. I'll miss him and I know that everyone who met him feels the same. My thoughts go out to his family.

Friday, April 27, 2012

Java Forum at the Titanic Centre

I wrote on my JBoss blog about a trip I made this week to present about JBoss in Belfast. Well, it was a great event with 200+ people there to hear what the speakers had to say. Well worth the trip, even if the flight out was delayed by 6 hours! But what really made the trip was the Titanic Centre: I've never presented on a stage like this before - it was a replica of the staircase from the Titanic! I've included a couple of pictures I took to give an idea of what it was like, but these pictures don't really do it justice.

Java or the JVM

In my previous post about Java as the platform of the future I may not have been too clear on what I actually meant when I said that developers in other languages would (should) be using Java under the covers. What I thought I had made clear was that if you've got a perfectly good wheel, why reinvent it if it can be relatively easily and opaquely re-tasked for another vehicle? Some concrete examples: if you've got a high performance message service implemented in Java, you don't need to reimplement it in Ruby in order for those developers to be able to take advantage of those capabilities. Likewise for transactions and other core services.

Friday, April 06, 2012

Transactions and parallelism and actors, oh my!

In just 4 years' time I'll have spent 3 decades researching and developing transactional systems. I've written enough about this over the years to not want to dive into it again, but suffice to say that I've had the pleasure of investigating a lot of uses for transactions and their variations. Over the years we've looked at how transactions are a great building block for fault tolerant distributed systems, most notably through Arjuna which, with the benefit of hindsight, was visionary in a number of ways. A decade ago using transactions outside of the database as a structuring mechanism was more research than anything else, as was using them in massively parallel systems (multi-processor machines were rare).

However, today things have changed. As I've said several times before, computing environments today are inherently multi-core, with true threading and concurrency, with all that that entails. Unfortunately our programming languages, frameworks and teaching methods have not necessarily kept pace with these changes, often resulting in applications and systems that are inherently unreliable or brittle in the presence of concurrent access and worse still, unable to recover from the resultant failures that may occur.

Now of course you can replicate services to increase their availability in the event of a failure. Maybe use N-version programming to reduce or remove the chances that a bug in one approach impacts all of the replicas. But whereas strongly consistent replication is relatively easy to understand, it has limitations which have resulted in weak consistency protocols that trade off things like performance and ease of use for application level consistency (e.g., your application may now need to be aware that data is stale). This is why transactions, either by themselves or in conjunction with replication, have been and continue to be a good tool in the arsenal of architects.

We have seen transactions used in other frameworks and approaches, such as the actor model and software transactional memory, sometimes trading off one or more of the traditional ACID properties. But whichever approach is taken, the underlying fundamental reason for using transactions remains: they are a useful, straightforward and simple mechanism for creating fault tolerant services and individual objects that work well for arbitrary degrees of parallelism. They're not just useful for manipulating data in a database and neither are they to be considered purely the domain of distributed systems. Of course there are areas where transactions would be overkill or where some implementations might be too much of an overhead. But we have moved into an era where transaction implementations are lighter weight and more flexible than they needed to be in the past. So considering them from the outset of an application's development is no longer something that should be eschewed.
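To make the point about individual objects a little more concrete, here is a minimal sketch in Java. It is not taken from Arjuna or any real transaction service: the class name TransactionalCounter and its begin/commit/abort methods are assumptions made purely for illustration. It shows only atomicity (all-or-nothing updates) for a single in-memory object, and makes no attempt at isolation between concurrent transactions or at durability.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy class, for illustration only: an in-memory counter whose
// updates can be committed or undone as a unit.
final class TransactionalCounter {
    private int value;                                            // current, possibly uncommitted, state
    private final Deque<Integer> snapshots = new ArrayDeque<>();  // one saved state per open transaction

    // Start a (possibly nested) transaction by remembering the current state.
    public synchronized void begin() {
        snapshots.push(value);
    }

    // Keep the changes made since the matching begin().
    public synchronized void commit() {
        requireActive();
        snapshots.pop();
    }

    // Discard the changes made since the matching begin().
    public synchronized void abort() {
        requireActive();
        value = snapshots.pop();
    }

    public synchronized void add(int delta) {
        value += delta;
    }

    public synchronized int get() {
        return value;
    }

    private void requireActive() {
        if (snapshots.isEmpty()) {
            throw new IllegalStateException("no active transaction");
        }
    }

    public static void main(String[] args) {
        TransactionalCounter counter = new TransactionalCounter();

        counter.begin();
        counter.add(5);
        counter.commit();   // the update is kept

        counter.begin();
        counter.add(100);
        counter.abort();    // the update is undone

        System.out.println(counter.get()); // prints 5
    }
}
```

Because each begin() pushes its own snapshot, aborting an outer transaction also undoes any inner transaction that had already committed, which is the usual behaviour for nested transactions. A real implementation would add locking or some other concurrency control, plus a persistent log, to recover the remaining ACID properties.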

Back to the Dark Ages

The other day, due to severe snowstorms (for the UK), we ended up snowed in and without power or heating for days. During this time I discovered a few things. For a start, having gas central heating that is initiated with an electric starter is a major flaw in the design! Next, laptop batteries really don't last long. And a 3G phone without access to 3G (even the phone masts were without power!) is a great brick.

But I think the most surprising thing for me was how ill prepared I was to deal with the lack of electricity. Now don't get me wrong - we've had power outages before, so had a good stock of candles, blankets, torches and batteries. But previous outages have been infrequent and lasted only a few hours, maybe up to a day. And fortunately they've tended to be in the summer months (not sure why). So going without wasn't too bad.

However, not this time and I've been trying to understand why. I think it's a combination of things. The duration for one, but also the fact that it happened during the week when I had a lot to do at work. Missing a few hours' connectivity is OK because there are always things I can do (and do better) when there are no interruptions from email or the phone. But extend that into days and it becomes an issue, especially when alternative solutions don't work, such as using my 3G phone for connectivity or to read backup emails.

What is interesting is that coincidentally we're doing a check of our processes for coping with catastrophic events. Now whilst I think that this power outage hardly counts as such an event, it does drive home that my own personal ability to cope is lacking. After spending a few hours thinking about this (I did have plenty of time, after all!) I'm sure there are things I can do better in the future, but probably the one place that remains beyond my control is lack of network (3G as a backup has shown itself to be limiting). I'm not sure I can justify a satellite link! So maybe I just take this as a weak link and hope it doesn't happen again. But we may be investing in a generator if this happens again.

Sunday, March 11, 2012

Big Data

Data is important in everything we do in life. Whether it's a recipe for your favourite dinner or details of your bank account, we are all data driven. This is nothing new either: humanity, and in fact all life, is critically dependent on data (information) and has been for millions, or billions, of years. Therefore, maintaining that data is also extremely important; it needs to be available (in a reasonable period of time), often shareable, secure and consistent (more or less).

If we cannot get to our data then we could be in trouble (perhaps catastrophically). If it takes too long to get at then there may be no real difference between it being available or not. If someone else can get to our data without our permission and possibly modify it without our knowledge, then that could be even worse than not being able to access the information. And of course if the information is maintained in multiple places, perhaps to ensure that if one copy is lost then we have backups, then updates to a copy must be made eventually to the others.

Over the centuries individuals and companies have grown successful and controlling from maintaining data securely for others or even managing it so that everyone has to go to them to obtain it. In our industry several large companies grew primarily because of this model. Other vendors became successful through other aspects related to data management or integration. Data is an enabler for everything in the software industry, whether it's middleware or operating system related.

So data is King. That is the way it has been and always will be. Everything we do uses, manipulates or otherwise controls data in some way, shape or form. The arrival of Cloud, mobile and ubiquitous computing does not change it. In fact ubiquitous computing increases the amount of data we have by several orders of magnitude. The bulk of the world's computing power is in embedded systems, i.e., systems designed to do a few dedicated functions, in real-time, using sensors for data I/O. Technically all smart-phones and tablets are embedded systems, not PCs. Major drivers for data in the coming years are smart-phones, tablets, sensors, green technology, eHealth/medical, industrial applications and system "health" monitoring.

Much of the new data coming on stream today contains a location, timestamp or both. There has been a tenfold increase in electronically generated data in 5 years. It is predicted that very soon there will be over a zettabyte of data (1 billion terabytes). That's the equivalent of a stack of DVDs half way from here to Mars! Maintaining that data is important. But being able to use it is more important.

It is now well known that issues with traditional RDBMS implementations, their architectures and assumptions, mean that they are insufficient to manage and maintain this kind of data, at least not by themselves. This has led to the evolution of BigData, with NoSQL and NewSQL implementations. There is a range of different approaches, including tuple space and graph-based databases, because one size does not fit all. Unlike the RDBMS, which is a good generic workhorse but not optimisable for specific use cases, these new implementations are targeted specifically at them and conversely would make poor generic solutions.

It is extremely unlikely that a single standard, such as SQL, will evolve that works across all NoSQL implementations. However, there may be a set of standards, one for each category of approach. But for many, SQL will still remain a necessity, even if it means they cannot benefit from all of the advantages NoSQL offers: more people understand SQL and more applications are based on it, than anything else. So bridging these worlds is important. Finally it is worth noting that business analytics and sensor analytics will play a crucial role here.

In our industry we are now seeing an explosion in new database vendors looking to become the standard for this next generation. The current generation of developers are more heavily influenced by mobile, social and ubiquitous computing than by mainframes, so the RDBMS is not their natural first thought when considering how to manage data. However, most of these new companies are small open source startups and recognise the problems inherent with both: customers not trusting their important data to small companies that could go under or be acquired by a competitor; other vendors taking their code and creating competing products from it.

Furthermore, some large vendors who have failed to make an impact in the data space or inroads in enterprise middleware see this new area and these new companies as opportunities. As a result, relationships are being created between these two categories of companies in a symbiotic manner. Many of these NoSQL/NewSQL companies are going to merge, be acquired or fail. In the meantime new approaches and companies will be created. Over the next 5 years this new data space, which will integrate with the current RDBMS area, will coalesce and solidify. There are no obvious winners at this stage, but what is clear is that open source will play a critical role.

Tuesday, February 21, 2012

Clouds for Enterprises (C4E) 2012

Call for Papers: The 2nd International Workshop on
Clouds for Enterprises (C4E) 2012
held at the 16th IEEE International EDOC Conference (EDOC 2012) "The Enterprise Computing Conference"
Beijing, China, 10-14 September 2012

Important dates:
Paper submission: Sunday, 1 April 2012
Notification of acceptance: Monday, 28 May 2012
Camera-ready version: Friday, 15 June 2012

Cloud computing is an increasingly popular computing paradigm that aims to streamline the on-demand provisioning of software (SaaS), platform (PaaS), infrastructure (IaaS), and data (DaaS) as services. Deploying applications on a cloud can help to achieve scalability, improve flexibility of computing infrastructure, and reduce total cost of ownership. However, a variety of challenges arise when deploying and operating applications and services in complex and dynamic cloud-based environments, which are frequent in enterprises and governments.
Due to the security and privacy concerns with public cloud offerings (which first attracted widespread attention), it seems likely that many enterprises and governments will choose hybrid cloud, community cloud, and (particularly in the near future) private cloud solutions. Multi-tier infrastructures like these not only promise vast opportunities for future business models and new types of integrated business services, but also pose severe technical and organizational problems.
The goal of this one-day workshop is to bring together academic, industrial, and government researchers (from different disciplines), developers, and IT managers interested in cloud computing technologies and/or their consumer-side/provider-side use in enterprises and governments. Through paper presentations and discussions, this workshop will contribute to the inter-disciplinary and multi-perspective exchange of knowledge and ideas, dissemination of results about completed and on-going research projects, as well as identification and analysis of open cloud research and adoption/exploitation issues.
This is the second Clouds for Enterprises (C4E) workshop - the first was held in 2011 at the 13th IEEE Conference on Commerce and Enterprise Computing (CEC'11) on Monday, 5 September 2011 in Luxembourg, Luxembourg. The C4E 2011 workshop program, posted on the workshop Web page, included the keynote "Blueprinting the Cloud" by Prof. Willem-Jan van den Heuvel, presentations of 3 full and 5 short peer-reviewed workshop papers, and the discussion session "Migrating Enterprise/Government Applications to Clouds: Experiences and Challenges". The workshop proceedings were published by the IEEE and included in the IEEEXplore digital library, together with the proceedings of the main CEC'11 conference and the other co-located workshops. The Clouds for Enterprises (C4E) 2012 workshop will be held at another prestigious IEEE conference - the 16th IEEE International EDOC Conference (EDOC 2012) "The Enterprise Computing Conference" in Beijing, China, 10-14 September 2012. The main theme of the IEEE EDOC 2012 conference is "When Services in Cloud Meet Enterprises", so the C4E 2012 workshop is an excellent fit into and addition to the IEEE EDOC 2012 conference.
This Clouds for Enterprises 2012 workshop invites contributions from both technical (e.g., architecture-related) and business perspectives (with governance issues spanning both perspectives). The topics of interest include, but are not limited to:
Technical Perspective:
- Patterns and best practices in development for cloud-based applications
- Deployment and configuration of cloud services
- Migration of legacy applications to clouds
- Hybrid and multi-tier cloud architectures
- Architectural support for enhancing cloud computing interoperability and portability
- Architectural principles and approaches to cloud computing
- Cloud architectures for adaptivity or robustness
- Evaluation methods for cloud architectures
- Architectural support for dynamic resource management to support computing needs of cloud services
- Cloud architectures of emerging applications, such as mashup of enterprise/government services
- Impact of cloud computing on architecture of software and, more generally, IT systems
Enterprise/Government Application Perspective:
- Case studies and experience reports in development of cloud-based systems in enterprises and governments
- Analyses of cloud initiatives of different governments
- Business aspects of cloud service markets
- Technical and business support for various cloud service market roles, such as brokers, integrators, and certification authorities
- New applications and business models for enterprises/governments leveraging cloud computing
- Economic evaluation of cloud-based enterprises
Governance Perspective:
- Service lifecycle models
- Architectural support for security and privacy
- Architectural support for trust in/by cloud services
- Capacity planning of services running in a cloud
- Architectural support for quality of service (QoS) and service level agreement (SLA) management
- Accountability of cloud services, including mechanisms, algorithms and methods for monitoring, analyzing and reporting service status and usage profiles
- IT Governance and compliance, particularly in hybrid and multi-tier clouds

Review and publication process:
Authors are invited to submit previously unpublished, high-quality papers before
***1 April 2012***.
Papers published or submitted elsewhere will be automatically rejected. All submissions should be made using the EasyChair Web site.
Two types of submissions are solicited:
* Full papers - describing mature research or industrial case studies, up to 8 pages long
* Short papers - describing work in progress or position statements, up to 4 pages long
Papers presenting and analyzing completed projects are particularly welcome. Papers about on-going research projects are also welcome, especially if they contain critical, qualitative and quantitative analysis of already achieved results and remaining open research issues. In addition, papers about experiences and comparative analysis of using cloud computing in enterprises and governments are also welcome. Submissions from industry and government are particularly encouraged. In addition to presentation of peer-reviewed papers this one-day workshop will contain a keynote from an industry expert and an open discussion session on practical issues of using clouds in enterprise/government environments.
Paper submissions should be in the IEEE Computer Society Conference Proceedings paper format. Templates (with guidelines) for this format are available at: (see the blue box on the left-hand side). All submissions should include the author's name, affiliation and contact details. The preferred format is Adobe Portable Document Format (PDF), but Postscript (PS) and Microsoft Word (DOC) will be accepted in exceptional cases.
Inquiries about paper submission should be e-mailed to Dr. Vladimir Tosic (vladat at server: and include "Clouds for Enterprises 2012 Inquiry" in the Subject line.
All submissions will be formally peer-reviewed by at least 3 Program Committee members. The authors will be notified of acceptance around
***28 May 2012***.
At least one author of every accepted paper MUST register for the IEEE EDOC 2012 conference and present the paper.
All accepted papers (both full and short) will be published by the IEEE and included in the IEEE Digital Library, together with the proceedings of the other IEEE EDOC 2012 workshops. A follow-up journal issue with improved and extended versions of the best workshop papers is also planned.

Workshop Chairs:
Dr. Vladimir Tosic, NICTA and University of New South Wales and University of Sydney, Australia; E-mail: vladat (at: - primary workshop contact
Dr. Andrew Farrell, University of Auckland, New Zealand; E-mail: ahfarrell (at:
Dr. Karl Michael Göschka, Vienna University of Technology, Austria; E-mail: Karl.Goeschka (at:
Dr. Sebastian Hudert, TWT, Germany; E-mail: sebastian.hudert (at:
Prof. Dr. Hanan Lutfiyya, University of Western Ontario, Canada; E-mail: hanan (at:
Dr. Michael Parkin, Tilburg University, The Netherlands; E-mail: m.s.parkin (at:

Workshop Program Committee:
The final list of the workshop Program Committee will be posted soon at the workshop Web site:

Sunday, February 19, 2012


A long time ago, and in what may seem to some as a galaxy far, far away, there was no web and no way of traversing resources via hyperlinks. In that time the PC was just taking off and most of us were lucky if we shared a computer with fewer than 5 people at a time! Back then I shared one of the original classic Macs and came across this wonderful piece of software that was to change the way I thought about the world. HyperCard was something I started to play with just because it was there and really for no other reason, but it quickly became apparent that its core approach of hypermedia was different and compelling. These days I can't recall all of the ways in which I used HyperCard, but I do remember that a few of them helped me in my roleplaying endeavours at the time (ok not exactly work related but sometimes you learn by doing, no matter what it is that you are doing!)

When the Web came along it seemed so obvious the way that it worked. Hyperlinks between resources, whether they're database records (cards) or servers, make a lot of sense for certain types of application. But extending it to a world wide mesh of disparate resources was a brilliant leap. I'm sure that HyperCard influenced the Web as it influenced several generations of developers. But I'm surprised at myself that I'd forgotten about it over the years. In fact it wasn't until the other day, when I was passing a shop window that happened to have an old Mac in it running HyperCard, that I remembered. It's over 20 years since those days, but we're all living under its influence.

Tuesday, February 14, 2012

Is Java the platform of the future?

I've mentioned this before, but I think we are living in a period of time where a bigger explosion of programming languages is occurring than at any time in the past four decades. Having lived through a number of the classic languages such as BASIC, Simula, Pascal, Lisp, Prolog, C, C++ and Java, I can understand why people are fascinated with developing new ones: whether it's compiled versus interpreted, procedural versus functional, languages optimised for web development or embedded devices, I don't believe we'll ever have a single language that's right for all developer requirements.

This Polyglot movement is a reality and it's unlikely to go away any time soon. Fast forward a few years and we may see far fewer languages around than today, but they will have been influenced strongly by their predecessors. I do believe that we need to make a distinction between the languages and the platforms that they inevitably spawn. And in this regard I think we need to learn from history now and quickly: unlike in the past we really don't need to reimplement the entire stack in the next cool language. I keep saying that there are core services and capabilities that transcend middleware standards and implementations such as CORBA or Java Enterprise Edition. Well guess what? That also means they transcend the languages in which they were written originally.

This is something that we realised well in the CORBA days, even if there were problems with the architecture itself. The fact that IDL was language neutral obviously meant your application could be constructed from components written in Java, COBOL and C++ without you either having to know or really having to care. Java broke that mould to a degree, and although Web Services are language independent, there's been so much backlash over SOAP, WSDL and friends that we forget this aspect at times. Of course it's an inherent part of REST.

However, if you look at what some are doing with these relatively new languages, there is a push to implement the stack in them from scratch. Now whilst it may make perfect sense to reimplement some components or approaches to take best advantage of some language capabilities, e.g., nginx, I don't think it's the norm. I think the kind of approach we're seeing with, say, TorqueBox or Immutant, where services implemented in one language are exposed to another in a way that makes them appear as if they were implemented natively, makes far more sense. Let's not waste time rehashing things like transactions, messaging and security, but instead concentrate on how best to offer these capabilities to the new polyglot movement in a way that makes them fit in as first class citizens.

And to do this successfully is much more than just a technical issue; it requires an understanding of what the language offers, what the communities expect and working with both to fit in seamlessly. Being a Java programmer trying to push Java services into, say, Ruby, with a Java programmer's approaches and understanding, will not guarantee success. You have to understand your users and let them guide you as much as you guide them.

So I still believe that in the future Java will, should and must play an important part in Cloud, mobile, ubiquitous computing etc. It may not be obvious to developers in these languages that they're using Java, but then it doesn't need to be. As long as they have access to all of the services and capabilities they need, in a way that feels entirely natural to them, why should it matter if some of those bits are hosted on or by a Java application server, for instance? The answer is that it shouldn't. And done right it means that these developers benefit from the maturity and reliability of these systems, built up over many years of real world deployments. Far better than the alternative.

Thursday, February 09, 2012

The future of Java

Just a couple of cross posts that are worth giving a wider distribution. First on whether this new polyglot movement is the death of Java, and second how the JCP process has been changing for the better over the years.

Tuesday, January 31, 2012

Blogging versus tweeting?

A few years ago when I was thinking about creating a twitter account I pondered about whether it was worth doing when I was blogging. I didn't think I'd use it much! Since creating the account I've been drawn into twitter more and more, so that today I'm finding the roles reversed: blogging is becoming less frequent whilst tweeting is increasing for me.

I think the reason why is pretty obvious: it is so much easier and quicker to tweet than to write a blog. But there are obvious limits in what you can say with 140 characters, so it's not an either/or situation for me. And yet as a result of using twitter I'm finding myself thinking less and less about blogging. That bit I don't quite understand. Now maybe it has nothing to do with my use of twitter; maybe I'd be blogging less regardless of it because of work, family life etc. Who knows? But I do know I find it interesting how twitter has insinuated itself into my life so quickly and seamlessly.

Sunday, January 01, 2012

Transactions on Android

Every year I try to make time for pet projects, be they learning new languages such as Erlang (one of my 2007 efforts), writing a discrete event simulation package in C++, or one of my best which was writing the world's first pure Java transaction service over Christmas 1996. Most of the time I don't manage to make much progress throughout the year, leaving the bulk of the effort for over the Christmas break.

This year was no different, with "port Arjuna (aka JBossTS) to Android" on my to-do list for months. I've been playing around with Android for quite a while, even collaborating with some friends on writing a game (iPhone too). I know that although it's Java-based, there are enough differences to make porting certain Java applications tricky. But over the years I have found porting transactions to different languages and environments a pretty good way to learn about the language or environment in question.

So as well as doing my usual catch-up on reading material, breaking the back of the Android port was top of my list. Now in the past I'd have higher expectations of what I could accomplish in this time period, but these days I have a family and some things take priority (well, most of the time). But once everyone had opened their presents, let the turkey settle in the stomach and sat down to watch The Great Escape (hey, it's Christmas!) I found time to kick it off.

I started simple in order to remove as many variables from the problem as possible. So I went back to JavaArjuna, the ancestor of JBossTS and all that predated it. It has none of the enhancements that we've added over the years, but places fewer requirements on the infrastructure. For instance, it was JavaArjuna that I ported to the HP Jornada back in 2001 because it also worked with earlier versions of Java.

As in 2001 it went well and it wasn't long before I had transactions running on my Android device. It was nice to see one of the basic tests running and displaying the typical transaction UIDs, statuses, rolling back, committing, suspending etc. Then I moved on to JBossTS. It wasn't quite as straightforward and there are a few hacks or workarounds in there while I figure out the best way to fix things, but it's done too! I'm pretty pleased by the results and will spend whatever time I have in the coming weeks to address the open issues. And I've definitely learned a lot more about Android.

So overall I think it's been a good pet project for 2011. It also showed me yet again that the architecture and code behind JBossTS that the team's been working on for years is still a highly portable solution. It doesn't matter whether you want transactions on a mainframe, in the cloud, or on a constrained device, JBossTS can do them all!