Sunday, March 11, 2012

Big Data

Data is important in everything we do in life. Whether it's a recipe for your favourite dinner or details of your bank account, we are all data driven. This is nothing new either: humanity, and in fact all life, is critically dependent on data (information) and has been for millions, or billions, of years. Therefore, maintaining that data is also extremely important; it needs to be available (in a reasonable period of time), often shareable, secure and consistent (more or less).

If we cannot get to our data then we could be in trouble, perhaps catastrophically so. If it takes too long to get at, then there may be no real difference between it being available or not. If someone else can get to our data without our permission and possibly modify it without our knowledge, then that could be even worse than not being able to access the information. And of course if the information is maintained in multiple places, perhaps so that we have backups if one copy is lost, then an update to one copy must eventually be made to the others.
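The last point, that an update to one copy must eventually reach the others, can be illustrated with a small sketch. The `Replica` class, its per-key version counter and the `sync` function are hypothetical names invented for this illustration, and "highest version wins" is just one simple assumed policy rather than a description of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data, stored as key -> (version, value)."""
    name: str
    store: dict = field(default_factory=dict)

    def write(self, key, value):
        # A local update bumps the version so other copies can tell it is newer.
        version, _ = self.store.get(key, (0, None))
        self.store[key] = (version + 1, value)

    def read(self, key):
        # Returns None if this copy has not seen the key yet.
        entry = self.store.get(key)
        return entry[1] if entry else None


def sync(source, target):
    """Copy every entry that is newer on source over to target."""
    for key, (version, value) in source.store.items():
        target_version, _ = target.store.get(key, (0, None))
        if version > target_version:
            target.store[key] = (version, value)


primary = Replica("primary")
backup = Replica("backup")

primary.write("balance", 100)
stale = backup.read("balance")   # the backup has not caught up yet
sync(primary, backup)
fresh = backup.read("balance")   # after propagation both copies agree
```

Between the write and the `sync` call the two copies disagree, which is the "more or less" consistent window. The sketch does not attempt to resolve conflicting updates made to both copies at once.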

Over the centuries individuals and companies have grown successful and controlling from maintaining data securely for others or even managing it so that everyone has to go to them to obtain it. In our industry several large companies grew primarily because of this model. Other vendors became successful through other aspects related to data management or integration. Data is an enabler for everything in the software industry, whether it's middleware or operating system related.

So data is King. That is the way it has been and always will be. Everything we do uses, manipulates or otherwise controls data in some way, shape or form. The arrival of Cloud, mobile and ubiquitous computing does not change it. In fact ubiquitous computing increases the amount of data we have by several orders of magnitude. The bulk of the world's computing power is in embedded systems, i.e., systems designed to do a few dedicated functions, in real-time, using sensors for data I/O. Technically all smart-phones and tablets are embedded systems, not PCs. Major drivers for data in the coming years are smart-phones, tablets, sensors, green technology, eHealth/medical, industrial applications and system "health" monitoring.

Much of the new data coming on stream today contains a location, timestamp or both. There has been a tenfold increase in electronically generated data in 5 years. It is predicted that very soon there will be over a zettabyte of data (1 billion terabytes). That's the equivalent of a stack of DVDs halfway from here to Mars! Maintaining that data is important. But being able to use it is more important.
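The unit conversion here is easy to check, and the DVD comparison can be explored in the same way. The following is a rough sketch only: the disc capacity (4.7 GB, single layer) and thickness (1.2 mm, bare disc) are assumptions chosen for the illustration, not figures from the prediction itself, and the function names are made up.

```python
ZETTABYTE = 10**21   # bytes (decimal SI prefix)
TERABYTE = 10**12    # bytes

# Assumptions for illustration, not figures from the text:
DVD_CAPACITY_BYTES = 4.7e9   # single-layer DVD
DVD_THICKNESS_MM = 1.2       # bare disc, no case


def terabytes_in(num_bytes):
    """Whole terabytes contained in num_bytes."""
    return num_bytes // TERABYTE


def dvd_stack_km(num_bytes, capacity=DVD_CAPACITY_BYTES,
                 thickness_mm=DVD_THICKNESS_MM):
    """Height in kilometres of a stack of discs holding num_bytes."""
    discs = num_bytes / capacity
    return discs * thickness_mm / 1_000_000  # millimetres -> kilometres
```

`terabytes_in(ZETTABYTE)` gives 1,000,000,000, matching the "1 billion terabytes" above. With the assumed disc values, `dvd_stack_km(ZETTABYTE)` comes to roughly 255,000 km, so the length of the stack depends heavily on the capacity and thickness chosen, and other assumptions will give very different distances.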

It is now well known that issues with traditional RDBMS implementations, their architectures and assumptions, mean that they are insufficient to manage and maintain this kind of data, at least by themselves. This has led to the evolution of BigData, with NoSQL and NewSQL implementations. There is a range of different approaches, including tuple space and graph-based databases, because one size does not fit all. Whereas the RDBMS is a good generic workhorse that cannot be optimised for specific use cases, these new implementations target those use cases specifically and, conversely, would make poor generic solutions.
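To give a flavour of how different one of these approaches is from tables and SQL, here is a toy, single-process sketch of the tuple space idea: data is written as plain tuples and retrieved by pattern matching rather than by query. The class and method names are illustrative assumptions, and using `None` as a wildcard is a simplification; real tuple space implementations are distributed and typically block until a matching tuple appears.

```python
class TupleSpace:
    """A toy, in-memory tuple space (illustration only)."""

    def __init__(self):
        self._tuples = []

    def write(self, tup):
        # Add a tuple to the space.
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern is a wildcard; other fields must be equal.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern):
        """Return and remove the first matching tuple, or None."""
        found = self.read(pattern)
        if found is not None:
            self._tuples.remove(found)
        return found


space = TupleSpace()
space.write(("sensor", "kitchen", 21.5))
space.write(("sensor", "garage", 12.0))

garage = space.read(("sensor", "garage", None))   # stays in the space
first = space.take(("sensor", None, None))        # removed from the space
```

This fits the sensor-style data mentioned earlier quite naturally, but there are no joins, indexes or ad hoc queries, which is the sense in which such a store would make a poor generic solution.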

It is extremely unlikely that a single standard, such as SQL, will evolve that works across all NoSQL implementations. However, there may be a set of standards, one for each category of approach. But for many, SQL will still remain a necessity, even if it means they cannot benefit from all of the advantages NoSQL offers: more people understand SQL, and more applications are based on it, than anything else. So bridging these worlds is important. Finally it is worth noting that business analytics and sensor analytics will play a crucial role here.

In our industry we are now seeing an explosion in new database vendors looking to become the standard for this next generation. The current generation of developers is more heavily influenced by mobile, social and ubiquitous computing than by mainframes, so the RDBMS is not their natural first thought when considering how to manage data. However, most of these new companies are small open source startups and recognise the problems inherent in both: customers not trusting their important data to small companies that could go under or be acquired by a competitor; other vendors taking their code and creating competing products from it.

Furthermore, some large vendors who have failed to make an impact in the data space or inroads in enterprise middleware see this new area and these new companies as opportunities. As a result, relationships are being created between these two categories of companies in a symbiotic manner. Many of these NoSQL/NewSQL companies are going to merge, be acquired or fail. In the meantime new approaches and companies will be created. Over the next 5 years this new data space, which will integrate with the current RDBMS area, will coalesce and solidify. There are no obvious winners at this stage, but what is clear is that open source will play a critical role.

Tuesday, February 21, 2012

Clouds for Enterprises (C4E) 2012

Call for Papers: The 2nd International Workshop on
Clouds for Enterprises (C4E) 2012
http://nicta.com.au/people/tosicv/c4e2012/
held at the 16th IEEE International EDOC Conference (EDOC 2012) "The Enterprise Computing Conference"
(http://www.edocconference.org), Beijing, China, 10-14 September 2012

Important dates:
Paper submission: Sunday, 1 April 2012
Notification of acceptance: Monday, 28 May 2012
Camera-ready version: Friday, 15 June 2012

Description:
Cloud computing is an increasingly popular computing paradigm that aims to streamline the on-demand provisioning of software (SaaS), platform (PaaS), infrastructure (IaaS), and data (DaaS) as services. Deploying applications on a cloud can help to achieve scalability, improve flexibility of computing infrastructure, and reduce total cost of ownership. However, a variety of challenges arise when deploying and operating applications and services in complex and dynamic cloud-based environments, which are frequent in enterprises and governments.
Due to the security and privacy concerns with public cloud offerings (which first attracted widespread attention), it seems likely that many enterprises and governments will choose hybrid cloud, community cloud, and (particularly in the near future) private cloud solutions. Multi-tier infrastructures like these not only promise vast opportunities for future business models and new types of integrated business services, but also pose severe technical and organizational problems.
The goal of this one-day workshop is to bring together academic, industrial, and government researchers (from different disciplines), developers, and IT managers interested in cloud computing technologies and/or their consumer-side/provider-side use in enterprises and governments. Through paper presentations and discussions, this workshop will contribute to the inter-disciplinary and multi-perspective exchange of knowledge and ideas, dissemination of results about completed and on-going research projects, as well as identification and analysis of open cloud research and adoption/exploitation issues.
This is the second Clouds for Enterprises (C4E) workshop - the first was held in 2011 at the 13th IEEE Conference on Commerce and Enterprise Computing (CEC'11) on Monday, 5 September 2011 in Luxembourg, Luxembourg. The C4E 2011 workshop program, posted on the workshop Web page http://nicta.com.au/people/tosicv/clouds4enterprises2011/, included the keynote "Blueprinting the Cloud" by Prof. Willem-Jan van den Heuvel, presentations of 3 full and 5 short peer-reviewed workshop papers, and the discussion session "Migrating Enterprise/Government Applications to Clouds: Experiences and Challenges". The workshop proceedings were published by the IEEE and included in the IEEE Xplore digital library, together with the proceedings of the main CEC'11 conference and the other co-located workshops. The Clouds for Enterprises (C4E) 2012 workshop will be held at another prestigious IEEE conference - the 16th IEEE International EDOC Conference (EDOC 2012) "The Enterprise Computing Conference" in Beijing, China, 10-14 September 2012. The main theme of the IEEE EDOC 2012 conference is "When Services in Cloud Meet Enterprises", so the C4E 2012 workshop is an excellent fit into and addition to the IEEE EDOC 2012 conference.
This Clouds for Enterprises 2012 workshop invites contributions from both technical (e.g., architecture-related) and business perspectives (with governance issues spanning both perspectives). The topics of interest include, but are not limited to:
Technical Perspective:
- Patterns and best practices in development for cloud-based applications
- Deployment and configuration of cloud services
- Migration of legacy applications to clouds
- Hybrid and multi-tier cloud architectures
- Architectural support for enhancing cloud computing interoperability and portability
- Architectural principles and approaches to cloud computing
- Cloud architectures for adaptivity or robustness
- Evaluation methods for cloud architectures
- Architectural support for dynamic resource management to support computing needs of cloud services
- Cloud architectures of emerging applications, such as mashup of enterprise/government services
- Impact of cloud computing on architecture of software and, more generally, IT systems
Enterprise/Government Application Perspective:
- Case studies and experience reports in development of cloud-based systems in enterprises and governments
- Analyses of cloud initiatives of different governments
- Business aspects of cloud service markets
- Technical and business support for various cloud service market roles, such as brokers, integrators, and certification authorities
- New applications and business models for enterprises/governments leveraging cloud computing
- Economic evaluation of cloud-based enterprises
Governance Perspective:
- Service lifecycle models
- Architectural support for security and privacy
- Architectural support for trust in/by cloud services
- Capacity planning of services running in a cloud
- Architectural support for quality of service (QoS) and service level agreement (SLA) management
- Accountability of cloud services, including mechanisms, algorithms and methods for monitoring, analyzing and reporting service status and usage profiles
- IT Governance and compliance, particularly in hybrid and multi-tier clouds

Review and publication process:
Authors are invited to submit previously unpublished, high-quality papers before
***1 April 2012***.
Papers published or submitted elsewhere will be automatically rejected. All submissions should be made using the EasyChair Web site http://www.easychair.org/conferences/?conf=c4e2012.
Two types of submissions are solicited:
* Full papers - describing mature research or industrial case studies, up to 8 pages long
* Short papers - describing work in progress or position statements, up to 4 pages long
Papers presenting and analyzing completed projects are particularly welcome. Papers about on-going research projects are also welcome, especially if they contain critical, qualitative and quantitative analysis of already achieved results and remaining open research issues. In addition, papers about experiences and comparative analysis of using cloud computing in enterprises and governments are also welcome. Submissions from industry and government are particularly encouraged. In addition to presentation of peer-reviewed papers this one-day workshop will contain a keynote from an industry expert and an open discussion session on practical issues of using clouds in enterprise/government environments.
Paper submissions should be in the IEEE Computer Society Conference Proceedings paper format. Templates (with guidelines) for this format are available at: http://www.computer.org/portal/web/cscps/formatting (see the blue box on the left-hand side). All submissions should include the author's name, affiliation and contact details. The preferred format is Adobe Portable Document Format (PDF), but Postscript (PS) and Microsoft Word (DOC) will be accepted in exceptional cases.
Inquiries about paper submission should be e-mailed to Dr. Vladimir Tosic (vladat at server: computer.org) and include "Clouds for Enterprises 2012 Inquiry" in the Subject line.
All submissions will be formally peer-reviewed by at least 3 Program Committee members. The authors will be notified of acceptance around
***28 May 2012***.
At least one author of every accepted paper MUST register for the IEEE EDOC 2012 conference and present the paper.
All accepted papers (both full and short) will be published by the IEEE and included in the IEEE Digital Library, together with the proceedings of the other IEEE EDOC 2012 workshops. A follow-up journal issue with improved and extended versions of the best workshop papers is also planned.

Workshop Chairs:
Dr. Vladimir Tosic, NICTA and University of New South Wales and University of Sydney, Australia; E-mail: vladat (at: computer.org) - primary workshop contact
Dr. Andrew Farrell, University of Auckland, New Zealand; E-mail: ahfarrell (at: gmail.com)
Dr. Karl Michael Göschka, Vienna University of Technology, Austria; E-mail: Karl.Goeschka (at: tuwien.ac.at)
Dr. Sebastian Hudert, TWT, Germany; E-mail: sebastian.hudert (at: twt-gmbh.de)
Prof. Dr. Hanan Lutfiyya, University of Western Ontario, Canada; E-mail: hanan (at: csd.uwo.ca)
Dr. Michael Parkin, Tilburg University, The Netherlands; E-mail: m.s.parkin (at: uvt.nl)

Workshop Program Committee:
The final list of the workshop Program Committee members will be posted soon at the workshop Web site: http://nicta.com.au/people/tosicv/c4e2012.

Sunday, February 19, 2012

HyperCard

A long time ago, and in what may seem to some as a galaxy far, far away, there was no web and no way of traversing resources via hyperlinks. In that time the PC was just taking off and most of us were lucky if we shared a computer with fewer than 5 people at a time! Back then I shared one of the original classic Macs and came across this wonderful piece of software that was to change the way I thought about the world. HyperCard was something I started to play with just because it was there and really for no other reason, but it quickly became apparent that its core approach of hypermedia was different and compelling. These days I can't recall all of the ways in which I used HyperCard, but I do remember that a few of them helped me in my roleplaying endeavours at the time (ok, not exactly work related, but sometimes you learn by doing, no matter what it is that you are doing!).

When the Web came along, the way that it worked seemed so obvious. Hyperlinks between resources, whether they're database records (cards) or servers, make a lot of sense for certain types of application. But extending it to a world wide mesh of disparate resources was a brilliant leap. I'm sure that HyperCard influenced the Web as it influenced several generations of developers. But I'm surprised at myself that I'd forgotten about it over the years. In fact it wasn't until the other day, when I was passing a shop window that happened to have an old Mac in it running HyperCard, that I remembered. It's over 20 years since those days, but we're all living under its influence.

Tuesday, February 14, 2012

Is Java the platform of the future?

I've mentioned this before, but I think we are living in a period of time where a bigger explosion of programming languages is occurring than at any time in the past four decades. Having lived through a number of the classic languages such as BASIC, Simula, Pascal, Lisp, Prolog, C, C++ and Java, I can understand why people are fascinated with developing new ones: whether it's compiled versus interpreted, procedural versus functional, languages optimised for web development or embedded devices, I don't believe we'll ever have a single language that's right for all developer requirements.

This Polyglot movement is a reality and it's unlikely to go away any time soon. Fast forward a few years and we may see far fewer languages around than today, but they will have been influenced strongly by their predecessors. I do believe that we need to make a distinction between the languages and the platforms that they inevitably spawn. And in this regard I think we need to learn from history now and quickly: unlike in the past we really don't need to reimplement the entire stack in the next cool language. I keep saying that there are core services and capabilities that transcend middleware standards and implementations such as CORBA or Java Enterprise Edition. Well guess what? That also means they transcend the languages in which they were written originally.

This is something that we realised well in the CORBA days, even if there were problems with the architecture itself. The fact that IDL was language neutral meant your application could be constructed from components written in Java, COBOL and C++ without you either having to know or really having to care. Java broke that mould to a degree, and although Web Services are language independent, there's been so much backlash over SOAP, WSDL and friends that we forget this aspect at times. Of course it's an inherent part of REST.

However, if you look at what some are doing with these relatively new languages, there is a push to implement the stack in them from scratch. Now whilst it may make perfect sense to reimplement some components or approaches to take best advantage of some language capabilities (nginx, for example), I don't think it's the norm. I think the kind of approach we're seeing with, say, TorqueBox or Immutant, where services implemented in one language are exposed to another in a way that makes them appear as if they were implemented natively, makes far more sense. Let's not waste time rehashing things like transactions, messaging and security, but instead concentrate on how best to offer these capabilities to the new polyglot movement in a way that makes them fit in as first class citizens.
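As a rough illustration of what "fitting in as first class citizens" might mean, consider a transaction service with an explicit begin/commit/rollback interface being offered to Python developers. Rather than reimplementing it, a thin wrapper can present it through the language's own idiom, here a `with` block. Everything below is a hypothetical sketch: `HostedTransactionService` is a stand-in that merely records calls, not the API of TorqueBox, Immutant or any real transaction manager.

```python
from contextlib import contextmanager


class HostedTransactionService:
    """Stand-in for a mature service hosted elsewhere (hypothetical API)."""

    def __init__(self):
        self.log = []

    def begin(self):
        self.log.append("begin")

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


@contextmanager
def transaction(service):
    """Expose begin/commit/rollback as an idiomatic `with` block."""
    service.begin()
    try:
        yield service
    except Exception:
        # Any error inside the block aborts the transaction, then propagates.
        service.rollback()
        raise
    else:
        service.commit()


service = HostedTransactionService()
with transaction(service):
    pass  # work done here commits on normal exit

failing = HostedTransactionService()
try:
    with transaction(failing):
        raise ValueError("simulated failure")
except ValueError:
    pass  # the wrapper rolled back before re-raising
```

The developer never calls `begin` or `commit` directly, and need not know what language the underlying service is written in; the wrapper only decides how the capability is presented.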

And to do this successfully is much more than just a technical issue; it requires an understanding of what the language offers and what the communities expect, and working with both to fit in seamlessly. Being a Java programmer trying to push Java services into, say, Ruby, with a Java programmer's approaches and understanding, will not guarantee success. You have to understand your users and let them guide you as much as you guide them.

So I still believe that in the future Java will, should and must play an important part in Cloud, mobile, ubiquitous computing etc. It may not be obvious to developers in these languages that they're using Java, but then it doesn't need to be. As long as they have access to all of the services and capabilities they need, in a way that feels entirely natural to them, why should it matter if some of those bits are hosted on or by a Java application server, for instance? The answer is that it shouldn't. And done right it means that these developers benefit from the maturity and reliability of these systems, built up over many years of real world deployments. Far better than the alternative.

Thursday, February 09, 2012

The future of Java

Just a couple of cross posts that are worth giving a wider distribution. First on whether this new polyglot movement is the death of Java, and second how the JCP process has been changing for the better over the years.

Tuesday, January 31, 2012

Blogging versus tweeting?

A few years ago when I was thinking about creating a twitter account I pondered whether it was worth doing when I was blogging. I didn't think I'd use it much! Since creating the account I've been drawn into twitter more and more, so that today I'm finding the roles reversed: blogging is becoming less frequent whilst tweeting is increasing for me.

I think the reason why is pretty obvious: it is so much easier and quicker to tweet than to write a blog post. But there are obvious limits in what you can say with 140 characters, so it's not an either/or situation for me. And yet as a result of using twitter I'm finding myself thinking less and less about blogging. That bit I don't quite understand. Now maybe it has nothing to do with my use of twitter; maybe I'd be blogging less regardless of it because of work, family life etc. Who knows? But I do know I find it interesting how twitter has insinuated itself into my life so quickly and seamlessly.

Sunday, January 01, 2012

Transactions on Android

Every year I try to make time for pet projects, be they learning new languages such as Erlang (one of my 2007 efforts), writing a discrete event simulation package in C++, or one of my best which was writing the world's first pure Java transaction service over Christmas 1996. Most of the time I don't manage to make much progress throughout the year, leaving the bulk of the effort for over the Christmas break.

This year was no different, with "port Arjuna (aka JBossTS) to Android" on my to-do list for months. I've been playing around with Android for quite a while, even collaborating with some friends on writing a game (iPhone too). I know that although it's Java-based, there are enough differences to make porting certain Java applications tricky. But over the years I have found porting transactions to different languages and environments a pretty good way to learn about the language or environment in question.

So as well as doing my usual catch-up on reading material, breaking the back of the Android port was top of my list. Now in the past I'd have higher expectations of what I could accomplish in this time period, but these days I have a family and some things take priority (well, most of the time). But once everyone had opened their presents, let the turkey settle in the stomach and sat down to watch The Great Escape (hey, it's Christmas!) I found time to kick it off.

I started simple in order to remove as many variables from the problem as possible. So I went back to JavaArjuna, the ancestor of JBossTS and all that predated it. It has none of the enhancements that we've added over the years, but places fewer requirements on the infrastructure. For instance, it was JavaArjuna that I ported to the HP Jornada back in 2001 because it also worked with earlier versions of Java.

As in 2001 it went well and it wasn't long before I had transactions running on my Android device. It was nice to see one of the basic tests running and displaying the typical transaction UIDs, statuses, rolling back, committing, suspending etc. Then I moved on to JBossTS. It wasn't quite as straightforward and there are a few hacks or workarounds in there while I figure out the best way to fix things, but it's done too! I'm pretty pleased by the results and will spend whatever time I have in the coming weeks to address the open issues. And I've definitely learned a lot more about Android.

So overall I think it's been a good pet project for 2011. It also showed me yet again that the architecture and code behind JBossTS that the team's been working on for years is still a highly portable solution. It doesn't matter whether you want transactions on a mainframe, in the cloud, or on a constrained device: JBossTS can do them all!