Monday, June 18, 2012

Expand and Update your BigInsights

Roughly a year ago, you were reading about how MIT was harvesting big insights from big data, about discovering useful information by turning huge amounts of data into gists that are visually understandable. At that time, I was also in awe of IBM's Watson and how InfoSphere BigInsights is used to create a Smarter Planet.

But that was a year ago. Since then, a lot has happened.
On the open source side, the components integrated in the BigInsights platform have made steady progress; take Hadoop, for instance, which matured enough to reach version 1.0.
Meanwhile, InfoSphere BigInsights grew just as much: BigInsights version 1.4 was released last Friday. A nice tradition I wish other companies would copy from IBM is that the BigInsights v1.4 GA coincided with its availability for deployment on public clouds: anyone can create their own Hadoop cluster (based on BigInsights v1.4) in the cloud in less than 30 minutes.

BigInsights v1.4 available for deployment on public clouds

The very same day, the 61 participants at the Big Data Developer Day hands-on labs used BigInsights v1.4 — delivered via Cloud instances.


The new version brings simple administration and management capabilities, rich developer tools, powerful analytic functions, up-to-date Apache Hadoop and associated projects, as well as many enterprise features and enhancements. The new capabilities are aimed at improving flexibility, consumability, and manageability.
Here are some updates I cherry-picked based on my interests and needs:

Update to open source component levels

Here are the new versions of the open source components shipped with BigInsights 1.4:

  • Hadoop 1.0.0
  • Flume 0.9.4
  • HBase 0.90.5
  • Hive 0.8.0
  • Oozie 2.3.1
  • Nutch 1.4
  • Pig 0.9.1
  • Zookeeper 3.3.4

Consumability and Usability

  • Text analytics development: An enhanced user experience for developing text analytics applications, with improved navigation of result views, enhanced sorting and filtering, enhanced pattern discovery, and progress reporting.
  • Developer tools: Better built-in support for text analytics, support for local mode map-reduce development, improved deployment of applications, and automatic creation of JDBC connections to Hive data sources.
  • BigSheets and web console: New chart customization features make it easier to access and manipulate data the way you want. New sheets, macros, and readers make it possible to access more data and analyze it in new ways, giving you more control and improved browsing capabilities for HDFS and NFS files. The addition of new application input parameter types makes applications even easier to run.


Version 1.4 of InfoSphere BigInsights brings support for Cloudera's Distribution of Apache Hadoop (CDH). This allows enterprises either to run BigInsights with the IBM-provided Apache Hadoop distribution or to deploy it to a Cloudera CDH cluster. Conversely, Cloudera CDH users can now take advantage of the enterprise-class features available in BigInsights, such as text analytics, user-friendly data manipulation and exploration, and developer tools.


If you want to join the Big Data Developer Day participants, you can get first-hand experience by either getting the free version, InfoSphere BigInsights Basic Edition, or whetting your appetite with the Hadoop and Big Data courses on
