Monday, December 10, 2012

Looking for a passion for Ruby, cloud and big data

Back in 2010…

…Antonio announced that IBM was looking for top-notch student hackers.

After a process that provided many skill-building lessons, Henrique and I joined the team at the IBM Toronto Software Lab.

Today…

…it's time once again to look for the right attitude and the potential to match.
We are looking for two students who are passionate about technology and their craft!

I am biased, but I believe these are the best positions a computer science student could want. Since I flew in straight from Romania, I have grown in an Agile environment with Ruby and Rails (from zero to deployment), cloud computing, big data and related technologies.
Our team is best described as a "Startup within IBM". You get the best of both worlds: a highly challenging environment where you get to try to do absolutely everything and a stable environment where you can focus on developing great ideas without being distracted by the instability of a startup.

If…

  • you are keen on using bleeding edge technologies
  • you have a hacker mentality
  • you are willing to work in a startup-like setting but enjoy the stability and resources of a leading technology company

then you will gain experience that is unmatched. You will work hand in hand with some of the best talent in the industry architecting and building the coolest tech. Oh, and this internship is for a period of 16 months, and yes: it's a paid internship.

If you are looking for a rewarding challenge, and if your eyebrow rose in a positive way while reading this far, please get in touch (marius dot butuc at gmail dot com) and tell me a bit about yourself. I will provide you with the information on how to apply through the official IBM channels and we'll take it from there.

Friday, September 14, 2012

IBM Mobile Database for Android

Today IBM Mobile Database – the new database offering for Android mobile devices – went GA. IBM Mobile Database offers a tight integration between a customer’s mobile solution and their existing DB2 or Informix environment. It is being offered as a free-of-charge web download.

The new offering makes it easier for mobile developers to develop and assemble applications for Android devices. Together with the solidDB offering, IBM Mobile Database provides the capability to synchronize data with DB2 and Informix databases.

Mobile, but fit for the Enterprise

The major highlights include:

  • 6 MB footprint, in-memory database to fit on mobile devices
  • Full-featured relational DB with standard SQL API, procedures, triggers
  • Fast and reliable access to enterprise data offline
  • Enterprise level data security
  • Persistent data storage and Automatic recovery
  • Transactional storage also during connection loss
  • Built-in replication capabilities allowing synchronization with IBM databases
  • Flexible options such as partitioning data or creating views to customize data for each device or user

What Android mobile app would you build to put these features to good use?

Wednesday, August 8, 2012

Moodle localization: to enrol or to enroll?

Ever wondered how language localization would affect a Moodle-powered online educational site with more than 38,000 active users?

Moodle Trivia

We all know Moodle—the FLOSS CMS/LMS (a.k.a. e-learning software platform)—was first baked all the way in Australia. And you could see that Moodle was Australian at heart long before it grew to be the large international community that it is today.

One of these Australian fingerprints is the language it chooses to speak… err… spell.
The language pack installed and used by default by Moodle is English (en) and it prefers the British English spelling over the American English one.

Going international

No big deal, you might say. Well, the numbers can decide, so let's take a specific example: BigDataUniversity.com.

Big Data University, the online educational site about big data, has more than 38,000 users (at the time of writing). Once registered, they take free big data-related courses on subjects ranging from Hadoop Fundamentals to Streams Computing and Text Analytics.
Quite the vibrant, diverse community, and it's all powered by Moodle.

enrol - British English

One detail worth noting: more than 1 out of 4 users are from the US. Now imagine the number of users who were eager to enroll in a course, only to find enrolment options… and how many emails reporting the enrol vs. enroll "misspelling" we have received so far.

And this is not an isolated case. The Moodle forums are proof of how many users asked for an elegant way to change this spelling. This question is so popular, it even made it to the Moodle FAQ.

The elegant way to change enrol to enroll

The elegant solution is to change the default language pack used by Moodle:

  1. Install the English - United States (en_us) language pack in
    Settings » Site administration » Language » Language packs.
  2. Set the new language pack as the default language for the site via
    Settings » Site administration » Language » Language settings.

    Moodle: changing your preferred language
    Note that this change only affects new accounts, while existing users retain their language setting. If they want to use American English, they can change it in
    Settings » My profile settings » Edit profile » Preferred language
  3. …and it works!
    enroll - American English

Sometimes it's the small details that will make your users happy.

Monday, July 30, 2012

Install and Configure S3cmd on Mac OS X

The big, big picture

I've recently been tinkering with the best practices for the backup and recovery of DB2 data servers. I noticed that many people focus mainly on their backup strategy to ensure the safety and availability of the data. But the best backup strategy is not of much use if it is not backed by a strong recovery strategy. An efficient recovery strategy must enable you to quickly restore access to data after a software, hardware, or user failure.

So my goal is to turn the two DB2 servers powering BigDataUniversity.com into an example of DB2 best practices for backup and restore, by having:

  • an effective backup strategy
    • providing continuous availability during backup,
    • and, ideally, the ability to perform the backup process on the HADR Standby instead of the Primary (hello, utopia!)
  • a rapid recovery strategy
    • including continuous availability during restore
  • and a strategy for backup retention.

s3cmd makes your life easy

Part of the optimization is persisting the backups on S3 and leveraging s3cmd to automate their deployment. Inside the Ubuntu-powered Amazon instances, everything went smoothly. When I wanted to experiment from my Mac, I first looked into ruby-s3cmd. The Ruby gem felt like it was bogging me down, so I went back to s3cmd and simply installed it:

  1. first get s3cmd
  2. then switch to root just for the install: sudo python setup.py install
  3. at last, run s3cmd --configure to set up your AWS S3 credentials
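Once s3cmd is configured, pushing a backup image to S3 could look roughly like the sketch below. The bucket name, the db2/DATE/ key layout, and the function names are my own assumptions for illustration, not something the setup above prescribes; only s3cmd put itself is a real s3cmd command.

```shell
#!/bin/sh
# Sketch only: the bucket, key layout and function names are hypothetical.

# Build the S3 destination for a backup file: s3://BUCKET/db2/DATE/FILENAME
backup_s3_uri() {
    bucket=$1
    file=$2
    day=$3
    printf 's3://%s/db2/%s/%s\n' "$bucket" "$day" "$(basename "$file")"
}

# Upload one backup image with "s3cmd put".
# Assumes "s3cmd --configure" has already stored the AWS credentials.
upload_backup() {
    bucket=$1
    file=$2
    s3cmd put "$file" "$(backup_s3_uri "$bucket" "$file" "$(date +%Y-%m-%d)")"
}
```

With these functions sourced, a call such as upload_backup my-backup-bucket /path/to/backup.image would store the file under a dated prefix, which keeps one folder per day and makes a retention policy easier to reason about.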

…and enjoy!

Monday, June 18, 2012

Expand and Update your BigInsights

Roughly a year ago, you were reading about how MIT was harvesting big insights from big data, about discovering useful information by turning huge amounts of data into gists that are visually understandable. At that time, I was also in awe of IBM's Watson and how InfoSphere BigInsights is used to create a Smarter Planet.

But that was a year ago... Since then, much has happened.
On the Open Source side, the components integrated in the BigInsights platform made steady progress; take Hadoop, for instance, which matured enough to reach version 1.0.
Meanwhile, InfoSphere BigInsights grew just as much: version 1.4 was released last Friday. A nice tradition I wish other companies would copy from IBM is that the BigInsights v1.4 GA coincided with its availability for deployment on public clouds: anyone can create their own Hadoop cluster (based on BigInsights v1.4) on the Cloud in less than 30 minutes.

BigInsights v1.4 available for deployment on public clouds

The very same day, the 61 participants at the Big Data Developer Day hands-on labs used BigInsights v1.4 — delivered via Cloud instances.

***

The new version brings simple administration and management capabilities, rich developer tools, powerful analytic functions, up-to-date Apache Hadoop and associated projects, as well as many enterprise features and enhancements. The new capabilities are aimed at improving flexibility, consumability, and manageability.
Here are some updates I cherry-picked based on my interests and needs:

Update to open source component levels

Here are the new versions of the open source components shipped with BigInsights 1.4:

  • Hadoop 1.0.0
  • Flume 0.9.4
  • HBase 0.90.5
  • Hive 0.8.0
  • Oozie 2.3.1
  • Nutch 1.4
  • Pig 0.9.1
  • Zookeeper 3.3.4

Consumability and Usability

  • Text analytics development: Provide an enhanced user experience for developing text analytics applications, with improved navigation of result views, enhanced sorting and filtering, enhanced pattern discovery and progress reporting.
  • Developer tools: Better built-in support for text analytics, support for local mode map-reduce development, improved deployment of applications, and automatic creation of JDBC connections to Hive data sources.
  • BigSheets and web console: New chart customization features make it effortless to access and manipulate data the way you want. New sheets, macros, and readers make it possible to access more data and analyze it in new ways, giving you more control and improved browsing capabilities for HDFS and NFS files. The addition of new application input parameter types makes applications even easier to run.

Flexibility

Version 1.4 of InfoSphere BigInsights brings support for Cloudera's Distribution of Apache Hadoop (CDH). This allows enterprises either to run BigInsights with the IBM-provided Apache Hadoop distribution or to deploy to a Cloudera CDH cluster. Conversely, Cloudera CDH users can now take advantage of enterprise-class features such as text analytics, user-friendly data manipulation and exploration, and the developer tools available in BigInsights.

***

If you want to join the Big Data Developer Day participants, you can get first-hand experience with the free version, InfoSphere BigInsights Basic Edition, or whet your appetite with the Hadoop and Big Data courses on BigDataUniversity.com.

Thursday, June 7, 2012

Why am I envious of Big Data geeks in the Valley?

  1. Hacking your way through Big Data and taking pride with Hadoop on your tool belt?
  2. Also living in the Silicon Valley? Or among the Hadoop Summit attendees?

Did you just answer yes to both questions?!?

Well then, I'm a bit envious of you because next week, Silicon Valley is the place to be for us, Hadoop and Big Data geeks. Next week, the concentration of Big Data talent will be so high in the Valley that I tend to agree with Leon on this one:

I am always envious of the people who live in Silicon Valley. It is not the California weather that I crave though it is nice. If you like technology there never seems to be a shortage of meetups, conferences and all around interesting events.

First, on June 13th and 14th, Yahoo and Hortonworks team up to bring you a Hadoop Summit with an attractive agenda and speaker list. So many tracks, so many speakers, so many talks, that I’m not even going to mention them myself; just head over and whet your appetite. :)

The very next day, June 15th, IBM is inviting people over for a free (breakfast and lunch provided) Big Data Developer Day. The day will blend hands-on labs and interactive discussions with the opportunity to meet other technical folks and exchange knowledge.

If you are interested in

  • Hadoop scripting,
  • real time in-memory analytics,
  • Big Data for social media,
  • log analytics,
  • Big Data in general

June 15th is your full day of time well spent with senior technical leaders of our Big Data development team.

So, if you are in the Valley and have the day available to invest in your Hadoop and Big Data skills, I recommend you register soon, as the number of participants is limited.

Monday, February 13, 2012

Ruby 1.9.3 via RVM on Mac OSX Lion: Success Story

Ruby 1.9.3: freakin' fast bro!
That fast!

After a long time away from Rails and Ruby, I roll up my sleeves today and try to figure out what I’ve missed lately:

  • first, Rails 3.2 proves to be engineered to work faster in dev mode, by now incorporating the Active Reload gem by default.
  • then reading some more, Ruby 1.9.3 bubbles up — it’s freaking fast bro!

Great news, time for our good old friend RVM:

  1. $ rvm get head
  2. $ rvm reload
  3. $ rvm install 1.9.3

But wait…

The Problem

Installing Ruby from source to: /Users/marius/.rvm/rubies/ruby-1.9.3-p0, this may take a while depending on your cpu(s)...

ruby-1.9.3-p0 - #fetching 
ruby-1.9.3-p0 - #extracted to /Users/marius/.rvm/src/ruby-1.9.3-p0 (already extracted)
Fetching yaml-0.1.4.tar.gz to /Users/marius/.rvm/archives
Extracting yaml-0.1.4.tar.gz to /Users/marius/.rvm/src
Configuring yaml in /Users/marius/.rvm/src/yaml-0.1.4.
Compiling yaml in /Users/marius/.rvm/src/yaml-0.1.4.
Installing yaml to /Users/marius/.rvm/usr
ruby-1.9.3-p0 - #configuring 
ERROR: Error running ' ./configure --prefix=/Users/marius/.rvm/rubies/ruby-1.9.3-p0 --enable-shared --disable-install-doc --with-libyaml-dir=/Users/marius/.rvm/usr ', please read /Users/marius/.rvm/log/ruby-1.9.3-p0/configure.log
ERROR: There has been an error while running configure. Halting the installation.

The installer fails with an error message that includes "checking whether the C compiler works... no", even with Xcode 4.2 available.

The Wrong Path

I checked configure.log, found the related questions on Stack Overflow, and went for the quick fix:

$ rvm reinstall 1.9.3 --with-gcc=clang

which didn’t work for me!

The Solution

Following mpapis’ advice, I downloaded the GCC Installer for OSX 10.7+, v2, by Kenneth Reitz, and a simple

$ rvm reinstall 1.9.3

did the trick! So I was happy to jump to the next step:

$ rvm --default use 1.9.3

The Bonus

Downloading and installing the massive Xcode tool suite (2.5GB!!!) is a huge hassle if you just want GCC and related tools.

The osx-gcc-installer includes the essential compilers:

  • GCC
  • LLVM
  • Clang
  • Developer CLI Tools (purge, etc)
  • DevSDK (headers, etc)

Therefore, since I’m not planning to use Xcode for other reasons, I simply removed it:

$ sudo /Developer/Library/uninstall-devtools --mode=all

If the Rails install fails with a message along these lines:

ERROR:  Error installing rails:
 ERROR: Failed to build gem native extension.

        /Users/marius/.rvm/rubies/ruby-1.9.3-p0/bin/ruby extconf.rb
creating Makefile

make
sh: make: command not found

don't worry, just run the GCC Installer once again.
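Both failures in this post (configure not finding a working C compiler, and make: command not found) boil down to build tools missing from the PATH. A quick check before running rvm install or gem install could be sketched as follows; the function names and the list of tools to look for are my own picks for illustration, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch only: function names and the default tool list are hypothetical.

# Print the name of every given tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed only when both make and gcc are available.
build_tools_ready() {
    [ -z "$(missing_tools make gcc)" ]
}
```

Running missing_tools make gcc clang prints one line per missing tool and nothing when all of them are installed, so empty output suggests the GCC Installer did its job.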
Success!
©the-chosen-pessimist