C++ jobs on Hadoop

1. Execute C++ on Hadoop

In this tutorial you will see how to set up and run Hadoop jobs in native C++.
This can be quite tricky, but don't worry: I will walk you through all the steps to get you up and running.

First you need a Hadoop environment. I recommend downloading Cloudera's virtual machine, which you can run on VirtualBox.

The virtual machine is a large download (4.2 gigabytes).

2. Preparing our environment

If you downloaded the VM from the previous chapter, you need to make some preparations to it before you can get MR4C (MapReduce for C) up and running. Follow the list below from top to bottom to get everything set up correctly.

Here is the list of bash commands:

||First update the system:
yum -y update

||Then install the development tools:
yum groupinstall "Development Tools"

||Then check gcc:
gcc --version
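
|| Optional: the version comparison in the next step can be scripted. This is a minimal sketch, not part of the original instructions. The helper name version_lt is made up for illustration, and it assumes GNU sort (for the -V option) is installed:

```shell
# Hypothetical helper: succeeds when version $1 is strictly older than version $2.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

required="4.6.3"
if command -v gcc >/dev/null 2>&1; then
    current="$(gcc -dumpversion)"
    if version_lt "$current" "$required"; then
        echo "gcc $current is older than $required: build the newer gcc"
    else
        echo "gcc $current is new enough: skip the GCC section"
    fi
else
    echo "gcc not found: install the development tools first"
fi
```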

||If the version is older than 4.6.3, use the following section to install a newer one; if the version is 4.6.3 or newer, skip this section:
||============================================== GCC 4.6.3
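
|| The build steps in this section assume the four source archives are already in the current directory. A small sketch for checking that before you start; missing_archives is a hypothetical helper name, and the file names are the ones used in the commands of this section:

```shell
# Hypothetical helper: prints the name of every listed file that is missing
# from the current directory, one per line; prints nothing if all are present.
missing_archives() {
    for archive in "$@"; do
        [ -f "$archive" ] || echo "$archive"
    done
}

missing="$(missing_archives gmp-4.3.2.tar.bz2 mpfr-2.4.2.tar.bz2 \
    mpc-0.8.1.tar.gz gcc-4.6.3.tar.bz2)"
if [ -n "$missing" ]; then
    echo "Download these archives before continuing:"
    echo "$missing"
fi
```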


||1. build & install gmp:

tar jxf gmp-4.3.2.tar.bz2 && cd gmp-4.3.2/
./configure --prefix=/usr/local/gmp
make && make install
cd ..

||2. build & install mpfr:
tar jxf mpfr-2.4.2.tar.bz2 && cd mpfr-2.4.2/
./configure --prefix=/usr/local/mpfr --with-gmp=/usr/local/gmp
make && make install
cd ..

||3. build & install mpc:
tar xzf mpc-0.8.1.tar.gz && cd mpc-0.8.1
./configure --prefix=/usr/local/mpc --with-mpfr=/usr/local/mpfr --with-gmp=/usr/local/gmp
make && make install
cd ..

||4. build & install gcc 4.6.3:

tar jxf gcc-4.6.3.tar.bz2 && cd gcc-4.6.3
./configure --prefix=/usr/local/gcc --enable-threads=posix --disable-checking --disable-multilib --enable-languages=c,c++ --with-gmp=/usr/local/gmp --with-mpfr=/usr/local/mpfr/ --with-mpc=/usr/local/mpc/

||Make sure there are no errors and proceed with:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mpc/lib:/usr/local/gmp/lib:/usr/local/mpfr/lib/
make && make install

|| Now you need to add the library configuration. Create a new file:

nano /etc/ld.so.conf.d/gcc.4.6.3.conf

|| And paste the following into it


|| Save the file, then run these commands:

mv /usr/bin/gcc /usr/bin/gcc_old
mv /usr/bin/g++ /usr/bin/g++_old
mv /usr/bin/c++ /usr/bin/c++_old
ln -s -f /usr/local/gcc/bin/gcc /usr/bin/gcc
ln -s -f /usr/local/gcc/bin/g++ /usr/bin/g++
ln -s -f /usr/local/gcc/bin/c++ /usr/bin/c++
cp /usr/local/gcc/lib64/libstdc++.so.6.0.16 /usr/lib64/.
mv /usr/lib64/libstdc++.so.6 /usr/lib64/libstdc++.so.6.bak
ln -s -f /usr/lib64/libstdc++.so.6.0.16 /usr/lib64/libstdc++.so.6

|| NOTE: If you get an error on the first command (renaming gcc to gcc_old), it means that gcc lives in a different directory and you should use /usr/local/bin/ instead of /usr/bin/. Example:

mv /usr/local/bin/gcc /usr/local/bin/gcc_old
mv /usr/local/bin/g++ /usr/local/bin/g++_old
mv /usr/local/bin/c++ /usr/local/bin/c++_old
ln -s -f /usr/local/gcc/bin/gcc /usr/local/bin/gcc
ln -s -f /usr/local/gcc/bin/g++ /usr/local/bin/g++
ln -s -f /usr/local/gcc/bin/c++ /usr/local/bin/c++
cp /usr/local/gcc/lib64/libstdc++.so.6.0.16 /usr/lib64/.
mv /usr/lib64/libstdc++.so.6 /usr/lib64/libstdc++.so.6.bak
ln -s -f /usr/lib64/libstdc++.so.6.0.16 /usr/lib64/libstdc++.so.6
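
|| The two variants above differ only in the directory that holds the existing gcc. Instead of guessing, you can look it up before running the mv commands. A minimal sketch; tool_dir is a hypothetical helper name, not something from the original instructions:

```shell
# Hypothetical helper: prints the directory containing the given command,
# or returns non-zero if the command is not on PATH.
tool_dir() {
    tool_path="$(command -v "$1")" || return 1
    dirname "$tool_path"
}

if dir="$(tool_dir gcc)"; then
    echo "gcc lives in $dir, so use $dir in the mv and ln commands"
else
    echo "gcc is not on PATH"
fi
```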

||============================================== GCC 4.6.3
|| If you now run gcc --version you should get gcc 4.6.3
|| Proceed with installing Java and the other dependencies

wget https://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
rpm -ivh epel-release-6-8.noarch.rpm
yum install ant
yum install java-1.7.0-openjdk java-1.7.0-openjdk-devel
yum install cppunit cppunit-devel
yum install libpng
yum install libtiff
yum install git
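
|| Before moving on, it can help to confirm that the tools installed above are actually on the PATH. A minimal sketch; missing_commands is a hypothetical helper name, and the list of commands checked is an assumption based on the packages above:

```shell
# Hypothetical helper: prints each named command that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

missing="$(missing_commands ant java javac git)"
if [ -z "$missing" ]; then
    echo "ant, java, javac and git are all available"
else
    echo "Still missing:"
    echo "$missing"
fi
```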

|| Install log4cxx from

|| Install jansson 

git clone https://github.com/akheron/jansson.git
cd jansson
autoreconf -i
./configure
make
make install

|| Install proj and gdal


|| Install apache-ivy
git clone --recursive https://github.com/apache/ant-ivy
cd ant-ivy
ant jar

|| You will have to copy ivy.jar to /root/.ant/lib
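
|| The copy step can be sketched as below. install_ant_lib is a hypothetical helper name. The location of the jar produced by "ant jar" is an assumption here (somewhere under the build directory), so the sketch lists candidates rather than hard-coding a path:

```shell
# Hypothetical helper: copies a jar into Ant's user library directory,
# creating the directory first if it does not exist yet.
install_ant_lib() {
    jar="$1"
    lib_dir="${2:-/root/.ant/lib}"
    [ -f "$jar" ] || { echo "jar not found: $jar" >&2; return 1; }
    mkdir -p "$lib_dir" && cp "$jar" "$lib_dir/"
}

# Assumption: after "ant jar" the jar is somewhere under the build directory.
# Print any candidates so the right one can be passed to install_ant_lib.
if [ -d build ]; then
    find build -name 'ivy*.jar'
fi
```

|| Once you know the path, pass it as the first argument to install_ant_lib. The second argument is optional and defaults to /root/.ant/lib.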

|| After all that is done you can start installing mr4c

git clone --recursive https://github.com/google/mr4c
cd mr4c
cd test

|| Done.

|| NOTE: do not copy any line that starts with ||; those are comments. Every other line is a separate command you need to execute.
|| If you want to deploy on Ubuntu, you can install the dependencies like this:

apt-get install ant python-software-properties liblog4cxx10 liblog4cxx10-dev build-essential g++ libjansson libjansson-dev libcppunit-dev libcppunit binutils libproj-dev gdal-bin

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install oracle-java8-set-default


3. Running our first job!!

It's finally time to get some fun jobs up and running. Please join me in the following tutorial.

Date Created 2015-09-11
Author : fb-fac3b0ok 