Compiling MapReduce Programs Using Eclipse

As part of our class, we run MapReduce programs on AWS machines that we connect to using PuTTY, so we have only a command-line interface for creating these programs. Writing a MapReduce Java program this way is difficult and time-consuming: we have to write or copy the code, compile it, and create a jar, all from the command line. Compilation errors are likely along the way, and each one means reopening the file, editing it, and compiling again. So we felt that using Eclipse to compile the code and create the jar would make life easier.

In this post I will explain how to achieve this:

1. I created a jar from the Cloudera 4.7.1 repos using Maven. Click on the link below and extract the jar from the zip file.

HadoopJars-0.0.1-SNAPSHOT-jar-with-dependencies-without-pig

2. Open Eclipse, click File -> New -> Java Project, and create a new Java project.

[Screenshot: Hadoop Install]

3. Add the jar file to the build path: right-click the project, click Properties, select Java Build Path, and open the Libraries tab. Click Add External JARs, select HadoopJars-0.0.1-SNAPSHOT-jar-with-dependencies.jar from the location where you saved it earlier, and click OK.

[Screenshot: Set build path]

4. Create the classes you want to compile, such as the mapper, reducer, and driver classes, by right-clicking the project, clicking New -> Class, and writing the code.
5. Right-click the project and click Export.

6. Select JAR file under Java.

[Screenshot: Select add jar]

7. Select the destination directory and provide a name for the jar.

[Screenshot: Jar file name]

8. Click Finish.
9. Copy the jar to the Hadoop environment using WinSCP.
10. Run the job using the command below:

hadoop jar MRExample.jar MRDriver in out

If the class is created inside a package, use:

hadoop jar MRExample.jar packagename.MRDriver in out
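A common stumbling block at the last step is passing a driver class name that does not match what Eclipse actually exported, for example leaving out the package name. Before copying the jar over with WinSCP, it can help to check locally that the jar contains the driver class under the name you plan to pass to hadoop jar. Below is a minimal sketch of such a check using only the standard Java library. The class name JarEntryCheck and its helper methods are made up for this illustration and are not part of Hadoop or Eclipse.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper (not part of Hadoop): checks whether an exported jar
// contains the driver class you intend to pass to "hadoop jar".
public class JarEntryCheck {

    // Convert a class name such as "packagename.MRDriver" into the path
    // the compiled class has inside the jar: "packagename/MRDriver.class".
    public static String toEntryPath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Return true if the jar at jarPath has an entry for the given class name.
    public static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(toEntryPath(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarEntryCheck <jar file> <driver class name>");
            System.exit(2);
        }
        boolean found = containsClass(args[0], args[1]);
        System.out.println(args[1] + (found ? " found in " : " NOT found in ") + args[0]);
        // Exit code 0 when the class is present, 1 otherwise.
        System.exit(found ? 0 : 1);
    }
}
```

For the example above, you would run it as java JarEntryCheck MRExample.jar packagename.MRDriver. If the class is reported as not found, the name given to hadoop jar most likely needs the package prefix added or corrected.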

 
