As part of our class we run MapReduce programs on AWS machines, which we connect to using PuTTY, so we have only a command-line interface for creating these programs. Writing a MapReduce Java program this way is difficult and time-consuming: we have to write or copy the code, compile it, and create a jar, all from the command line. Any compilation error means reopening the file, editing it, and compiling again. So we felt that using Eclipse to compile the code and create the jar would make life easier.
In this post I will explain how to achieve this:
1. I created a jar with the Cloudera 4.7.1 repos using Maven. Click on the link below and extract the jar from the zip file.
2. Open Eclipse, click File -> New -> Java Project, and create a new Java project.
3. Add the jar file to the build path: right-click the project, click Properties, select Java Build Path, and click the Libraries tab. Click Add External JARs, select HadoopJars-0.0.1-SNAPSHOT-jar-with-dependencies.jar from the location where you saved it, and click OK.
4. Create the classes you want to compile, such as the mapper, reducer, and driver classes: right-click the project, click New -> Class, and write the code.
5. Right-click the project and click Export.
6. Select JAR file under Java.
7. Select the destination directory and provide a name for the jar.
9. Copy the jar to the Hadoop environment using WinSCP.
10. Run the job using the command below:
hadoop jar MRExample.jar MRDriver in out
If the driver class is inside a package, use its fully qualified name (package name, a dot, then the class name):
hadoop jar MRExample.jar packagename.MRDriver in out
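If you are not sure which class name to pass to hadoop jar, it can help to look inside the exported jar before copying it over. The JDK's own `jar tf MRExample.jar` lists the entries; the small program below is a rough sketch of the same idea that prints names in the dotted form the command expects. JarClassLister and its method names are my own hypothetical helper, not part of Hadoop or Eclipse, and I am assuming Java 7 or later.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper (not part of Hadoop): lists the class names inside a jar
// in the dotted form that "hadoop jar <jar> <class> ..." expects.
public class JarClassLister {

    // True for entries like "packagename/MRDriver.class"; false for resources
    // and for nested classes such as "packagename/MRDriver$MyMapper.class".
    public static boolean isTopLevelClass(String entryName) {
        return entryName.endsWith(".class") && !entryName.contains("$");
    }

    // Converts "packagename/MRDriver.class" to "packagename.MRDriver".
    public static String toClassName(String entryName) {
        String withoutSuffix =
                entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    // Returns the sorted fully qualified names of all top-level classes in the jar.
    public static List<String> listClasses(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (isTopLevelClass(name)) {
                    names.add(toClassName(name));
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java JarClassLister <path-to-jar>");
            System.exit(1);
        }
        for (String className : listClasses(args[0])) {
            System.out.println(className);
        }
    }
}
```

Compile it with javac and run it as `java JarClassLister MRExample.jar`. It prints one class name per line, and the driver's line is what goes after the jar name in the hadoop jar command. Nested mapper and reducer classes are skipped on purpose, since only the driver is named on the command line.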