Send executable jar to hadoop cluster and run as “hadoop jar”

I usually build an executable jar with a main method and run it from the command line as "hadoop jar Some.jar ClassWithMain input output".

In that main method, the Job and Configuration are set up, and the job configuration has setters to specify the mapper and reducer classes, e.g. conf.setMapperClass(Mapper.class).
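For context, a minimal sketch of such a driver using the org.apache.hadoop.mapreduce API might look like this. WordCountMapper and WordCountReducer are hypothetical classes standing in for your own; the key/value types are assumptions for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ClassWithMain {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "example job");
        job.setJarByClass(ClassWithMain.class);      // tells Hadoop which jar to ship to the cluster
        job.setMapperClass(WordCountMapper.class);   // hypothetical mapper class
        job.setReducerClass(WordCountReducer.class); // hypothetical reducer class
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input path from the command line
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path from the command line
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

When launched via "hadoop jar", this main method runs on the client with the Hadoop CLASSPATH already set up, which is exactly what the question wants to reproduce programmatically.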

However, when submitting a job remotely, I have to set the jar, the mapper class, and more myself through the Hadoop client API.


I want to programmatically transfer the jar from the client to a remote Hadoop cluster and execute it just as the "hadoop jar" command would, so that the main method itself specifies the mapper and reducer.

How can I deal with this problem?


hadoop is just a shell script. Eventually, hadoop jar invokes org.apache.hadoop.util.RunJar; all the script does beyond that is set up the CLASSPATH for you. So you can call RunJar directly.

For example,

String input = "...";
String output = "...";
org.apache.hadoop.util.RunJar.main(
    new String[]{"Some.jar", "ClassWithMain", input, output});

However, you need to set the CLASSPATH correctly before calling it. A convenient way to get the correct CLASSPATH is the hadoop classpath command: run it and it prints the full CLASSPATH.

Then set up the CLASSPATH before running your Java application. For example,

export CLASSPATH=$(hadoop classpath):$CLASSPATH
java -jar YourJar.jar
