Hadoop supports running jar files. For an executable jar file in normal Java execution, one can specify the main class on the command line, as covered in my previous post: switch between main classes in a jar file.
However, the rules are a bit different when an executable jar file is run with hadoop. The following rules hold (tested on Hadoop 1.0.3):
- If a jar file specifies a main class in its manifest file, hadoop uses that main class even if the command specifies another one. This differs from normal Java execution, where we can name a main class on the command line to override the one in the manifest file.
- If a jar file does not specify a main class in its manifest file, hadoop lets us specify the main class on the command line.
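The two rules above can be sketched in a few lines of Java. This is only an illustration of the behaviour I observed, not Hadoop's actual launcher code: the class `MainClassRule` and its `resolve` method are names made up for this post, and leaving the ignored class name in the program arguments is an assumption about how the launcher treats it.

```java
import java.util.Arrays;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper that mimics the observed rules; not part of Hadoop.
final class MainClassRule {

    /** The main class that was chosen, plus the arguments left for the program. */
    static final class Resolved {
        final String mainClass;
        final String[] programArgs;

        private Resolved(String mainClass, String[] programArgs) {
            this.mainClass = mainClass;
            this.programArgs = programArgs;
        }
    }

    /**
     * @param manifest     the jar's manifest, or null if the jar has none
     * @param argsAfterJar everything typed after "hadoop jar some.jar"
     */
    static Resolved resolve(Manifest manifest, String[] argsAfterJar) {
        String fromManifest = null;
        if (manifest != null) {
            fromManifest = manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
        if (fromManifest != null) {
            // Rule 1: the manifest wins. Assumption: a class name given on the
            // command line is not consumed and stays in the program arguments.
            return new Resolved(fromManifest.trim(), argsAfterJar);
        }
        if (argsAfterJar.length == 0) {
            throw new IllegalArgumentException(
                    "no Main-Class in the manifest and none given on the command line");
        }
        // Rule 2: no Main-Class in the manifest, so the first argument names it.
        String[] rest = Arrays.copyOfRange(argsAfterJar, 1, argsAfterJar.length);
        return new Resolved(argsAfterJar[0], rest);
    }
}
```

If the sketch matches the real behaviour, a jar whose manifest names a main class would hand a class name typed on the command line to the program as its first argument, which is an easy mistake to miss.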
In Eclipse, when one exports a project as a runnable jar file, the export wizard always asks for a main class under Launch configuration.
The selected main class is written to the manifest file, META-INF/MANIFEST.MF, as a Main-Class entry. In my helloworld project the main class is set to HelloWorld.
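To check which main class a jar declares without unpacking it, a small Java helper along these lines should do. It uses only the standard `java.util.jar` classes; the class name `ManifestPeek` and the method `mainClassOf` are made up for this sketch.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper for inspecting a jar's manifest.
final class ManifestPeek {

    /** Returns the Main-Class declared in the jar's manifest, or null if there is none. */
    static String mainClassOf(Path jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }
}
```

A `null` result means the jar leaves the choice of main class to the command line.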
One can browse the jar file with an archive tool, open the manifest file in a text editor, delete the last line (the Main-Class entry) to remove the main class configuration, and save the changes back to the jar file when prompted. This produces a jar file without a main class in its manifest.
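The same edit can be done programmatically instead of by hand. The sketch below copies a jar to a new file and drops the Main-Class attribute from the manifest on the way. It is a swapped-in alternative to the manual steps, not what I did for this post; the class name `StripMainClass` is made up, and it assumes an unsigned jar without duplicate entries.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper: writes a copy of a jar whose manifest has no Main-Class.
final class StripMainClass {

    static void strip(Path in, Path out) throws IOException {
        try (JarFile jar = new JarFile(in.toFile())) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                // No manifest at all: start from a minimal one.
                manifest = new Manifest();
                manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
            }
            // Remove only the Main-Class attribute; all other attributes are kept.
            manifest.getMainAttributes().remove(Attributes.Name.MAIN_CLASS);

            try (JarOutputStream jos =
                         new JarOutputStream(Files.newOutputStream(out), manifest)) {
                Enumeration<JarEntry> entries = jar.entries();
                byte[] buffer = new byte[8192];
                while (entries.hasMoreElements()) {
                    JarEntry entry = entries.nextElement();
                    // The manifest was already written by the JarOutputStream constructor.
                    if (entry.getName().equalsIgnoreCase(JarFile.MANIFEST_NAME)) {
                        continue;
                    }
                    // A fresh JarEntry lets the stream recompute sizes and checksums.
                    jos.putNextEntry(new JarEntry(entry.getName()));
                    try (InputStream is = jar.getInputStream(entry)) {
                        int n;
                        while ((n = is.read(buffer)) != -1) {
                            jos.write(buffer, 0, n);
                        }
                    }
                    jos.closeEntry();
                }
            }
        }
    }
}
```

For example, `StripMainClass.strip(Paths.get("hello.jar"), Paths.get("hello-nomain.jar"))` would leave the original untouched and write the modified copy next to it; both file names are placeholders.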
The modified jar file can then be run in Hadoop with a user-supplied main class, as in the sample command below:
$ hadoop jar hello.jar hello.HelloWorld