Posted to common-user@hadoop.apache.org by Nikolaos Romanos Katsipoulakis <po...@gmail.com> on 2012/06/05 08:20:26 UTC
Web Service Interface for triggering a Hadoop Job
Hello everybody.
I want to trigger the execution of an ItemSimilarityJob (Mahout 0.7
snapshot) from a web service interface. To do that, I want to implement
a class that holds an ItemSimilarityJob object and invokes that object's
run method whenever a WS request arrives. Is this possible, and if so,
how is it done?
I am posting the code that I have written below:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob;

public class Main {
    public static void main(String[] args) throws IOException {
        // Load the cluster configuration files.
        Configuration jobConf = new Configuration();
        jobConf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        jobConf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        jobConf.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));
        ItemSimilarityJob myJob = new ItemSimilarityJob();
        // Arguments handed to the job.
        String[] args1 = { "-Dmapred.input.dir=input/input.txt",
                "-Dmapred.output.dir=output", "--similarityClassname",
                "SIMILARITY_COOCCURRENCE" };
        try {
            myJob.main(args1);
        } catch (Exception e) {
            System.err.println(e.getMessage());
        }
    }
}
The output I get is:
Jun 5, 2012 9:14:46 AM org.apache.mahout.common.AbstractJob parseArguments
SEVERE: Unexpected mapred.output.dir=output while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
Generic Options:
 -archives <paths>              comma separated archives to be unarchived
                                on the compute machines.
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -files <paths>                 comma separated files to be copied to the
                                map reduce cluster
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -libjars <paths>               comma separated jar files to include in
                                the classpath.
 -tokenCacheFile <tokensFile>   name of the file with the tokens
Unexpected mapred.output.dir=output while processing Job-Specific Options:
Usage:
 [--input <input> --output <output>
 --similarityClassname <similarityClassname>
 --maxSimilaritiesPerItem <maxSimilaritiesPerItem>
 --maxPrefsPerUser <maxPrefsPerUser> --minPrefsPerUser <minPrefsPerUser>
 --booleanData <booleanData> --threshold <threshold> --help
 --tempDir <tempDir> --startPhase <startPhase> --endPhase <endPhase>]
Job-Specific Options:
  --input (-i) input                              Path to job input directory.
  --output (-o) output                            The directory pathname for output.
  --similarityClassname (-s) similarityClassname  Name of distributed similarity
                                                  measures class to instantiate,
                                                  alternatively use one of the
                                                  predefined similarities
                                                  ([SIMILARITY_COOCCURRENCE,
                                                  SIMILARITY_LOGLIKELIHOOD,
                                                  SIMILARITY_TANIMOTO_COEFFICIENT,
                                                  SIMILARITY_CITY_BLOCK,
                                                  SIMILARITY_COSINE,
                                                  SIMILARITY_PEARSON_CORRELATION,
                                                  SIMILARITY_EUCLIDEAN_DISTANCE])
  --maxSimilaritiesPerItem (-m) maxSimilaritiesPerItem
                                                  try to cap the number of similar
                                                  items per item to this number
                                                  (default: 100)
  --maxPrefsPerUser (-mppu) maxPrefsPerUser       max number of preferences to
                                                  consider per user, users with
                                                  more preferences will be sampled
                                                  down (default: 1000)
  --minPrefsPerUser (-mp) minPrefsPerUser         ignore users with less
                                                  preferences than this
                                                  (default: 1)
  --booleanData (-b) booleanData                  Treat input as without pref
                                                  values
  --threshold (-tr) threshold                     discard item pairs with a
                                                  similarity value below this
  --help (-h)                                     Print out help
  --tempDir tempDir                               Intermediate output directory
  --startPhase startPhase                         First phase to run
  --endPhase endPhase                             Last phase to run
Why do I get the above output?
Thank you in advance.
Nick K.
Re: Web Service Interface for triggering a Hadoop Job
Posted by Nitin Pawar <ni...@gmail.com>.
You may want to ask this on the Mahout user group.
Jun 5, 2012 9:14:46 AM org.apache.mahout.common.AbstractJob parseArguments
SEVERE: Unexpected mapred.output.dir=output while processing Job-Specific Options:
This looks like a command-line argument parsing error.
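A minimal sketch of one possible way around it, assuming the job accepts the --input and --output options listed in its own usage text in place of the -Dmapred.input.dir and -Dmapred.output.dir properties. ItemSimilarityTrigger, JobRunner, buildArgs and trigger are hypothetical names, not part of Hadoop or Mahout, and the runner used in main only prints the arguments so that the example runs without a cluster.

```java
public class ItemSimilarityTrigger {

    /**
     * Hypothetical seam around the actual job launch. A real implementation
     * would wrap the Hadoop/Mahout call; it is kept abstract here so the
     * sketch compiles without those libraries.
     */
    interface JobRunner {
        int run(String[] args) throws Exception;
    }

    /** Builds the job-specific options named in the usage text. */
    static String[] buildArgs(String input, String output, String similarity) {
        return new String[] {
            "--input", input,
            "--output", output,
            "--similarityClassname", similarity
        };
    }

    /**
     * What a web service handler could call once per request.
     * Returns the runner's exit code, or -1 if the runner throws.
     */
    static int trigger(JobRunner runner, String input, String output,
                       String similarity) {
        try {
            return runner.run(buildArgs(input, output, similarity));
        } catch (Exception e) {
            System.err.println("Job failed: " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        // Stand-in runner: prints the arguments instead of starting a job.
        JobRunner printOnly = jobArgs -> {
            System.out.println(String.join(" ", jobArgs));
            return 0;
        };
        int exit = trigger(printOnly, "input/input.txt", "output",
                "SIMILARITY_COOCCURRENCE");
        System.out.println("exit code: " + exit);
    }
}
```

With Hadoop and Mahout on the classpath, the stand-in could be replaced by a runner whose body is ToolRunner.run(new Configuration(), new ItemSimilarityJob(), jobArgs), and a web service handler could then call trigger(...) for each request. This is untested against the 0.7 snapshot.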
--
Nitin Pawar