Posted to common-user@hadoop.apache.org by prithvi dammalapati <d....@gmail.com> on 2013/04/22 19:11:34 UTC

Hadoop Streaming job error - Need help urgent

I have the following shell script, which runs a Hadoop job to find the
betweenness centrality of a graph:

    java_home=/usr/lib/jvm/java-1.7.0-openjdk-amd64
    hadoop_home=/usr/local/hadoop/hadoop-1.0.4
    hadoop_lib=$hadoop_home/hadoop-core-1.0.4.jar
    hadoop_bin=$hadoop_home/bin/hadoop
    hadoop_config=$hadoop_home/conf
    hadoop_streaming=$hadoop_home/contrib/streaming/hadoop-streaming-1.0.4.jar

    # task-specific parameters
    source_code=BetweennessCentrality.java
    jar_file=BetweennessCentrality.jar
    main_class=mslab.BetweennessCentrality
    num_of_node=38012
    num_of_mapper=100
    num_of_reducer=8
    input_path=/data/dblp_author_conf_adj.txt
    output_path=dblp_bc_N$((num_of_node))_M$((num_of_mapper))

    rm -rf build
    mkdir build
    $java_home/bin/javac -d build -classpath .:$hadoop_lib src/mslab/$source_code
    rm -f $jar_file
    $java_home/bin/jar -cf $jar_file -C build/ .

    $hadoop_bin --config $hadoop_config fs -rmr $output_path
    $hadoop_bin --config $hadoop_config jar $jar_file $main_class $num_of_node $num_of_mapper

    rm brandes_mapper
    g++ src/mslab/mapred_brandes.cpp -O3 -o brandes_mapper

    $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
        -D mapred.task.timeout=0 \
        -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
        -D mapred.reduce.tasks=$num_of_reducer \
        -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
        -output $output_path \
        -file brandes_mapper \
        -file src/mslab/BC_reducer.py \
        -file src/mslab/MapReduceUtil.py \
        -file input_path \
        -mapper "./brandes_mapper $input_path $num_of_node" \
        -reducer "./BC_reducer.py"
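As a side note on shell expansion (a minimal standalone sketch, not part of the
job itself, with an example path): an argument written without a leading `$` is
passed as a literal string rather than the variable's value, which matters for
flags like `-file` in the streaming command above.

```shell
# Minimal sketch: without "$", the shell passes the name itself, not the value.
input_path=/data/dblp_author_conf_adj.txt

echo "-file input_path"     # prints: -file input_path
echo "-file $input_path"    # prints: -file /data/dblp_author_conf_adj.txt
```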

When I run this script, I get the following errors:

    Warning: $HADOOP_HOME is deprecated.
    File: /home/hduser/Downloads/mgmf/trunk/input_path does not exist, or is not readable.
    Streaming Command Failed!
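Each `-file` argument must name a readable file on the local filesystem of the
machine submitting the job. A hypothetical pre-flight check (the filenames are
copied from the streaming command above) could be run before submission:

```shell
# Hypothetical pre-flight check: report any -file argument that is not
# a readable local file before submitting the streaming job.
for f in brandes_mapper src/mslab/BC_reducer.py src/mslab/MapReduceUtil.py; do
    [ -r "$f" ] || echo "missing or unreadable: $f"
done
```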

but the file exists at the specified path:

    /Downloads/mgmf/trunk/data$ ls
    dblp_author_conf_adj.txt

I have also added the input file into HDFS using

    /usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /source /destination

Can someone help me solve this problem?


Any help is appreciated,
Thanks
Prithvi