Posted to common-user@hadoop.apache.org by Pankil Doshi <fo...@gmail.com> on 2009/06/22 22:13:23 UTC

Hadoop Vaidya tool

Hello ,

I am trying to use the Hadoop Vaidya tool, which is available with version 0.20.0,
but I see the following error. Can anyone guide me on that? I have a pseudo-distributed
cluster, i.e. a single-node cluster, for testing.

*The command I submit is:*

sh /home/hadoop/Desktop/hadoop-0.20.0/contrib/vaidya/bin/vaidya.sh -jobconf \
  hdfs://localhost:9000/logs/job_200906221335_0001_conf.xml -joblog \
  hdfs://localhost:9000/logs/

*Error:*
Exception: java.net.MalformedURLException: unknown protocol: hdfs
java.net.MalformedURLException: unknown protocol: hdfs
    at java.net.URL.<init>(URL.java:590)
    at java.net.URL.<init>(URL.java:480)
    at java.net.URL.<init>(URL.java:429)
    at org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.readJobInformation(PostExPerformanceDiagnoser.java:124)
    at org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.<init>(PostExPerformanceDiagnoser.java:112)
    at org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser.main(PostExPerformanceDiagnoser.java:220)

Can anyone guide me on this?

Regards
Pankil

Re: Hadoop Vaidya tool

Posted by Vitthal Gogate <go...@yahoo-inc.com>.
Hello Pankil, -joblog should also be a specific job history file path, not a
directory. Usually I copy the job conf XML file and the job history log file to
the local file system and then use the file:// protocol (although hdfs:// should
also work), e.g.:

sh /home/hadoop/Desktop/hadoop-0.20.0/contrib/vaidya/bin/vaidya.sh -jobconf \
  file://localhost/logs/job_200906221335_0001_conf.xml -joblog \
  file://localhost/logs/job_200906221335_0001_jobxxx
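
For what it's worth, here is a minimal sketch of that copy-then-analyze workflow.
The /tmp/vaidya directory and the history file name are just illustrative
placeholders based on the example above:

  # copy the job conf and job history file out of HDFS to the local file system
  hadoop fs -get hdfs://localhost:9000/logs/job_200906221335_0001_conf.xml /tmp/vaidya/
  hadoop fs -get hdfs://localhost:9000/logs/job_200906221335_0001_jobxxx /tmp/vaidya/

  # run Vaidya against the local copies using file:// URLs
  sh $HADOOP_HOME/contrib/vaidya/bin/vaidya.sh \
    -jobconf file:///tmp/vaidya/job_200906221335_0001_conf.xml \
    -joblog  file:///tmp/vaidya/job_200906221335_0001_jobxxx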

I discovered a few problems with the tool in Hadoop 0.20 for some specific
scenarios, such as map-only jobs. The following JIRAs fix the problems.

If you download the latest Hadoop (trunk), then HADOOP-5582 is already part of it;
otherwise, with Hadoop 0.20, you can apply the following JIRAs in sequence:

https://issues.apache.org/jira/browse/HADOOP-5582
https://issues.apache.org/jira/browse/HADOOP-5950

1. Hadoop Vaidya being a standalone tool, you may not need to change your
existing installed version of Hadoop. Instead, separately download the Hadoop
trunk, apply patch HADOOP-5950, rebuild, and replace the
$HADOOP_HOME/contrib/vaidya/hadoop-0.20.0-vaidya.jar file in your existing
Hadoop 0.20 installation with the newly built one.
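
A rough sketch of those steps, assuming a trunk checkout already exists in
hadoop-trunk and the HADOOP-5950 patch has been saved locally. The patch file
name, ant target, and build output path below are assumptions rather than
verified specifics:

  cd hadoop-trunk
  # apply the Vaidya fix; the patch file name here is illustrative
  patch -p0 < HADOOP-5950.patch
  # rebuild (trunk built with ant at the time; the default target is assumed here)
  ant
  # replace the jar shipped with the existing 0.20 installation with the new build
  # (the build output location is an assumption; check your build directory)
  cp build/contrib/vaidya/hadoop-*-vaidya.jar \
     $HADOOP_HOME/contrib/vaidya/hadoop-0.20.0-vaidya.jar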

2. Also, if you have a big job (i.e. lots of map/reduce tasks), you may face an
out-of-memory problem while analyzing it. In that case you can edit
$HADOOP_HOME/contrib/vaidya/bin/vaidya.sh and add the -Xmx1024m option on the
java command line, before the classpath.
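
As a hedged illustration (the variable names below are placeholders, not the
exact contents of vaidya.sh; the driver class name is the one shown in the
stack trace above), the edited java invocation would look roughly like:

  # inside $HADOOP_HOME/contrib/vaidya/bin/vaidya.sh: add -Xmx1024m before the classpath
  java -Xmx1024m -classpath ${hadoopCoreJar}:${vaidyaJar} \
      org.apache.hadoop.vaidya.postexdiagnosis.PostExPerformanceDiagnoser "$@"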

Hope it helps

Thanks & Regards, Suhas



--Regards Suhas
[Getting stated w/ Grid]
http://twiki.corp.yahoo.com/view/GridDocumentation/GridDocAbout
[Search HADOOP/PIG Information]
http://ucdev20.yst.corp.yahoo.com/griduserportal/griduserportal.php