Posted to user@nutch.apache.org by webdev1977 <we...@gmail.com> on 2011/08/29 16:58:59 UTC

SSHD for Nutch 1.3 in Pseudo Distributed mode

Do I NEED SSHD for Nutch 1.3 in Pseudo Distributed mode? 

I am running on a Windows server using Cygwin (obviously :-)

I cannot get Hadoop/Nutch to run in deploy mode, and I am not sure whether it
has something to do with ssh or not.  When I run start-all.sh it gives me some
ssh usage errors and also says it is starting the jobtracker and namenode.

In the hadoop log it complains about not being able to write file:
hdfs://localhost:9000/cygdrive/r/EnterpriseSearch/hadoop/mapreduce/system/jobtracker.info.

I have configured core-site.xml, hdfs-site.xml and mapred-site.xml.
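
For reference, a pseudo-distributed Hadoop 0.20.x setup usually needs only a
handful of properties across those three files. The sketch below is an
illustration, not the poster's actual configuration: the helper site_xml and
the PSEUDO_DISTRIBUTED table are hypothetical names, and every value except
hdfs://localhost:9000 (taken from the log path above) is an assumption based
on the usual single-node layout.

```python
import xml.etree.ElementTree as ET


def site_xml(properties):
    """Render a Hadoop *-site.xml document from a dict of name -> value."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumed values for a single-node, pseudo-distributed setup.
# Only the NameNode URI (localhost:9000) comes from the thread itself.
PSEUDO_DISTRIBUTED = {
    "core-site.xml": {"fs.default.name": "hdfs://localhost:9000"},
    "hdfs-site.xml": {"dfs.replication": "1"},
    "mapred-site.xml": {"mapred.job.tracker": "localhost:9001"},
}

if __name__ == "__main__":
    for filename, props in PSEUDO_DISTRIBUTED.items():
        print(f"<!-- {filename} -->")
        print(site_xml(props))
```

Comparing the generated output with the files in the conf directory is a
quick way to spot a missing or misspelled property name.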



--
View this message in context: http://lucene.472066.n3.nabble.com/SSHD-for-Nutch-1-3-in-Pseudo-Distributed-mode-tp3292907p3292907.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: SSHD for Nutch 1.3 in Pseudo Distributed mode

Posted by lewis john mcgibbney <le...@gmail.com>.
In general the Cygwin active users/devs are few and far between on this list
(I think)

You can try this tutorial and see where you get, but I fear that if you run
into problems the help you receive may be limited

http://hayesdavis.net/2008/06/14/running-hadoop-on-windows/


On Thu, Sep 1, 2011 at 8:10 PM, webdev1977 <we...@gmail.com> wrote:

> Is it generally not recommended that cygwin is used to run hadoop?  There is
> no way I am getting a linux box :-(



-- 
*Lewis*

Re: SSHD for Nutch 1.3 in Pseudo Distributed mode

Posted by webdev1977 <we...@gmail.com>.
Is it generally not recommended that cygwin is used to run hadoop?  There is
no way I am getting a linux box :-(


Re: SSHD for Nutch 1.3 in Pseudo Distributed mode

Posted by Markus Jelsma <ma...@openindex.io>.
Indeed. Might be a Cygwin thing.

> That shouldn't be happening. You would be best to run some Hadoop
> processing on some other data to ensure that your Hadoop
> installation/configuration is working correctly.
> 
> On Thu, Sep 1, 2011 at 7:33 PM, webdev1977 <we...@gmail.com> wrote:
> > I FINALLY got sshd to work.  Turns out I had a bum installation of cygwin
> > and openssh.  I figured as much when I would run ssh <machine name> and all
> > it would do is give me the usage statement!
> > 
> > Now if I could just get the job to run in hadoop :-(.  It is stuck, and
> > has been on this:
> > 
> > INFO mapred.JobClient: map 0% reduce 0%
> > 
> > It is trying to fetch ONE URL and has been stuck like this for hours now.
> > Funny thing is that the hadoop logs look like it is doing *something*; it
> > is just horribly slow!

Re: SSHD for Nutch 1.3 in Pseudo Distributed mode

Posted by lewis john mcgibbney <le...@gmail.com>.
That shouldn't be happening. You would be best to run some Hadoop
processing on some other data to ensure that your Hadoop
installation/configuration is working correctly.
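
One way to follow this advice is to run the small "pi" job that ships in the
Hadoop examples jar, which exercises the jobtracker and tasktracker without
involving Nutch at all. The sketch below only wraps that command line; the
function names are hypothetical, the jar path is an assumption (its file name
varies between Hadoop releases), and it assumes a Python that can execute the
bin/hadoop shell script, e.g. Cygwin's own.

```python
import subprocess


def pi_job_command(hadoop_bin, examples_jar, maps=2, samples=10):
    """Build the argument list for the 'pi' job in the Hadoop examples jar."""
    return [hadoop_bin, "jar", examples_jar, "pi", str(maps), str(samples)]


def run_smoke_test(hadoop_bin, examples_jar):
    """Run the pi example and return True when it exits with status 0."""
    result = subprocess.run(pi_job_command(hadoop_bin, examples_jar))
    return result.returncode == 0


if __name__ == "__main__":
    # Both paths are placeholders; adjust them to the local installation.
    ok = run_smoke_test("bin/hadoop", "hadoop-examples.jar")
    print("Hadoop smoke test passed" if ok else "Hadoop smoke test failed")
```

If this tiny job also sits at "map 0% reduce 0%", the problem is most likely
in the Hadoop setup rather than in Nutch.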

On Thu, Sep 1, 2011 at 7:33 PM, webdev1977 <we...@gmail.com> wrote:

> I FINALLY got sshd to work.  Turns out I had a bum installation of cygwin
> and openssh.  I figured as much when I would run ssh <machine name> and all
> it would do is give me the usage statement!
>
> Now if I could just get the job to run in hadoop :-(.  It is stuck, and has
> been on this:
>
> INFO mapred.JobClient: map 0% reduce 0%
>
> It is trying to fetch ONE URL and has been stuck like this for hours now.
> Funny thing is that the hadoop logs look like it is doing *something*; it
> is just horribly slow!



-- 
*Lewis*

Re: SSHD for Nutch 1.3 in Pseudo Distributed mode

Posted by webdev1977 <we...@gmail.com>.
I FINALLY got sshd to work.  Turns out I had a bum installation of cygwin and
openssh.  I figured as much when I would run ssh <machine name> and all it
would do is give me the usage statement!

Now if I could just get the job to run in hadoop :-(.  It is stuck, and has
been on this:

INFO mapred.JobClient: map 0% reduce 0%

It is trying to fetch ONE URL and has been stuck like this for hours now.
Funny thing is that the hadoop logs look like it is doing *something*; it is
just horribly slow!



Re: SSHD for Nutch 1.3 in Pseudo Distributed mode

Posted by Markus Jelsma <ma...@openindex.io>.
It's a Hadoop question indeed. I'm also not sure if ssh is a requirement for a
pseudo-distributed environment. But why not install it anyway? Having sshd
doesn't hurt and it's always a convenience; I can't think of any machine
without sshd ;)

> If it complains about SSH errors then I would make sure that you can log in
> over SSH, e.g. ssh -v localhost, prior to executing any hadoop scripts. This
> would make sense.
> 
> Further to this, unless you are actually experiencing Nutch related
> problems on a pseudo or cluster setup then probably the best place to go
> is the hadoop user lists. This is only a thought, but it would make most
> sense.
> 
> On Mon, Aug 29, 2011 at 3:58 PM, webdev1977 <we...@gmail.com> wrote:
> > Do I NEED SSHD for Nutch 1.3 in Pseudo Distributed mode?
> > 
> > I am running on a Windows server using Cygwin (obviously :-)
> > 
> > I cannot get Hadoop/Nutch to run in deploy mode, and I am not sure whether
> > it has something to do with ssh or not.  When I run start-all.sh it gives
> > me some ssh usage errors and also says it is starting the jobtracker and
> > namenode.
> > 
> > In the hadoop log it complains about not being able to write file:
> > hdfs://localhost:9000/cygdrive/r/EnterpriseSearch/hadoop/mapreduce/system/jobtracker.info.
> > 
> > I have configured core-site.xml, hdfs-site.xml and mapred-site.xml.

Re: SSHD for Nutch 1.3 in Pseudo Distributed mode

Posted by lewis john mcgibbney <le...@gmail.com>.
If it complains about SSH errors then I would make sure that you can log in
over SSH, e.g. ssh -v localhost, prior to executing any hadoop scripts. This
would make sense.
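
As a complement to the manual ssh -v localhost check, a small script can
confirm that something is actually listening on the relevant ports before
start-all.sh is run. This is only a sketch: port_open is a hypothetical
helper, port 22 is assumed to be the sshd port (the default), and 9000 is the
NameNode port that appears in the log path quoted in this thread.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports: 22 for sshd, 9000 for the NameNode.
    for name, port in (("sshd", 22), ("namenode", 9000)):
        state = "reachable" if port_open("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port} is {state}")
```

A successful connection only shows that a daemon is listening; it does not
prove that key-based login works, so ssh -v localhost is still worth running.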

Further to this, unless you are actually experiencing Nutch related problems
on a pseudo or cluster setup then probably the best place to go is the
hadoop user lists. This is only a thought, but it would make most sense.



On Mon, Aug 29, 2011 at 3:58 PM, webdev1977 <we...@gmail.com> wrote:

> Do I NEED SSHD for Nutch 1.3 in Pseudo Distributed mode?
>
> I am running on a Windows server using Cygwin (obviously :-)
>
> I cannot get Hadoop/Nutch to run in deploy mode, and I am not sure whether
> it has something to do with ssh or not.  When I run start-all.sh it gives me
> some ssh usage errors and also says it is starting the jobtracker and
> namenode.
>
> In the hadoop log it complains about not being able to write file:
> hdfs://localhost:9000/cygdrive/r/EnterpriseSearch/hadoop/mapreduce/system/jobtracker.info.
>
> I have configured core-site.xml, hdfs-site.xml and mapred-site.xml.



-- 
*Lewis*