Posted to user@nutch.apache.org by cha <ch...@metrixline.com> on 2007/04/19 13:36:19 UTC

Cannot crawl from Server

Hi,

I am able to successfully crawl 700+ URLs of my company from my local
installation of Nutch. But when the same configuration is used on the
server, to my surprise it crawls only one URL.

I don't know what is happening here. Is there something I need to set up in
nutch-site.xml? Does an entry need to be made in the /etc/hosts file on the
server? Does it make any difference for crawling?

Please shed some light on this.

Awaiting,
Cha

-- 
View this message in context: http://www.nabble.com/Cannot-crawl-from-Server-tf3606715.html#a10076512
Sent from the Nutch - User mailing list archive at Nabble.com.


RE: Cannot crawl from Server

Posted by Gal Nitzan <ga...@gmail.com>.
Obviously something has changed. Maybe environment variables or file 
permissions or missing config?

I usually just copy the whole folder, but my test/dev environment is also
Linux.
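
One way to check for a missing or changed config is to compare the two
conf folders file by file. The following is only a minimal sketch in
Python, assuming both copies are reachable as local directories (for
example after copying the server's conf folder down). The function name
compare_conf_dirs and the paths in the usage note are hypothetical and
not part of Nutch.

```python
import filecmp
import os


def compare_conf_dirs(local_conf, server_conf):
    """Compare the regular files directly inside two conf directories.

    Returns a dict of sorted file-name lists:
      only_local  - present only in local_conf
      only_server - present only in server_conf
      differing   - present in both, but contents differ
      unreadable  - present in both, but could not be compared
                    (for example because of file permissions)
    """
    def list_files(directory):
        # Top-level regular files only; subdirectories are ignored.
        return {
            name
            for name in os.listdir(directory)
            if os.path.isfile(os.path.join(directory, name))
        }

    local_files = list_files(local_conf)
    server_files = list_files(server_conf)
    common = sorted(local_files & server_files)

    # shallow=False forces a byte-by-byte comparison instead of
    # trusting matching size and modification time.
    _, mismatch, errors = filecmp.cmpfiles(
        local_conf, server_conf, common, shallow=False
    )

    return {
        "only_local": sorted(local_files - server_files),
        "only_server": sorted(server_files - local_files),
        "differing": sorted(mismatch),
        "unreadable": sorted(errors),
    }
```

Calling compare_conf_dirs("nutch-local/conf", "nutch-server/conf") with
the two (hypothetical) folder paths returns the four lists. Any file
that shows up under "only_local", "differing", or "unreadable" is a
candidate for the culprit. Environment variables are not covered by
this sketch and would need to be compared separately.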

I think you should just keep on digging until you find the culprit :(

Gal.


> -----Original Message-----
> From: cha [mailto:chandresh.rana@metrixline.com]
> Sent: Thursday, April 19, 2007 2:36 PM
> To: nutch-user@lucene.apache.org
> Subject: Cannot crawl from Server
>
>
> Hi,
>
> I am able to successfully crawl 700+ URLs of my company from my local
> installation of Nutch. But when the same configuration is used on the
> server, to my surprise it crawls only one URL.
>
> I don't know what is happening here. Is there something I need to set up in
> nutch-site.xml? Does an entry need to be made in the /etc/hosts file on the
> server? Does it make any difference for crawling?
>
> Please shed some light on this.
>
> Awaiting,
> Cha