Posted to user@nutch.apache.org by al...@aim.com on 2011/08/16 22:23:30 UTC

fetcher runs without error with no internet connection

Hello,

After running bin/nutch fetch $segment for 2 days, the internet connection was lost, but Nutch did not report any errors. Previously I would usually see an UnknownHostException.
Any ideas what happened, and is it OK to stop the fetch and run it again on the same (old) segment? This is Nutch 1.2.

Thanks.
Alex.

Re: fetcher runs without error with no internet connection

Posted by Markus Jelsma <ma...@openindex.io>.
DNS? DSL? 

A common way to avoid overloading a DNS server is to run your own DNS
server. BIND is a good choice. You can also try a local DNS caching server.
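
To illustrate why a cache takes load off the upstream DNS server, here is a minimal Python sketch of an in-process lookup cache. It is not how BIND or Nutch work; the class name `CachingResolver`, the `ttl` default and the injectable `resolve` and `clock` parameters are all made up for this example.

```python
import socket
import time


class CachingResolver:
    """Tiny in-process DNS cache (illustration only, hypothetical helper)."""

    def __init__(self, resolve=socket.gethostbyname, ttl=300.0,
                 clock=time.monotonic):
        self._resolve = resolve  # function that performs the real lookup
        self._ttl = ttl          # seconds a cached answer stays valid (assumed value)
        self._clock = clock      # injectable clock, handy for testing
        self._cache = {}         # host -> (address, expiry time)
        self.lookups = 0         # number of real upstream lookups performed

    def lookup(self, host):
        now = self._clock()
        entry = self._cache.get(host)
        if entry is not None and entry[1] > now:
            return entry[0]      # still fresh: no upstream query needed
        self.lookups += 1
        address = self._resolve(host)
        self._cache[host] = (address, now + self._ttl)
        return address
```

With a cache like this in front of the resolver, fetching many pages from the same host costs one upstream lookup per host per TTL window instead of one per request.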

> It is a DNS problem, because it was giving a lot of UnknownHostException
> errors. I decreased the number of threads to 5, but the DSL still fails
> periodically. I wondered what kind of internet connection is common for
> fetching about 3500 domains. I currently have DSL at 3 Mbps.
> 
> Thanks.
> Alex.

Re: fetcher runs without error with no internet connection

Posted by al...@aim.com.
It is a DNS problem, because it was giving a lot of UnknownHostException errors. I decreased the number of threads to 5, but the DSL still fails periodically.
I wondered what kind of internet connection is common for fetching about 3500 domains. I currently have DSL at 3 Mbps.

Thanks.
Alex.

 

-----Original Message-----
From: Markus Jelsma <ma...@openindex.io>
To: user <us...@nutch.apache.org>
Sent: Mon, Aug 29, 2011 5:19 pm
Subject: Re: fetcher runs without error with no internet connection


I didn't say you have a DNS problem, only that these exceptions may occur if the
DNS can't keep up with the requests you make. Make sure you have a DNS problem
before trying to solve a problem that doesn't exist. It's normal to have these
exceptions once in a while.

Solving DNS issues is beyond the scope of this list. You may, however, opt
for some DNS caching in your network.


Re: fetcher runs without error with no internet connection

Posted by Markus Jelsma <ma...@openindex.io>.
I didn't say you have a DNS problem, only that these exceptions may occur if the
DNS can't keep up with the requests you make. Make sure you have a DNS problem
before trying to solve a problem that doesn't exist. It's normal to have these
exceptions once in a while.

Solving DNS issues is beyond the scope of this list. You may, however, opt
for some DNS caching in your network.
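
One way to check whether name resolution is actually failing, before changing the DNS setup, is to resolve a sample of the crawled hosts outside the crawler and count the failures. A minimal Python sketch, assuming a hypothetical helper `check_dns` and a host list you supply yourself (neither is part of Nutch):

```python
import socket
import time


def check_dns(hosts, resolve=socket.gethostbyname):
    """Try to resolve each host; return (failures, elapsed_seconds).

    `failures` is a list of (host, error message) pairs. The `resolve`
    parameter exists only so a fake resolver can be swapped in for testing.
    """
    failures = []
    start = time.monotonic()
    for host in hosts:
        try:
            resolve(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            failures.append((host, str(exc)))
    return failures, time.monotonic() - start
```

Calling `check_dns` on a few hundred hostnames from the segment while the fetcher is running gives a rough picture: if almost everything resolves quickly, the occasional exception is probably normal; if many lookups fail or the run is very slow, the resolver may be the bottleneck.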

> What is the solution to the issue with the DNS server?

Re: fetcher runs without error with no internet connection

Posted by al...@aim.com.
What is the solution to the issue with the DNS server?

 

 

-----Original Message-----
From: Markus Jelsma <ma...@openindex.io>
To: user <us...@nutch.apache.org>
Sent: Tue, Aug 23, 2011 12:32 pm
Subject: Re: fetcher runs without error with no internet connection


If you fetch too aggressively, your DNS server may not be able to keep up.


Re: fetcher runs without error with no internet connection

Posted by Markus Jelsma <ma...@openindex.io>.
If you fetch too aggressively, your DNS server may not be able to keep up.

> Hi Lewis,
> 
> I stopped the fetcher and started it on the same segment again.
> But before doing that I turned off the modem, and the fetcher started giving
> UnknownHostException. During the DSL failure it was not giving any error,
> even though I was not able to connect to any sites. Again, this is Nutch 1.2.
> 
> Thanks.
> Alex.

Re: fetcher runs without error with no internet connection

Posted by al...@aim.com.
Hi Lewis,

I stopped the fetcher and started it on the same segment again.
But before doing that I turned off the modem, and the fetcher started giving UnknownHostException.
During the DSL failure it was not giving any error, even though I was not able to connect to any sites. Again, this is Nutch 1.2.

Thanks.
Alex.

 

 

-----Original Message-----
From: lewis john mcgibbney <le...@gmail.com>
To: user <us...@nutch.apache.org>
Sent: Tue, Aug 23, 2011 6:37 am
Subject: Re: fetcher runs without error with no internet connection


Hi Alex,

Did you get anywhere with this?

What conditions led to you seeing the unknown host exception?

Unless the segment got corrupted, I would assume you could fetch again.
Hopefully you can confirm this.


-- 
*Lewis*

 

Re: fetcher runs without error with no internet connection

Posted by lewis john mcgibbney <le...@gmail.com>.
Hi Alex,

Did you get anywhere with this?

What conditions led to you seeing the unknown host exception?

Unless the segment got corrupted, I would assume you could fetch again.
Hopefully you can confirm this.

On Tue, Aug 16, 2011 at 9:23 PM, <al...@aim.com> wrote:

> Hello,
>
> After running bin/nutch fetch $segment for 2 days, the internet connection was
> lost, but Nutch did not report any errors. Previously I would usually see an
> UnknownHostException.
> Any ideas what happened, and is it OK to stop the fetch and run it again on
> the same (old) segment? This is Nutch 1.2.
>
> Thanks.
> Alex.
>



-- 
*Lewis*