Posted to user@nutch.apache.org by al...@aim.com on 2011/08/16 22:23:30 UTC
fetcher runs without error with no internet connection
Hello,
After bin/nutch fetch $segment had been running for 2 days, the internet connection was lost, but Nutch did not report any errors. Usually I saw an UnknownHostException before.
Any ideas what happened, and is it OK to stop the fetch and run it again on the same (old) segment? This is Nutch 1.2.
Thanks.
Alex.
Re: fetcher runs without error with no internet connection
Posted by Markus Jelsma <ma...@openindex.io>.
DNS? DSL?
A common way to avoid overloading a DNS server is to host your own DNS server.
BIND is a good choice. You can also try a local DNS-caching server.
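To illustrate why caching takes load off a DNS server, here is a minimal Python sketch of an in-process lookup cache. It is not part of Nutch or BIND; the class name CachingResolver, the 300-second TTL, and the injectable resolve and clock parameters are all assumptions made for the example. A real caching DNS server does the same job for every process on the network.

```python
import socket
import time


class CachingResolver:
    """Tiny in-process DNS cache (illustrative only, not part of Nutch or BIND)."""

    def __init__(self, resolve=socket.gethostbyname, ttl_seconds=300.0,
                 clock=time.monotonic):
        self._resolve = resolve      # function: hostname -> IP address string
        self._ttl = ttl_seconds      # how long a cached answer stays valid
        self._clock = clock          # injectable so expiry can be exercised
        self._cache = {}             # hostname -> (ip, expiry_time)
        self.upstream_lookups = 0    # queries that actually reached the resolver

    def lookup(self, hostname):
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]         # answered from cache, no DNS traffic
        self.upstream_lookups += 1
        ip = self._resolve(hostname)  # raises socket.gaierror if resolution fails
        self._cache[hostname] = (ip, now + self._ttl)
        return ip
```

Repeated lookups of the same host within the TTL cost one upstream query instead of many, which is the effect a local caching server has on a crawl that revisits the same few thousand domains.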
> It is a DNS problem, because it was giving a lot of UnknownHostException
> errors. I decreased the number of threads to 5, but the DSL still fails
> periodically. I wonder what kind of internet connection is typical for
> fetching about 3500 domains. I currently have DSL at 3 Mbps.
>
> Thanks.
> Alex.
Re: fetcher runs without error with no internet connection
Posted by al...@aim.com.
It is a DNS problem, because it was giving a lot of UnknownHostException errors. I decreased the number of threads to 5, but the DSL still fails periodically.
I wonder what kind of internet connection is typical for fetching about 3500 domains. I currently have DSL at 3 Mbps.
Thanks.
Alex.
-----Original Message-----
From: Markus Jelsma <ma...@openindex.io>
To: user <us...@nutch.apache.org>
Sent: Mon, Aug 29, 2011 5:19 pm
Subject: Re: fetcher runs without error with no internet connection
I didn't say you have a DNS problem, only that these exceptions may occur if
the DNS server can't keep up with the requests you make. Make sure you have a
DNS problem before trying to solve a problem that doesn't exist. It's normal to
have these exceptions once in a while.
Solving DNS issues is beyond the scope of this list. You may, however, opt
for some DNS caching in your network.
Re: fetcher runs without error with no internet connection
Posted by Markus Jelsma <ma...@openindex.io>.
I didn't say you have a DNS problem, only that these exceptions may occur if
the DNS server can't keep up with the requests you make. Make sure you have a
DNS problem before trying to solve a problem that doesn't exist. It's normal to
have these exceptions once in a while.
Solving DNS issues is beyond the scope of this list. You may, however, opt
for some DNS caching in your network.
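One way to check whether the exceptions are occasional or systematic is to count them in the fetcher log. The Python sketch below is only an illustration: the function name count_unknown_hosts is made up, and the assumed log wording (a line containing "java.net.UnknownHostException: <host>") should be checked against your own log before trusting the numbers.

```python
import re
from collections import Counter

# Assumed log wording: "... java.net.UnknownHostException: <host>".
# The host part is optional so that bare exception lines are still counted.
UNKNOWN_HOST = re.compile(r"java\.net\.UnknownHostException(?::\s*(\S+))?")


def count_unknown_hosts(lines):
    """Return (number of matching lines, Counter of host names that failed)."""
    failed_lines = 0
    hosts = Counter()
    for line in lines:
        match = UNKNOWN_HOST.search(line)
        if match is None:
            continue
        failed_lines += 1
        if match.group(1):
            hosts[match.group(1)] += 1
    return failed_lines, hosts


# Example use against a log file (the path is an assumption):
# with open("logs/hadoop.log") as log:
#     failed, hosts = count_unknown_hosts(log)
#     print(failed, hosts.most_common(10))
```

A handful of failures spread over many hosts is the normal background noise mentioned above; a large count that covers nearly every host in a time window points at the resolver or the connection rather than at individual sites.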
> What is the solution to the issue with the DNS server?
Re: fetcher runs without error with no internet connection
Posted by al...@aim.com.
What is the solution to the issue with the DNS server?
-----Original Message-----
From: Markus Jelsma <ma...@openindex.io>
To: user <us...@nutch.apache.org>
Sent: Tue, Aug 23, 2011 12:32 pm
Subject: Re: fetcher runs without error with no internet connection
If you fetch too aggressively, your DNS server may not be able to keep up.
Re: fetcher runs without error with no internet connection
Posted by Markus Jelsma <ma...@openindex.io>.
If you fetch too aggressively, your DNS server may not be able to keep up.
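As a rough illustration of how the number of worker threads bounds the DNS load, here is a Python sketch that resolves a list of hosts with a fixed-size thread pool. This is not how the Nutch fetcher is implemented; resolve_all, its max_workers default of 5, and the injectable resolve parameter are assumptions for the example.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve_all(hostnames, max_workers=5, resolve=socket.gethostbyname):
    """Resolve hostnames with at most max_workers lookups in flight.

    Returns a dict mapping each hostname to its IP address, or to None
    when the lookup failed.
    """
    def attempt(host):
        try:
            return host, resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            return host, None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(attempt, hostnames))
```

With max_workers=5 the resolver never sees more than five simultaneous queries from this process, however long the host list is. Raising the value increases throughput only until the DNS server or the link behind it becomes the bottleneck.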
> Hi Lewis,
>
> I stopped the fetcher and started it on the same segment again.
> But before doing that I turned off the modem, and the fetcher started giving
> UnknownHostException. During the DSL failure it had not given any errors,
> even though I was not able to connect to any sites. Again, this is Nutch 1.2.
>
> Thanks.
> Alex.
Re: fetcher runs without error with no internet connection
Posted by al...@aim.com.
Hi Lewis,
I stopped the fetcher and started it on the same segment again.
But before doing that I turned off the modem, and the fetcher started giving UnknownHostException.
During the DSL failure it had not given any errors, even though I was not able to connect to any sites. Again, this is Nutch 1.2.
Thanks.
Alex.
-----Original Message-----
From: lewis john mcgibbney <le...@gmail.com>
To: user <us...@nutch.apache.org>
Sent: Tue, Aug 23, 2011 6:37 am
Subject: Re: fetcher runs without error with no internet connection
Hi Alex,
Did you get anywhere with this?
What conditions led to you seeing the UnknownHostException?
Unless the segment got corrupted, I would assume you can fetch again.
Hopefully you can confirm this.
Re: fetcher runs without error with no internet connection
Posted by lewis john mcgibbney <le...@gmail.com>.
Hi Alex,
Did you get anywhere with this?
What conditions led to you seeing the UnknownHostException?
Unless the segment got corrupted, I would assume you can fetch again.
Hopefully you can confirm this.
On Tue, Aug 16, 2011 at 9:23 PM, <al...@aim.com> wrote:
> Hello,
>
> After bin/nutch fetch $segment had been running for 2 days, the internet
> connection was lost, but Nutch did not report any errors. Usually I saw an
> UnknownHostException before.
> Any ideas what happened, and is it OK to stop the fetch and run it again on
> the same (old) segment? This is Nutch 1.2.
>
> Thanks.
> Alex.
>
--
*Lewis*