Posted to user@nutch.apache.org by Jesse Hires <jh...@gmail.com> on 2009/12/24 19:30:35 UTC

Is there a way to trim unfetched URLs?

After letting my setup run for a while, I have quite the queue of unfetched
URLs, on the order of 10:1 unfetched to fetched.

Is there a way to trim the lowest-scoring unfetched URLs from Nutch?


Jesse

int GetRandomNumber()
{
   return 4; // Chosen by fair roll of dice
                // Guaranteed to be random
} // xkcd.com

Re: Is there a way to trim unfetched URLs?

Posted by Futebol DotInfo <fu...@yahoo.com>.
unsubscribe

--- On Thu, 12/24/09, Jesse Hires <jh...@gmail.com> wrote:

From: Jesse Hires <jh...@gmail.com>
Subject: Is there a way to trim unfetched URLs?
To: nutch-user@lucene.apache.org
Date: Thursday, December 24, 2009, 10:30 AM