Posted to dev@nutch.apache.org by Bill Goffe <go...@Oswego.EDU> on 2005/04/22 23:19:31 UTC

Re: [Nutch-dev] Re: How to manage fetching?

This brings to mind a minor suggestion -- rather than topN, why not have
the top percentage? Each time I use topN I think in terms of a
percentage of sites. Seems easier to have the machine do such a simple
calculation...
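
To make the arithmetic concrete, here is a minimal sketch of that
calculation in Python. It is not Nutch code: the function name and the
idea of passing in a total count of candidate URLs are assumptions made
for illustration only.

```python
import math


def top_n_from_percent(total_urls: int, percent: float) -> int:
    """Convert a percentage of the candidate URLs into a topN-style count.

    Hypothetical helper: both the name and the inputs are assumptions,
    not part of any Nutch tool.
    """
    if total_urls < 0:
        raise ValueError("total_urls must be non-negative")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Round up so that a non-zero percentage of a non-empty set
    # always selects at least one URL.
    return math.ceil(total_urls * percent / 100)
```

With 20000 candidate URLs, asking for the top 5 percent would give the
same result as passing 1000 as topN.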

      - Bill


Tim said:

> Thanks. I made the changes you suggested but the problem persisted.
> After about 5 rounds of 1000 URLs one site would "take over." I made
> the attached small change to get around this problem. It allows you to
> specify the maximum number of URLs you want from any single host. I
> now use -topN 1000 -maxSite 500 and things are going as I had hoped.
> 
> Thanks,
> Tim
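
For readers without the attachment, the per-host cap Tim describes can
be sketched roughly as follows. This is not his patch and not Nutch
code: the function name, the parameter names, and the assumption that
the input list is already sorted best-first are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def select_with_host_cap(ranked_urls, top_n, max_per_host):
    """Pick up to top_n URLs, taking at most max_per_host from any one host.

    Assumes ranked_urls is already ordered best-first (an assumption of
    this sketch, not something stated in the thread).
    """
    per_host = Counter()
    selected = []
    for url in ranked_urls:
        if len(selected) >= top_n:
            break
        host = urlsplit(url).hostname or ""
        if per_host[host] >= max_per_host:
            # This host has reached its cap; skip so other hosts get a turn.
            continue
        per_host[host] += 1
        selected.append(url)
    return selected
```

Called with top_n=1000 and max_per_host=500, this mirrors the
-topN 1000 -maxSite 500 setting above: no single host can fill more
than half of a round.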

-- 
         *------------------------------------------------------*
         | Bill Goffe                 goffe@oswego.edu          |
         | Department of Economics    voice: (315) 312-3444     |
         | SUNY Oswego                fax:   (315) 312-5444     |
         | 416 Mahar Hall             <wuecon.wustl.edu/~goffe> |          
         | Oswego, NY  13126                                    |
*--------*------------------------------------------------------*-----------*
|   "Two physics majors, Justin Kasper and Fred Niell, gathered up some     |
| spare junk from their physics labs and dorm rooms and built a             |
| plutonium-producing reactor.                                              |
|   "`It's kind of scary how easy it was to do,' said Niell, assuring       |
| onlookers that there was only a trace of plutonium -- nothing harmful.    |
| `It only took us about a day to build it.  We've been thinking about it   |
| for a few days and we gathered the parts, and last night we assembled     |
| it. In Justin's room -- he lost the coin toss.'"                          |
|   -- A description of part of the University of Chicago Scavenger Hunt,   |
|      where making a reactor was one of the possible projects. New York    |
|      Times, May 19, 1999.                                                 |
*---------------------------------------------------------------------------*