Posted to user@nutch.apache.org by ir <ir...@gmail.com> on 2005/06/01 02:00:45 UTC
Do I understand fetch right?
I have a URL file with one site. I inject it into the db and then do a
fetch, and it fetches 1 page.
I update the db with it (65 entries inserted; I'm guessing that's the number
of links on that one fetched page), then I generate the segments again.
I do a fetch again and it gets 65 pages. I update the db with those, generate
segments again, then do another fetch and it gets 2000+ pages.
So, as I understand it, each generate/fetch pass goes
one level deeper? Running four generate/fetch cycles would then be the same
as running crawl with -depth 4?
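For reference, the manual loop being described can be sketched roughly like this (the paths `urls.txt`, `db`, and `segments` are my assumptions; adjust them to your layout, and check `bin/nutch` usage on your version, since exact arguments vary across 0.x releases):

```shell
# One-time setup: create the web db and inject the seed URL(s)
bin/nutch admin db -create
bin/nutch inject db -urlfile urls.txt

# Each pass goes one link-level deeper, so four passes should
# roughly correspond to: bin/nutch crawl urls.txt -depth 4
for i in 1 2 3 4; do
  bin/nutch generate db segments     # select unfetched pages into a new segment
  s=`ls -d segments/* | tail -1`     # pick up the newest segment directory
  bin/nutch fetch $s                 # fetch the pages in that segment
  bin/nutch updatedb db $s           # add newly discovered links back to the db
done
```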
Also, a second question: is there a way, from the command line, to get the
total number of pages you have indexed? Thanks
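One hedged suggestion for the second question: the db reader tool can print summary statistics, including the total page and link counts in the web db (note this is the number of pages the db knows about, which may differ from what is actually in a built index; run `bin/nutch readdb` with no flags to see the options your version supports):

```shell
# Print summary statistics for the web db, including total pages
bin/nutch readdb db -stats
```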