Posted to user@nutch.apache.org by Mikael Krantz <ll...@gmail.com> on 2005/08/28 21:25:11 UTC

How can I access the content of a specific page?

What is the easiest way to access the content of an indexed page if
you have its URL? WebDBReader can give me the Page object for the page,
but that doesn't contain the actual page data, just the URL, title,
etc.

/M

Re: How can I access the content of a specific page?

Posted by Michael Ji <fj...@yahoo.com>.

I guess you can use the bin/nutch segread command to dump the segment
data, which contains parse_data...
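
Once a segment has been dumped to a text file, picking out the record
for one URL could look roughly like the sketch below. It is only an
illustration: the class name SegmentDumpLookup, the helper findRecord,
and above all the assumption that every record in the dump begins with
a line starting with "Recno::" are mine and should be checked against
the real dump output before relying on them.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class SegmentDumpLookup {
    // ASSUMPTION: each record in the text dump opens with a line that
    // starts with this marker. Adjust it to match the actual output.
    static final String RECORD_START = "Recno::";

    /**
     * Returns the text of the first record in which the given URL
     * appears as a whole whitespace-separated token, or null if no
     * record mentions it.
     */
    static String findRecord(Iterable<String> lines, String url) {
        StringBuilder record = new StringBuilder();
        boolean matched = false;
        for (String line : lines) {
            if (line.startsWith(RECORD_START)) {
                if (matched) {
                    return record.toString(); // previous record was the hit
                }
                record.setLength(0); // start collecting a new record
            }
            record.append(line).append('\n');
            // Compare whole tokens so that http://a/ does not match http://a/b
            if (Arrays.asList(line.trim().split("\\s+")).contains(url)) {
                matched = true;
            }
        }
        return matched ? record.toString() : null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: SegmentDumpLookup <dump-file> <url>");
            return;
        }
        // Reads the whole dump into memory; fine for small segments only.
        String record = findRecord(Files.readAllLines(Paths.get(args[0])), args[1]);
        System.out.println(record != null ? record : "no record found for " + args[1]);
    }
}
```

The whole dump is read into memory here to keep the sketch short; for
a large segment it would be better to stream the file line by line.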

Michael Ji

--- Mikael Krantz <ll...@gmail.com> wrote:

> What is the easiest way to access the content of an indexed page if
> you have its URL? WebDBReader can give me the Page-object of the page
> but that doesn't contain the actual page data, just the
> URL/Title/etc...
>
> /M
