Posted to user@nutch.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2015/01/07 17:38:35 UTC

Re: Help regarding headings plugin

Hi Krishna,

On Thu, Dec 11, 2014 at 5:52 AM, <us...@nutch.apache.org> wrote:

>
> When I dump data from segments, I am getting the entire HTML data. Shouldn't
> it be just the headings read during crawling? Why am I getting the entire data?
> Please help me. Thanks in advance.
>
No, this is not the case at all!
The headings plugin merely makes the headings explicit within your stored
representation of the webpage; the rest of the page content is still kept.
Please look into the Parse Metadata... or possibly the Content Metadata, I
cannot remember which. Your headings will all be in there.
hth
Lewis