Posted to user@nutch.apache.org by rohit aman <ro...@gmail.com> on 2011/10/03 03:10:00 UTC

getting 'anchor text' in HtmlParseFilter.filter()

Hi,

Is there a way to get the anchor text of a URL in the
HtmlParseFilter.filter(Content content, ParseResult parseResult,
HTMLMetaTags metaTags, DocumentFragment doc) method?

I wrote a small parse plugin. The plugin's filter class implements
HtmlParseFilter and its filter() method. Now I need to add some metadata
depending on the anchor text. How can I get it in the filter() method?
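To make the question concrete, here is a minimal sketch of the kind of
lookup meant. It assumes the anchor text of interest is that of the links
inside the page being parsed, which may not be the only reading. It walks a
DocumentFragment like the one passed to filter() and uses only the JDK's DOM
API. The class AnchorTextSketch and the method collectAnchors are
hypothetical names for illustration and are not part of Nutch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper for illustration only; not part of Nutch.
public class AnchorTextSketch {

  // Maps each link's href to its anchor text, in document order.
  // If the same href appears twice, the later anchor text wins.
  static Map<String, String> collectAnchors(Node root) {
    Map<String, String> anchors = new LinkedHashMap<>();
    walk(root, anchors);
    return anchors;
  }

  // Depth-first walk over the DOM, recording every <a href="..."> element.
  private static void walk(Node node, Map<String, String> anchors) {
    if (node.getNodeType() == Node.ELEMENT_NODE
        && "a".equalsIgnoreCase(node.getNodeName())) {
      Element a = (Element) node;
      if (a.hasAttribute("href")) {
        anchors.put(a.getAttribute("href"), a.getTextContent().trim());
      }
    }
    for (Node child = node.getFirstChild(); child != null;
        child = child.getNextSibling()) {
      walk(child, anchors);
    }
  }

  public static void main(String[] args) throws Exception {
    // Build a tiny fragment by hand; inside a plugin the fragment would
    // be the 'doc' argument of filter() instead.
    Document d = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
    DocumentFragment frag = d.createDocumentFragment();
    Element p = d.createElement("p");
    Element a = d.createElement("a");
    a.setAttribute("href", "http://example.com/");
    a.setTextContent("Example site");
    p.appendChild(a);
    frag.appendChild(p);
    System.out.println(collectAnchors(frag));
  }
}
```

This only sees links that appear in the parsed page itself. It says nothing
about the anchor text that other pages use when linking to this URL.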

Thanks in advance
Rohit