Posted to dev@nutch.apache.org by susmita ganguli <su...@yahoo.com> on 2007/08/28 16:50:08 UTC

Reading Author and description meta tags

Hi,

I am trying to read the Author and Description meta tag
information from HTML web pages.

Calling
parse.getData().getContentMeta().get(Metadata.AUTHOR)
returns null in the BasicIndexingFilter class.

If anybody has tried this, please share the
information. 

Thanks in advance, 
Susmita !!


       

Re: Reading Author and description meta tags

Posted by Rohan Mehta <ro...@visvo.com>.
The Author and Description meta tags are not stored in the metadata by default.

To make them available, you will have to write a custom plugin that
implements HtmlParseFilter and adds the metadata you need. Have a look
at the LanguageIdentifier plugin for an example.
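
As a rough illustration of the extraction step such a plugin would need, here
is a minimal sketch that walks a DOM tree and collects the name/content pairs
of <meta> elements. It uses only the JDK's org.w3c.dom classes. The class name
MetaTagExtractor and the method extractMetaTags are hypothetical and not part
of Nutch. In a real plugin, something like this would presumably be called
from the filter method of your HtmlParseFilter implementation with the parsed
document fragment, and the resulting values added to the parse metadata; the
exact filter signature depends on your Nutch version, so check the
HtmlParseFilter interface in your source tree.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch: collects <meta name="..." content="...">
// pairs from a DOM subtree.
public class MetaTagExtractor {

    /** Returns meta tag names (lower-cased) mapped to their content values. */
    public static Map<String, String> extractMetaTags(Node root) {
        Map<String, String> tags = new LinkedHashMap<>();
        collect(root, tags);
        return tags;
    }

    private static void collect(Node node, Map<String, String> tags) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "meta".equalsIgnoreCase(node.getNodeName())) {
            String name = null;
            String content = null;
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                // HTML attribute names are case-insensitive, so compare loosely.
                if ("name".equalsIgnoreCase(attr.getNodeName())) {
                    name = attr.getNodeValue();
                } else if ("content".equalsIgnoreCase(attr.getNodeName())) {
                    content = attr.getNodeValue();
                }
            }
            if (name != null && content != null) {
                // Keep the first occurrence if a tag name appears more than once.
                tags.putIfAbsent(name.toLowerCase(Locale.ROOT), content);
            }
        }
        // Recurse into children so tags nested under <head> are found.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), tags);
        }
    }
}
```

With the values stored under keys of your choosing (for example "author" and
"description"), an indexing filter could then look them up by those same keys
instead of by Metadata.AUTHOR.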

Rohan


