Posted to user@nutch.apache.org by 郭雄 <he...@gmail.com> on 2009/01/09 23:10:32 UTC

How to use Nutch to get a webpage title and metadata

Hi, I am new to Nutch, and I need to get a webpage's title and its metadata.
For example, from <meta name="description" content="nutch study"> I want the
value "nutch study". Can somebody give me a guide on how to do this with
Nutch? Thanks a lot!

Re: How to use Nutch to get a webpage title and metadata

Posted by Ankur Garg <ga...@gmail.com>.
Hi,

You can look into the parse-html plugin; look for the class
HTMLMetaProcessor (HTMLMetaProcessor.java). The meta tags are processed
there, so you can probably get them from there.
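
To give a rough idea of what "processing the meta tags" looks like, here is a
minimal sketch that walks a DOM and collects the title and the name/content
pairs of the meta elements. It deliberately uses only the JDK's XML parser and
org.w3c.dom, not Nutch's own classes, so it assumes the page is well-formed
XHTML; real-world HTML needs a more tolerant parser. The class name
TitleAndMetaSketch, its methods, and the sample page title are made up for
illustration and are not part of Nutch.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not a Nutch class: extracts <title> and <meta> values
// from a well-formed XHTML string using only JDK APIs.
public class TitleAndMetaSketch {

    // Parse the XHTML text into a DOM tree (assumes well-formed input).
    public static Document parse(String xhtml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    // Return the text of the first <title> element, or "" if there is none.
    public static String title(Document doc) {
        NodeList titles = doc.getElementsByTagName("title");
        return titles.getLength() == 0
            ? ""
            : titles.item(0).getTextContent().trim();
    }

    // Collect name -> content for every <meta> element that has a name.
    public static Map<String, String> metaTags(Document doc) {
        Map<String, String> tags = new LinkedHashMap<>();
        NodeList metas = doc.getElementsByTagName("meta");
        for (int i = 0; i < metas.getLength(); i++) {
            Element meta = (Element) metas.item(i);
            String name = meta.getAttribute("name"); // "" when absent
            if (!name.isEmpty()) {
                tags.put(name.toLowerCase(), meta.getAttribute("content"));
            }
        }
        return tags;
    }

    public static void main(String[] args) throws Exception {
        // Sample page; the title text is invented for this example.
        String page = "<html><head><title>nutch study page</title>"
            + "<meta name=\"description\" content=\"nutch study\"/>"
            + "</head><body/></html>";
        Document doc = parse(page);
        System.out.println(title(doc));
        System.out.println(metaTags(doc).get("description"));
    }
}
```

Inside Nutch the DOM would come from the parse-html plugin rather than from
the JDK parser, so treat this only as an illustration of the tree walk.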




-- 
Ankur Garg