Posted to user@nutch.apache.org by Sunnyvale Fl <su...@gmail.com> on 2006/02/01 00:04:28 UTC

Re: adding meta to domain

Kudos for the new wiki page with the example on writing plugins!
So we don't currently have a plugin to add meta for 0.7.1, then?  It
looks like that is on the development plan for 0.8...
:)

On 1/31/06, Stefan Groschupf <sg...@media-style.com> wrote:
>
> http://issues.apache.org/jira/browse/NUTCH-192
>
> On 31.01.2006 at 23:35, Vanderdray, Jacob wrote:
>
> > What's jira?  I'm actually in the process of writing a set of
> > plugins to process meta tags that you define in the nutch-site.xml
> > file, so I'd be interested in reading about what's being worked on.
> >
> > Thanks,
> > Jake.
> >
> > -----Original Message-----
> > From: Stefan Groschupf [mailto:sg@media-style.com]
> > Sent: Tuesday, January 31, 2006 5:25 PM
> > To: nutch-user@lucene.apache.org
> > Subject: Re: adding meta to domain
> >
> > Metadata support is currently under development and coming soon. See
> > jira for the latest discussion.
> > In any case, you can already write an index filter plugin; see the
> > new wiki documentation for that.
> >
> >
> > On 31.01.2006 at 23:25, Sunnyvale Fl wrote:
> >
> >> I need to add some meta data to the index at crawl time and I am
> >> wondering what is the best way to do it.  For example, for
> >> everything in site www.foo.com, I need to add a meta tag that says
> >> source=branchA, and for www.bar.com source=branchB.  These meta data
> >> are NOT directly available from the source content but can be found
> >> from a table lookup of key:URL to value:meta.  I am thinking I can
> >> write a plugin to index an additional meta field, but what would be
> >> the best way to tweak the crawler to do the table lookup?
> >
> >
>
>
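
The table lookup described in the original question (key:URL to value:meta) can be sketched in plain Java, independent of Nutch. This is only a rough illustration under stated assumptions: the class name SourceLookup and its methods are hypothetical, the table is keyed by host name rather than by full URL, and the wiring into an index filter plugin is deliberately not shown, since that depends on the Nutch version in use.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of Nutch): maps a page URL to a "source"
// value by looking up the URL's host name in an in-memory table.
final class SourceLookup {
    private final Map<String, String> hostToSource = new HashMap<>();

    // Register a host -> source pair, e.g. ("www.foo.com", "branchA").
    public void put(String host, String source) {
        hostToSource.put(host.toLowerCase(Locale.ROOT), source);
    }

    // Returns the source for the URL's host, or null if the host is
    // unknown or the URL cannot be parsed.
    public String sourceFor(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return null;
            }
            return hostToSource.get(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        SourceLookup lookup = new SourceLookup();
        lookup.put("www.foo.com", "branchA");
        lookup.put("www.bar.com", "branchB");
        System.out.println(lookup.sourceFor("http://www.foo.com/page.html")); // branchA
        System.out.println(lookup.sourceFor("http://www.bar.com/"));          // branchB
    }
}
```

An index filter plugin could call something like sourceFor() with the page URL and, when the result is not null, add it to the indexed document as a "source" field. Loading the table once at plugin start-up (from a file or database) rather than per page would keep the lookup cheap during indexing.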