Posted to user@nutch.apache.org by jasimop <st...@gmail.com> on 2011/06/08 21:14:06 UTC

Re: Nutch Plugin: add several fields at once

This is still an open issue for me, and I have not found a solution.
Just to be sure: is it possible to add several fields to the index from
within one plugin?
How do you pass data from the parsing stage to the indexing stage? Is there
a plugin I could look at to get an idea?
As described in my last post, putting the data into the Parse Metadata does
not seem to work, as I always get null.

--
View this message in context: http://lucene.472066.n3.nabble.com/Nutch-Plugin-add-several-fields-at-once-tp2981579p3040579.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Nutch Plugin: add several fields at once

Posted by MilleBii <mi...@gmail.com>.
A couple of additions.

Of course, just add as many fields as you wish.

I read back the Wiki page you pointed out. It is too old and, I believe, has
not been valid since 1.0. If I find the time, I may have a go at updating it.






-- 
-MilleBii-

Re: Nutch Plugin: add several fields at once

Posted by MilleBii <mi...@gmail.com>.
Of course it is possible to add multiple fields; I have that running daily.

Have a look at the "more" plugin (index-more) to see how it works.

Here is an example.
In the filter method:

    String content = "some data";
    doc.add("myfield", content);

You also need to configure the field in the
'addIndexBackendOptions(Configuration conf)' method:

    LuceneWriter.addFieldOptions("myfield", LuceneWriter.STORE.YES,
        LuceneWriter.INDEX.NO, conf);

As for passing data from parsing to indexing, I don't know how to do that. So
you may need to analyse the content again; however, at that stage you have
lost the HTML formatting.

One caveat I found is that field names need to be lowercase; otherwise it
doesn't work.
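
To make the two points above concrete (several fields added in one filter
call, and lowercase field names), here is a minimal standalone sketch. It
does not use the Nutch API: SketchDocument and MultiFieldSketch are
hypothetical stand-ins invented for illustration, and lowercasing the name
inside add() is an assumption about how one might guard against the caveat,
not something Nutch does for you.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for an index document; NOT Nutch's NutchDocument. */
final class SketchDocument {
    private final Map<String, List<String>> fields = new LinkedHashMap<>();

    /** Adds a value under a field name, lowercasing the name first. */
    void add(String name, String value) {
        String key = name.toLowerCase(Locale.ROOT);
        fields.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    /** Returns all values stored under the (case-insensitive) field name. */
    List<String> get(String name) {
        return fields.getOrDefault(name.toLowerCase(Locale.ROOT),
                Collections.emptyList());
    }

    /** Number of distinct field names in the document. */
    int fieldCount() {
        return fields.size();
    }
}

final class MultiFieldSketch {
    /** Hypothetical filter step: copies every extracted value into the document. */
    static SketchDocument filter(SketchDocument doc, Map<String, String> extracted) {
        for (Map.Entry<String, String> e : extracted.entrySet()) {
            doc.add(e.getKey(), e.getValue());
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> extracted = new LinkedHashMap<>();
        extracted.put("Author", "some author");   // mixed case on purpose
        extracted.put("myfield", "some data");
        SketchDocument doc = filter(new SketchDocument(), extracted);
        System.out.println(doc.get("author"));    // prints [some author]
    }
}
```

In a real plugin the same loop would sit inside the filter method and call
doc.add(...) once per field, with a matching field configuration for each
name.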




-- 
-MilleBii-