Posted to dev@nutch.apache.org by Jérôme Charron <je...@gmail.com> on 2005/08/27 10:16:29 UTC
Analysis plugins and lucene-analyzers
Hi,
I would like to add some language-specific analysis plugins. As a first
approach, each plugin would simply be a wrapper around Lucene's analyzers.
So each analysis-<lang> plugin needs to import
lucene-analyzers-1.9-rc1-dev.jar in its lib directory. To avoid
adding this jar to many plugins,
I would like to add lucene-analyzers-1.9-rc1-dev.jar to the Nutch core
lib.
Any comments? Any objection?
Regards
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/
Re: Analysis plugins and lucene-analyzers
Posted by Jérôme Charron <je...@gmail.com>.
> I personally don't like the activation mechanism. I prefer to have the
> 'activated' plugins in the plugin folder and to deactivate just
> remove the plugins from the folder.
> That is much easier to handle than to manage the plugins in the
> folder AND set them up in the configuration file.
+1
> Right. In case a plugin A requires another plugin B that is not
> available, plugin A will not be loaded, as far as I remember. :-/
Stefan, here is my first feedback on the plugin dependencies:
1. How does one specify plugin dependencies?
I assume it is done with this directive in the plugin XML descriptor:
<requires>
<import plugin="plugin-id"/>
</requires>
This specifies that the plugin requires the "plugin-id" code on its classpath at
runtime. Is that right?
But looking at the plugin framework code, the "requires" element is never
parsed.
Can you confirm this point, please?
2. There is an addDependency(String id) method in the PluginDescriptor class,
but this method does not seem to be used in the Nutch code.
3. Out of scope, but if someone wants to contribute:
I added a direct dependency on the lucene-analysis plugin in my plugin's
build.xml file in order to compile it.
But it would be better if this kind of dependency were handled automatically
in the build-plugin.xml ant file by parsing the plugins' XML descriptors.
(I'm not an ant expert, but I assume it could
be easy by writing an ant task based on the PluginManifestParser.)
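To make point 1 concrete, here is a minimal sketch of how the <requires> element could be read from a plugin descriptor. It is not the Nutch framework code: the class RequiresParserSketch, the method parseRequiredPluginIds and the plugin ids in main are hypothetical names chosen for illustration, and only the standard JDK DOM parser is used. Each id it returns is the kind of value that could then be handed to PluginDescriptor.addDependency(String id).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch: collects the ids declared as
// <requires><import plugin="..."/></requires> in a plugin descriptor.
class RequiresParserSketch {

    static List<String> parseRequiredPluginIds(String pluginXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            pluginXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Unreadable plugin descriptor", e);
        }
        List<String> ids = new ArrayList<>();
        // Look only at <import> elements nested under a <requires> element.
        NodeList requires = doc.getDocumentElement().getElementsByTagName("requires");
        for (int i = 0; i < requires.getLength(); i++) {
            NodeList imports = ((Element) requires.item(i)).getElementsByTagName("import");
            for (int j = 0; j < imports.getLength(); j++) {
                String id = ((Element) imports.item(j)).getAttribute("plugin");
                if (!id.isEmpty()) {
                    ids.add(id); // skip <import> elements without a plugin attribute
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Both plugin ids below are made up for the example.
        String xml = "<plugin id=\"analysis-fr\">"
                + "<requires><import plugin=\"lib-lucene-analyzers\"/></requires>"
                + "</plugin>";
        System.out.println(parseRequiredPluginIds(xml));
    }
}
```

The same list could also feed an ant task that builds the compile classpath for point 3, but that part is left out here.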
Thanks for your comments.
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/
Re: Analysis plugins and lucene-analyzers
Posted by Stefan Groschupf <sg...@media-style.com>.
Hi Jérôme,
>>> I do not object against putting lucene-analyzers-1.9-rc1-dev.jar in
>>> nutch core but I would like to give another option. I think it is
>>> possible to create a plugin which contains and exports this library
>>> and make other analysis plugin depend on it.
>>>
>
> Yes, that is possible and sure.. I like this idea very much.. :)
> Yes, that was an option that I was thinking about.
> Stefan, in such a case, must this plugin be activated in
> the config
> file, or is it just dynamically loaded because the analysis plugins
> need it?
It must be activated. The activation mechanism was not planned and was
submitted later.
I personally don't like the activation mechanism. I prefer to have the
'activated' plugins in the plugin folder, and to deactivate a plugin, just
remove it from the folder.
That is much easier to handle than managing the plugins in the
folder AND setting them up in the configuration file.
The best way would be to have different kinds of distributions...
(mini, intranet, web).
> If it must be "manually" activated in the conf file, it seems to be
> "dangerous" (it implies that Nutch users know the inter-plugin
> dependencies)
Right. In case a plugin A requires another plugin B that is not
available, plugin A will not be loaded, as far as I remember. :-/
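The rule described here (a plugin is skipped when something it requires is not available) can be sketched as a small fixed-point check. This is only an illustration under stated assumptions, not the actual Nutch loading logic: DependencyCheckSketch, loadablePlugins and the plugin ids are hypothetical, and the input map is assumed to hold each activated plugin with the ids it requires.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not Nutch code: decides which activated plugins can load.
class DependencyCheckSketch {

    // requires: activated plugin id -> ids of the plugins it depends on.
    static Set<String> loadablePlugins(Map<String, List<String>> requires) {
        Set<String> loadable = new TreeSet<>(requires.keySet());
        boolean changed = true;
        // Repeat until stable, so that a plugin depending on a dropped plugin
        // is dropped as well (transitive case).
        while (changed) {
            changed = false;
            for (Iterator<String> it = loadable.iterator(); it.hasNext();) {
                String id = it.next();
                if (!loadable.containsAll(requires.get(id))) {
                    it.remove(); // at least one required plugin is unavailable
                    changed = true;
                }
            }
        }
        return loadable;
    }

    public static void main(String[] args) {
        Map<String, List<String>> requires = new HashMap<>();
        requires.put("plugin-a", Arrays.asList("plugin-b")); // plugin-b is not activated
        requires.put("plugin-c", Collections.<String>emptyList());
        System.out.println(loadablePlugins(requires)); // plugin-a is left out
    }
}
```

A check like this could also report which dependency is missing, which would address the concern that users have to know the inter-plugin dependencies themselves.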
Thanks for taking care of this!
Stefan
Re: Analysis plugins and lucene-analyzers
Posted by Jérôme Charron <je...@gmail.com>.
>
> > I do not object against putting lucene-analyzers-1.9-rc1-dev.jar in
> > nutch core but I would like to give another option. I think it is
> > possible to create a plugin which contains and exports this library
> > and make other analysis plugin depend on it.
> Yes, that is possible and sure.. I like this idea very much.. :)
Yes, that was an option that I was thinking about.
Stefan, in such a case, must this plugin be activated in the config
file, or is it just dynamically loaded because the analysis plugins need it?
If it must be "manually" activated in the conf file, it seems
"dangerous" (it implies that Nutch users know the inter-plugin
dependencies).
Thanks for your responses and suggestions.
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/
Re: Analysis plugins and lucene-analyzers
Posted by Stefan Groschupf <sg...@media-style.com>.
Hi,
> I do not object against putting lucene-analyzers-1.9-rc1-dev.jar in
> nutch core but I would like to give another option. I think it is
> possible to create a plugin which contains and exports this library
> and make other analysis plugin depend on it. I am not an expert in
> it but I think such solution is also possible. But it is just a
> second idea for you to consider - I do not have a preference for
> any of these options.
Yes, that is possible and sure.. I like this idea very much.. :)
Stefan
Re: Analysis plugins and lucene-analyzers
Posted by Piotr Kosiorowski <pk...@gmail.com>.
Hello,
I do not object to putting lucene-analyzers-1.9-rc1-dev.jar in
nutch core but I would like to give another option. I think it is
possible to create a plugin which contains and exports this library and
make other analysis plugins depend on it. I am not an expert in it but I
think such a solution is also possible. But it is just a second idea for
you to consider - I do not have a preference for any of these options.
Regards
Piotr
Andrzej Bialecki wrote:
> Jérôme Charron wrote:
>
>> Hi,
>>
>> I would like to add some language specific analysis plugins. In this
>> first approach, each plugin would be simply a wrapper of the lucene's
>> analyzers.
>> So each analysis-<lang> plugin need to import
>> lucene-analyzers-1.9-rc1-dev.jar in its lib directory. In order to
>> avoid adding this jar in many plugins, I would like to add the
>> lucene-analyzers-1.9-rc1-dev.jar in the nutch core lib.
>> Any comments? Any objection?
>
>
> I'm wondering if you could implement this plugin as a more or less
> automatic wrapper around any Lucene classes that implement Analyzer,
> i.e. so that it doesn't require recompiling to change/select the
> language, or add a non-standard analyzer from the classpath. I think
> it's possible to do this, but you would have to code a special-case for
> Snowball analyzers, where the default constructor requires an argument.
> All of this could be read from the plugin.xml or nutch-default.xml files.
>
>
Re: Analysis plugins and lucene-analyzers
Posted by Andrzej Bialecki <ab...@getopt.org>.
Jérôme Charron wrote:
> Hi,
>
> I would like to add some language specific analysis plugins. In this first
> approach, each plugin would be simply a wrapper of the lucene's analyzers.
> So each analysis-<lang> plugin need to import
> lucene-analyzers-1.9-rc1-dev.jar in its lib directory. In order to avoid
> adding this jar in many plugins,
> I would like to add the lucene-analyzers-1.9-rc1-dev.jar in the nutch core
> lib.
> Any comments? Any objection?
I'm wondering if you could implement this plugin as a more or less
automatic wrapper around any Lucene classes that implement Analyzer,
i.e. so that it doesn't require recompiling to change/select the
language, or add a non-standard analyzer from the classpath. I think
it's possible to do this, but you would have to code a special-case for
Snowball analyzers, where the default constructor requires an argument.
All of this could be read from the plugin.xml or nutch-default.xml files.
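A rough sketch of the idea, assuming the analyzer class name and an optional constructor argument come from configuration. To keep it compilable without Lucene on the classpath, the Analyzer interface and both analyzer classes below are stand-ins invented for this example; in a real plugin the class name would point at a Lucene Analyzer implementation, and the String-argument branch would cover the Snowball case.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch: every type here is a stand-in, none of it is Lucene or Nutch code.
class ReflectiveAnalyzerSketch {

    /** Stand-in for Lucene's Analyzer type. */
    interface Analyzer {
        String describe();
    }

    /** Stand-in for an analyzer with a no-argument constructor. */
    static class PlainAnalyzer implements Analyzer {
        public PlainAnalyzer() {}
        public String describe() { return "plain"; }
    }

    /** Stand-in for a Snowball-style analyzer that needs a stemmer name. */
    static class StemmingAnalyzer implements Analyzer {
        private final String stemmer;
        public StemmingAnalyzer(String stemmer) { this.stemmer = stemmer; }
        public String describe() { return "stemming:" + stemmer; }
    }

    // className and arg would come from plugin.xml or nutch-default.xml;
    // arg == null means "use the no-argument constructor".
    static Analyzer create(String className, String arg) {
        try {
            Class<? extends Analyzer> cls =
                    Class.forName(className).asSubclass(Analyzer.class);
            if (arg == null) {
                return cls.getConstructor().newInstance();
            }
            // Special case: analyzers whose constructor takes one String argument.
            Constructor<? extends Analyzer> ctor = cls.getConstructor(String.class);
            return ctor.newInstance(arg);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                    "Cannot instantiate analyzer " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create(PlainAnalyzer.class.getName(), null).describe());
        System.out.println(create(StemmingAnalyzer.class.getName(), "French").describe());
    }
}
```

With this shape, changing the language or adding a non-standard analyzer would only mean editing the configured class name and argument, with no recompilation of the plugin.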
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com