Posted to user@nutch.apache.org by Alan Aguia <aa...@yahoo.com> on 2008/05/12 19:47:37 UTC
plugin number
I have a problem with the plugin.includes property in the nutch-site.xml file. Once a certain number of plugins is listed, the system stops using some of them, even though it loads them. I don't really understand what's going on. Is there a limit on the number of plugins you can list in this property? For some reason, after some number of plugins, the system doesn't use the last one.
For example:
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-regex|parse-(text|html|js|pdf|msword|msexcel|mspowerpoint)|index-basic|query-(basic|site|url|extended)|summary-basic|scoring-opic|analysis-es</value>
<description>
Regular expression naming plugin directory names to
include. Any plugin not matching this expression is excluded.
In any case you need at least include the nutch-extensionpoints plugin. By
default Nutch includes crawling just HTML and plain text via HTTP,
and basic indexing and search plugins.
</description>
</property>
If I include one more plugin, the system loads it but doesn't use it:
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-regex|parse-(text|html|js|pdf|msword|msexcel|mspowerpoint)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|language-identifier|analysis-es|query-extended</value>
<description>
Regular expression naming plugin directory names to
include. Any plugin not matching this expression is excluded.
In any case you need at least include the nutch-extensionpoints plugin. By
default Nutch includes crawling just HTML and plain text via HTTP,
and basic indexing and search plugins.
</description>
</property>
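To rule out the expression itself, here is a minimal sketch that tests plugin directory names against the second plugin.includes value. It assumes Nutch includes a plugin when its directory name matches the whole expression, as the property description suggests. The class name PluginIncludesCheck, the helper isIncluded and the extra name parse-rss are only illustrative and are not part of Nutch.

```java
import java.util.regex.Pattern;

public class PluginIncludesCheck {
    // Value of plugin.includes copied from the second configuration above.
    static final String INCLUDES =
        "protocol-http|urlfilter-regex|parse-(text|html|js|pdf|msword|msexcel|mspowerpoint)"
        + "|index-basic|query-(basic|site|url)|summary-basic|scoring-opic"
        + "|language-identifier|analysis-es|query-extended";

    // Assumption: a plugin is included only if its directory name
    // matches the entire expression (full match, not a partial find).
    static boolean isIncluded(String pluginId) {
        return Pattern.compile(INCLUDES).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // The last three configured names plus one name that is not listed.
        String[] ids = {"language-identifier", "analysis-es", "query-extended", "parse-rss"};
        for (String id : ids) {
            System.out.println(id + " -> " + isIncluded(id));
        }
    }
}
```

If the three configured names all report true here, the expression is selecting them, so the length of the list is probably not what stops the last plugin from being used.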
Thanks