Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2006/03/22 01:04:31 UTC
[Nutch Wiki] Trivial Update of "JeromeCharron" by JeromeCharron
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by JeromeCharron:
http://wiki.apache.org/nutch/JeromeCharron
------------------------------------------------------------------------------
* Add ability to handle plugin inter-dependencies (i.e., a plugin can specify that it has a runtime dependency on one or more other plugins using the <requires><import plugin="plugin-id"/></requires> directive in the plugin.xml plugin descriptor).
* Add ability to automatically load (depending on config) the required plugins specified by plugin dependencies (circular dependencies checked).
* MarkupLanguageParserProposal
- * '''TODO''': [http://microformats.org/ Microformats] HtmlParseFilter:
+ * [http://microformats.org/ Microformats] HtmlParseFilter:
- * [http://microformats.org/wiki/rel-tag rel-tag]
+ * [http://microformats.org/wiki/rel-tag rel-tag] (see microformats-reltag plugin)
- * [http://microformats.org/wiki/hreview hreview]
+ * '''TODO''' [http://microformats.org/wiki/hreview hreview]
* ...
* Nutch [http://fr.wikipedia.org/wiki/Nutch article] on French Wikipedia.
-
+ * URL Filters enhancements:
+ * Add a ''mini framework'' plugin for regular expression based URL Filters ([http://svn.apache.org/viewcvs.cgi/lucene/nutch/trunk/src/plugin/lib-regex-filter/ lib-regex-filter])
+ * Add a regex URL filter implementation based on [http://www.brics.dk/automaton/ dk.brics.automaton] Finite-State Automata for Java.
+ * See RegexURLFiltersBenchs for a comparison of urlfilter-regex and urlfilter-automaton plugins
----
CategoryHomepage
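
The plugin dependency items in the list above (a plugin declaring required plugins through <requires><import plugin="plugin-id"/></requires>, with circular dependencies checked before loading) can be illustrated with a minimal sketch. This is not the Nutch implementation: the class name PluginDependencySketch, the resolve method, and the representation of the descriptors as a map from plugin id to required plugin ids are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Nutch. */
public class PluginDependencySketch {

    /**
     * Returns the given plugin and everything it transitively requires,
     * ordered so that each plugin appears after the plugins it requires.
     * The map stands in for the <requires> sections of the plugin.xml
     * descriptors (assumed shape: plugin id -> ids it imports).
     *
     * @throws IllegalStateException if a circular dependency is found
     */
    public static List<String> resolve(String pluginId,
                                       Map<String, List<String>> requires) {
        Set<String> done = new LinkedHashSet<>();
        visit(pluginId, requires, new HashSet<>(), done);
        return new ArrayList<>(done);
    }

    // Depth-first walk; inProgress holds the plugins on the current path.
    private static void visit(String id,
                              Map<String, List<String>> requires,
                              Set<String> inProgress,
                              Set<String> done) {
        if (done.contains(id)) {
            return; // already resolved through another path
        }
        if (!inProgress.add(id)) {
            // The plugin is already on the current path: a cycle.
            throw new IllegalStateException(
                "Circular dependency detected at plugin: " + id);
        }
        List<String> deps = requires.getOrDefault(id, Collections.emptyList());
        for (String dep : deps) {
            visit(dep, requires, inProgress, done);
        }
        inProgress.remove(id);
        done.add(id); // all requirements resolved, safe to load now
    }
}
```

With placeholder ids where plugin-a requires plugin-b and plugin-b requires plugin-c, resolve("plugin-a", ...) yields a load order with plugin-c first and plugin-a last. Adding a requirement from plugin-c back to plugin-a makes the same call throw instead of looping forever.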