Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2009/03/24 17:52:32 UTC
[Nutch Wiki] Update of "Features" by NycoNyco
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by NycoNyco:
http://wiki.apache.org/nutch/Features
The comment on the change is:
(non-exhaustive) tentative features list (please review)
------------------------------------------------------------------------------
(Please reformat this text and divide into feature lists, questions and questions & answers).
== Features ==
+
+ * Fetching, parsing and indexing in parallel and/or distributed
+ * Plugins
+ * Many formats: plain text, HTML, XML, ZIP, OpenDocument (OpenOffice.org), Microsoft Office (Word, Excel, PowerPoint), PDF, JavaScript, RSS, RTF, MP3 (ID3 tags)
+ * Ontology
+ * Clustering
+ * MapReduce
+ * Distributed filesystem (via Hadoop)
+ * Link-graph database
+ * NTLM authentication
== Questions and Answers ==