Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2009/03/24 17:52:32 UTC

[Nutch Wiki] Update of "Features" by NycoNyco

The following page has been changed by NycoNyco:
http://wiki.apache.org/nutch/Features

The comment on the change is:
(non-exhaustive) tentative features list (please review)

------------------------------------------------------------------------------
  (Please reformat this text and divide into feature lists, questions and questions & answers). 
  
  == Features ==
+ 
+  * Fetching, parsing and indexing, in parallel and/or distributed
+  * Plugins
+  * Many formats: plain text, HTML, XML, ZIP, OpenDocument (OpenOffice.org), Microsoft Office (Word, Excel, PowerPoint), PDF, JavaScript, RSS, RTF, MP3 (ID3 tags)
+  * Ontology
+  * Clustering
+  * MapReduce
+  * Distributed filesystem (via Hadoop)
+  * Link-graph database
+  * NTLM authentication
  
  == Questions and Answers ==