Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2005/09/22 22:15:57 UTC
[Nutch Wiki] Update of "Features" by LarsAronsson
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by LarsAronsson:
http://wiki.apache.org/nutch/Features
The comment on the change is:
Features of Nutch search could be documented here, if I knew the answers.
New page:
Missing from the current Nutch documentation (Tutorial, FAQ) is a list of features. This wiki page could help, if someone who knows the answers can edit it.
*What kinds of searches does Nutch support? (quoted, nested, truncation, wildcarding [and where], Boolean)
*Is stemming an option?
*What kind of stemming does Nutch use? (and can you add exceptions/changes?)
*Does Nutch support Boolean operators? (can you use Google-like plus or minus or are you stuck with 1990s terms?)
*Does Nutch support weighted field searching? Synonyms?
*What kinds of indexes does Nutch build? (multi-format indexing, incremental indexing, spell-check support, thesauri support, fielded searching, rank-by-reputation?)
*How does the search engine handle punctuation and special characters? (and what's configurable?)
*Which document formats are supported?
*What post-coordination options are available? (hey Karen, what does this mean?)
*How easy is Nutch to configure?
*How transparent is its configuration to a working organization: does it require geeky command line stuff, or can a knowledgeable manager use a web or software interface to view or modify settings?
* How are results sorted?
* Does Nutch support deduping?
* Can one tinker with relevance algorithms?
* Are there ranking overrides?