Posted to user@nutch.apache.org by Ar...@csiro.au on 2010/09/17 12:57:48 UTC

Arch 1.2 has been released

Hello,

I am announcing the release of Arch 1.2, based on Nutch 1.2. Arch is an extension of Nutch designed for indexing and searching intranets. It adds many features that make this task easier and deliver high-precision search results.

For details and downloads, please see the Arch home page:

http://www.atnf.csiro.au/computing/software/arch/

This version includes a tuning and evaluation module that lets you compare a few search engines in blind tests and/or side by side. This is useful if you want to get an idea of the real performance of different search engines or trace the effects of changes you have made. You can use it even if you don't use Arch. It includes search plugins for Nutch, Arch, Google and Funnelback, and it is easy to write plugins for other engines.

People who use Nutch may want to use the upgraded version of the parse-pdf plugin. You can also make the upgrade yourself: it just requires switching to the latest libraries, which in turn requires a minor change in the sources to fix the imports. I highly recommend this upgrade because it fixed a lot of PDF parsing errors for us.


Regards,

Arkadi Kosmynin