Posted to user@nutch.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2015/08/25 06:43:37 UTC

Re: 2.3.1 and version control

Hi Alp,

On Tue, Jul 21, 2015 at 10:20 PM, <us...@nutch.apache.org> wrote:

>
> I would like to use Tesseract OCR within Nutch in order to parse scanned
> PDF files (assuming this is the correct (and only?) way of doing that).
> Skimming through the previous emails, I noticed that support is possible by
> using 2.3.1, which works alongside Tika 1.7+, which is needed for OCR.
>
> I looked through the repositories, Subversion and GitHub, but was not able
> to find any tag/branch for 2.3.1. There is one for 2.4, which is in
> development and has 100-something open issues.
>
> My question is: is there anywhere I can get 2.3.1? If not, is it safe to
> use the 2.4 trunk? Any planned release dates? Any other suggestions?
>
>
Yes, you can use the Nutch 2.4 branch, which can be checked out from the URL
below:
http://svn.apache.org/repos/asf/nutch/branches/2.x/
This codebase is under development (not as much as trunk, but under
development nonetheless).
If you have any issues with the branch, please let us know here and we
can help you with it.
Lewis