Posted to dev@nutch.apache.org by Shashanka Balakuntala <sh...@gmail.com> on 2020/06/19 08:39:10 UTC

Regarding the branch 2.x

Hi guys,
I think everyone is aware that the 2.x branch is no longer
maintained and that Nutch 2.4 is the last release of the 2.x series. I was
interested in the 2.x branch for several reasons, the main one being that
it supports Gora backends, so we get the storage abstraction provided by
Gora. Nutch 1.x is really good, robust, and performs well. But I
wanted to know whether any users are interested in the development of the
2.x line (as we have some bugs and new features in the 2.5 version)?

Let me know your thoughts.

P.S. I'm up for dev of both branches

*Regards*
  Shashanka Balakuntala Srinivasa

Re: Regarding the branch 2.x

Posted by Sebastian Nagel <wa...@googlemail.com>.
Hi,

Sorry for the delayed response. Since nobody has jumped in so far,
let me just add a few explanations about our
decision to drop the 2.x branch.

A Nutch user is unlikely to run both 1.x and 2.x in parallel:
both branches are batch-based crawlers and are operated similarly.
As a result, the community became divided, including the
Nutch committers and PMC.  While 2.x was initially intended as a
more performant replacement for 1.x, it lost momentum. Indeed,
it never outperformed 1.x (at least, not significantly). Maybe
worse, the storage layer added extra complexity.  We lost
users and committers, and development on 2.x stalled in 2018.
To maintain the branch we'd need 2-3 active committers, so that
we can make sure issues are addressed and releases are
published on a regular basis.  Nutch isn't (and never was) a large
project.  So I'm glad we can keep 1.x alive.  The decision to drop
2.x wasn't an easy one: we didn't want to kill the remaining
community around 2.x, but being mostly unable to take care of
the 2.x branch would have harmed the community anyway.


> P.S. I'm up for dev of both branches

Thanks for the offer. And yes, your contributions (past and, I hope, future) are really appreciated!

Best,
Sebastian
