Posted to user@nutch.apache.org by Andrzej Bialecki <ab...@getopt.org> on 2011/01/04 21:27:54 UTC
Release planning
Hi users & devs,
As you probably know, there are currently two active lines of
development for Nutch:
* Nutch trunk, a.k.a. Nutch 2.0: this is based on a completely
redesigned storage layer that uses Apache Gora, which in turn can use
various storage implementations such as HBase, Cassandra, and MySQL.
This branch is still largely experimental and unstable, but work is
progressing, and at the current pace I think a release should be
possible within the next ~6 months. Another important addition on this
branch is a REST API that allows using Nutch as a black-box crawling
service.
* Nutch branch-1.3: this started as a snapshot of Nutch trunk just
before merging with nutchbase (i.e. switching to Gora as a storage
layer). This branch is still largely similar to the previous versions of
Nutch, and uses Hadoop MapFile/SequenceFile and "segments". As compared
with release 1.2 it does NOT ship with any search infrastructure,
because all search functionality has been delegated to Solr (via
SolrIndexer). This is BTW also true about Nutch trunk.
Regarding branch-1.2 (which is a maintenance branch after release 1.2)
there have been pretty much no updates there, if any. Nutch committer
resources are very limited (when it comes to active committers), so I
don't expect any maintenance release from this branch to happen...
I think that considering the relatively remote release date for Nutch
2.0 it would make sense to roll out a 1.3 release based on branch-1.3,
after making sure that all critical patches from trunk have been merged
in there.
What do you think?
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
Re: Release planning
Posted by Julien Nioche <li...@gmail.com>.
+1 from me. Today I committed a bunch of patches that were in 1.2 but
not in 1.3 (just one last one to do), but I haven't compared with 2.0.
Having a release based on 1.3 would be great, as it would be a nice
transition towards 2.0 (delegated indexing/search, dependency management with
Ivy, separation between local and remote deployment, removal of redundant
plugins, etc.).
Julien
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
On 4 January 2011 20:27, Andrzej Bialecki <ab...@getopt.org> wrote:
> Hi users & devs,
>
> As you probably know, there are currently two active lines of development
> for Nutch:
>
> * Nutch trunk, a.k.a. Nutch 2.0: this is based on a completely redesigned
> storage layer that uses Apache Gora, which in turn can use various storage
> implementations such as HBase, Cassandra, and MySQL. This branch is still
> largely experimental and unstable, but work is progressing, and at the
> current pace I think a release should be possible within the next ~6 months.
> Another important addition on this branch is a REST API that allows using
> Nutch as a black-box crawling service.
>
> * Nutch branch-1.3: this started as a snapshot of Nutch trunk just before
> merging with nutchbase (i.e. switching to Gora as a storage layer). This
> branch is still largely similar to the previous versions of Nutch, and uses
> Hadoop MapFile/SequenceFile and "segments". As compared with release 1.2 it
> does NOT ship with any search infrastructure, because all search
> functionality has been delegated to Solr (via SolrIndexer). This is BTW also
> true about Nutch trunk.
>
> Regarding branch-1.2 (which is a maintenance branch after release 1.2)
> there have been pretty much no updates there, if any. Nutch committer resources
> are very limited (when it comes to active committers), so I don't expect any
> maintenance release from this branch to happen...
>
> I think that considering the relatively remote release date for Nutch 2.0
> it would make sense to roll out a 1.3 release based on branch-1.3, after
> making sure that all critical patches from trunk have been merged in there.
>
> What do you think?
>
> --
> Best regards,
> Andrzej Bialecki <><
> ___. ___ ___ ___ _ _ __________________________________
> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> ___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com
>
>
Re: Release planning
Posted by Markus Jelsma <ma...@openindex.io>.
Splendid +1!
On Tuesday 04 January 2011 21:27:54 Andrzej Bialecki wrote:
> Hi users & devs,
>
> As you probably know, there are currently two active lines of
> development for Nutch:
>
> * Nutch trunk, a.k.a. Nutch 2.0: this is based on a completely
> redesigned storage layer that uses Apache Gora, which in turn can use
> various storage implementations such as HBase, Cassandra, and MySQL.
> This branch is still largely experimental and unstable, but work is
> progressing, and at the current pace I think a release should be
> possible within the next ~6 months. Another important addition on this
> branch is a REST API that allows using Nutch as a black-box crawling
> service.
>
> * Nutch branch-1.3: this started as a snapshot of Nutch trunk just
> before merging with nutchbase (i.e. switching to Gora as a storage
> layer). This branch is still largely similar to the previous versions of
> Nutch, and uses Hadoop MapFile/SequenceFile and "segments". As compared
> with release 1.2 it does NOT ship with any search infrastructure,
> because all search functionality has been delegated to Solr (via
> SolrIndexer). This is BTW also true about Nutch trunk.
>
> Regarding branch-1.2 (which is a maintenance branch after release 1.2)
> there have been pretty much no updates there, if any. Nutch committer
> resources are very limited (when it comes to active committers), so I
> don't expect any maintenance release from this branch to happen...
>
> I think that considering the relatively remote release date for Nutch
> 2.0 it would make sense to roll out a 1.3 release based on branch-1.3,
> after making sure that all critical patches from trunk have been merged
> in there.
>
> What do you think?
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: Release planning
Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
(cc to dev@nutch since you are addressing devs too)
Hey Andrzej:
>
> As you probably know, there are currently two active lines of
> development for Nutch:
> [...snip...]
>
> Regarding branch-1.2 (which is a maintenance branch after release 1.2)
> there have been pretty much no updates there, if any. Nutch committer
> resources are very limited (when it comes to active committers), so I
> don't expect any maintenance release from this branch to happen...
+1, agreed.
>
> I think that considering the relatively remote release date for Nutch
> 2.0 it would make sense to roll out a 1.3 release based on branch-1.3,
> after making sure that all critical patches from trunk have been merged
> in there.
>
> What do you think?
Sounds good to me. Count me in to RM it if you guys are OK with that!
Cheers,
Chris
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++