Posted to general@lucene.apache.org by Sami Siren <ss...@gmail.com> on 2009/03/14 08:56:18 UTC

Re: [VOTE] Release Apache Nutch 1.0

We're lacking one +1, could someone please take a look?

Thanks,

Sami Siren



Sami Siren wrote:
> Hello,
> 
> I have packaged the second release candidate for Apache Nutch 1.0 
> release at
> 
> http://people.apache.org/~siren/nutch-1.0/rc1/
> 
> See the CHANGES.txt[1] file for details on release contents and latest 
> changes. The release was made from tag: 
> http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc1/?pathrev=752004
> 
> Please vote on releasing this package as Apache Nutch 1.0. The vote is 
> open for the next 72 hours. Only votes from Lucene PMC members are 
> binding, but everyone is welcome to check the release candidate and 
> voice their approval or disapproval. The vote  passes if at least three 
> binding +1 votes are cast.
> 
> [ ] +1 Release the packages as Apache Nutch 1.0
> [ ] -1 Do not release the packages because...
> 
> Here's my +1
> 
> 
> Thanks!
> 
> 
> [1] http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc1/CHANGES.txt?view=log&pathrev=752004
> 
> --
> Sami Siren
> 


Re: [VOTE] Release Apache Nutch 1.0

Posted by buddha1021 <bu...@yahoo.cn>.
Here is my +1.

Congratulations!


Sami Siren-2 wrote:
> 
> We're lacking one +1, could someone please take a look?
> 
> Thanks,
> 
> Sami Siren
> 
> 
> 
> Sami Siren wrote:
>> Hello,
>> 
>> I have packaged the second release candidate for Apache Nutch 1.0 
>> release at
>> 
>> http://people.apache.org/~siren/nutch-1.0/rc1/
>> 
>> See the CHANGES.txt[1] file for details on release contents and latest 
>> changes. The release was made from tag: 
>> http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc1/?pathrev=752004
>> 
>> Please vote on releasing this package as Apache Nutch 1.0. The vote is 
>> open for the next 72 hours. Only votes from Lucene PMC members are 
>> binding, but everyone is welcome to check the release candidate and 
>> voice their approval or disapproval. The vote  passes if at least three 
>> binding +1 votes are cast.
>> 
>> [ ] +1 Release the packages as Apache Nutch 1.0
>> [ ] -1 Do not release the packages because...
>> 
>> Here's my +1
>> 
>> 
>> Thanks!
>> 
>> 
>> [1] http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc1/CHANGES.txt?view=log&pathrev=752004
>> 
>> --
>> Sami Siren
>> 
> 
> 
> 



Re: [VOTE] Release Apache Nutch 1.0

Posted by Chris Hostetter <ho...@fucit.org>.
For the record: i have not had any time to review the release candidate,
my comments below are only based on this thread...

: > I see your point about the fat artifact but I am not totally convinced that
: > users (as in end users) would prefer the idea of fetching the development
: > tools and compiling the software before they use it, at least I am not doing
: > that with the software I use.
: 
: Most end users are happy with just the binaries. But pure source
: releases are really useful for example for people that maintain custom
: modifications as patches against the official source releases (think
: of Linux distributions with system-specific changes, companies with
: proprietary extensions, etc.). I'm not sure if Nutch yet has such
: users.

in terms of release type (source vs binary vs combined) it's important to
keep the audience in mind ... a pure source release probably isn't
suitable for the nutch user base, since its primary purpose is to be a
standalone application people can run without any knowledge of java. (Solr
is in a similar boat).  A pure binary release (AFAIK) isn't permitted in
the ASF.  A combined release tends to satisfy everyone.

it may result in a download that seems bloated, but anecdotally people
seem to be more bothered by a download that's lacking something they think
it should have than by a release that contains more than they think it
should have.

A "full" release package with combined source and binaries (and docs) also
makes it easier for people to "try out" software easily, and peruse the
docs, and peruse the source code, etc...

I suspect most people maintaining packages of official releases aren't
particularly bothered by releases that are combined -- they can always
ignore the compiled code and only deal with the source, in many cases it's
as simple as adding "ant clean" to the first line of their patch-and-build
script.  (but i'm certainly not an expert on how package managers think)

: That would be nice. Note that even the users who just want the
: binaries benefit from such a division as also their downloads will be
: faster.

Ehhh ... i'm not sure that angle (src making binary releases larger) is
ever really an issue ... if i remember right i think the size of
the compressed source code for solr was only about 10% of the final size
of the last full release.

: PS. I know there's a long tradition of doing releases the way you
: prepared Nutch 1.0, and I'm not claiming that it's necessarily the
: wrong way of doing things. My -1 was due to the JAI libraries, not due
: to the structure of the release. However, as described above, I
: personally much prefer the clear distinction between source releases
: and binaries.

While the JAI and LICENSE.txt issues definitely seem like they need to be
worked out before a release, i wouldn't suggest radically revamping the
packaging strategy just prior to a release ... even if the consensus is
that changing the release structure/size is a good idea in the long run.
probably better to do it just after the release, so there is a long period
of developer builds to work out any glitches.


-Hoss


Re: [VOTE] Release Apache Nutch 1.0

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Mar 19, 2009 at 2:15 PM, Sami Siren <ss...@gmail.com> wrote:
> Jukka Zitting wrote:
>> -1 The release contains the Java Advanced Imaging libraries
>> (jai_core.jar and jai_codec.jar) which are licensed under Sun's Binary
>> Code License. We can't redistribute those libraries.
>
> ok, we need to address that somehow.

See https://issues.apache.org/jira/browse/NUTCH-724 for some suggestions.

>> * Why does the release package contain pre-built documentation and
>> binaries? Downloading the 90MB package takes much longer than checking
>> out and building the 40MB tag from svn.
>> IMHO it would be a service to users to make the release contain just the
>> svn export with instruction on how to build the rest.
>
> I see your point about the fat artifact but I am not totally convinced that
> users (as in end users) would prefer the idea of fetching the development
> tools and compiling the software before they use it, at least I am not doing
> that with the software I use.

Most end users are happy with just the binaries. But pure source
releases are really useful for example for people that maintain custom
modifications as patches against the official source releases (think
of Linux distributions with system-specific changes, companies with
proprietary extensions, etc.). I'm not sure if Nutch yet has such
users.

> I will discuss this with rest of the devs and see what we can do here. One
> solution could be to split the release in two parts binary only and source

That would be nice. Note that even the users who just want the
binaries benefit from such a division as also their downloads will be
faster.

>> More notably: how am I to verify that the
>> release came from the sources in our svn when it contains stuff that
>> doesn't exist in the svn?
>
> May be that I don't understand what you're trying to say here but isn't that
> always the case with binary releases (the difficulty to verify that the
> binary is build from certain tag from svn)?

Exactly. That's why it's so important to have a source-only release
that preferably matches one-to-one to the contents of the respective
svn tag. That should be the official release package that the PMC
reviews and approves.
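
As a rough illustration of that one-to-one check, here is a minimal Python
sketch that compares an unpacked source release against an svn export of
the tag. The function names and the two directory arguments are
hypothetical, and it assumes both trees are already on local disk; it is
not part of any Nutch or ASF tooling.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(release_dir, export_dir):
    """Return (only_in_release, only_in_export, differing) as sorted lists.

    release_dir: unpacked release package (hypothetical local path)
    export_dir:  svn export of the release tag (hypothetical local path)
    """
    release = tree_digests(release_dir)
    export = tree_digests(export_dir)
    only_in_release = sorted(set(release) - set(export))
    only_in_export = sorted(set(export) - set(release))
    differing = sorted(
        name for name in set(release) & set(export) if release[name] != export[name]
    )
    return only_in_release, only_in_export, differing
```

Three empty lists would mean the package and the tag match file for file;
pre-built jars or generated docs would show up in the first list.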

There is no reasonable way to accurately review binaries, so while we
may (and should) test that they work as expected, ultimately we just
need to trust the release manager when he or she says that the
binaries are the result of building the source release. Thus we should
treat binaries as secondary release artifacts that the release manager
is providing as a convenience for users.

PS. I know there's a long tradition of doing releases the way you
prepared Nutch 1.0, and I'm not claiming that it's necessarily the
wrong way of doing things. My -1 was due to the JAI libraries, not due
to the structure of the release. However, as described above, I
personally much prefer the clear distinction between source releases
and binaries.

BR,

Jukka Zitting

Re: [VOTE] Release Apache Nutch 1.0

Posted by Sami Siren <ss...@gmail.com>.
thanks Jukka,

Jukka Zitting wrote:
> Hi,
> 
> On Thu, Mar 19, 2009 at 10:32 AM, Sami Siren <ss...@gmail.com> wrote:
>> We (as a Nutch community) would really appreciate if somebody from the PMC
>> had the time to check it out.
> 
> -1 The release contains the Java Advanced Imaging libraries
> (jai_core.jar and jai_codec.jar) which are licensed under Sun's Binary
> Code License. We can't redistribute those libraries.

ok, we need to address that somehow.

> Other comments based on a quick look:
> 
> * The LICENSE.txt file should have at least references to the licenses
> of the bundled libraries.
> 
> * The NOTICE.txt file should start with the following lines:
> 
>           Apache Nutch
>           Copyright 2009 The Apache Software Foundation
> 
> * The NOTICE.txt file should contain the required copyright notices
> from all bundled libraries.
> 
> * The README.txt should start with "Apache Nutch" instead of "Nutch"
> 
> * Why does the release package contain pre-built documentation and
> binaries? Downloading the 90MB package takes much longer than checking
> out and building the 40MB tag from svn.
> IMHO it would be a service to users to make the release contain just the svn export with instruction
> on how to build the rest. 

I see your point about the fat artifact but I am not totally convinced 
that users (as in end users) would prefer the idea of fetching the 
development tools and compiling the software before they use it, at 
least I am not doing that with the software I use.

I will discuss this with the rest of the devs and see what we can do here.
One solution could be to split the release into two parts, binary-only and
source, as you propose below (they would both be about the same size, since
our build process currently copies jars around; I think that's mostly the
reason for the gigantic size).

> We can also still provide pre-built binaries
> as separate downloads. 
> More notably: how am I to verify that the
> release came from the sources in our svn when it contains stuff that
> doesn't exist in the svn?

Maybe I don't understand what you're trying to say here, but isn't that
always the case with binary releases (the difficulty of verifying that
the binary is built from a certain tag in svn)?

--
  Sami Siren

Re: [VOTE] Release Apache Nutch 1.0

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Mar 19, 2009 at 10:32 AM, Sami Siren <ss...@gmail.com> wrote:
> We (as a Nutch community) would really appreciate if somebody from the PMC
> had the time to check it out.

-1 The release contains the Java Advanced Imaging libraries
(jai_core.jar and jai_codec.jar) which are licensed under Sun's Binary
Code License. We can't redistribute those libraries.

Other comments based on a quick look:

* The LICENSE.txt file should have at least references to the licenses
of the bundled libraries.

* The NOTICE.txt file should start with the following lines:

          Apache Nutch
          Copyright 2009 The Apache Software Foundation

* The NOTICE.txt file should contain the required copyright notices
from all bundled libraries.

* The README.txt should start with "Apache Nutch" instead of "Nutch"

* Why does the release package contain pre-built documentation and
binaries? Downloading the 90MB package takes much longer than checking
out and building the 40MB tag from svn. IMHO it would be a service to
users to make the release contain just the svn export with instruction
on how to build the rest. We can also still provide pre-built binaries
as separate downloads. More notably: how am I to verify that the
release came from the sources in our svn when it contains stuff that
doesn't exist in the svn?
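
The NOTICE.txt header item in the list above is easy to check
mechanically. The following is only a sketch: the function name is made
up, and ignoring blank lines and surrounding whitespace is an assumption
of this example rather than a stated ASF rule.

```python
EXPECTED_HEADER = [
    "Apache Nutch",
    "Copyright 2009 The Apache Software Foundation",
]


def notice_header_ok(notice_text, expected=EXPECTED_HEADER):
    """True if the first non-blank lines of the NOTICE text equal `expected`.

    Assumption: leading blank lines and per-line indentation are ignored.
    """
    lines = [line.strip() for line in notice_text.splitlines() if line.strip()]
    return lines[: len(expected)] == list(expected)
```

Calling it with the contents of NOTICE.txt read from the unpacked
release candidate returns False when either header line is missing or
worded differently.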

BR,

Jukka Zitting

Re: [VOTE] Release Apache Nutch 1.0

Posted by Sami Siren <ss...@gmail.com>.
Fellow PMC members,

As you might already know, we posted a release candidate for Nutch 1.0
some time ago. However, we have so far received only two +1 votes from
Lucene PMC members, and one more is required before we can actually
finalize the release.

The current state of the vote thread can be seen at:
http://www.lucidimagination.com/search/document/33b2a26db25db492/vote_release_apache_nutch_1_0

We (as a Nutch community) would really appreciate it if somebody from the
PMC had the time to check it out.

Thanks for your time,

  Sami Siren



Sami Siren wrote:
> We're lacking one +1, could someone please take a look?
> 
> Thanks,
> 
> Sami Siren
> 
> 
> 
> Sami Siren wrote:
>> Hello,
>>
>> I have packaged the second release candidate for Apache Nutch 1.0 
>> release at
>>
>> http://people.apache.org/~siren/nutch-1.0/rc1/
>>
>> See the CHANGES.txt[1] file for details on release contents and latest 
>> changes. The release was made from tag: 
>> http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc1/?pathrev=752004 
>>
>>
>> Please vote on releasing this package as Apache Nutch 1.0. The vote is 
>> open for the next 72 hours. Only votes from Lucene PMC members are 
>> binding, but everyone is welcome to check the release candidate and 
>> voice their approval or disapproval. The vote  passes if at least 
>> three binding +1 votes are cast.
>>
>> [ ] +1 Release the packages as Apache Nutch 1.0
>> [ ] -1 Do not release the packages because...
>>
>> Here's my +1
>>
>>
>> Thanks!
>>
>>
>> [1] http://svn.apache.org/viewvc/lucene/nutch/tags/release-1.0-rc1/CHANGES.txt?view=log&pathrev=752004
>>
>> --
>> Sami Siren
>>
> 

