Posted to user@giraph.apache.org by Avery Ching <ac...@apache.org> on 2013/04/13 07:39:56 UTC

[VOTE][CHANGED] Release Giraph 1.0 (rc1)

Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1 
that addresses the following issues.

* Got rid of .git repo in tarball
* Fixed issue with not compiling without git repo (GIRAPH-628)
* Used gnutar in OSX rather than tar to generate the tarball and get rid 
of warnings
* Pushed GIRAPH-627 to support the yarn profile better
* Tarball name changed to the final artifact name (giraph-1.0.tar.gz)

Release notes:
http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html

Release artifacts:
http://people.apache.org/~aching/giraph-1.0-RC1/

Corresponding git tag:
https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1

Signing keys:
http://people.apache.org/keys/group/giraph.asc

The vote runs for 72 hours, until Monday 11pm PST.

Thanks,

Avery

Original message below regarding rc0:

-------------------------------

Fellow Giraphers,

We have our first release candidate since graduating from incubation.
This is a source release, primarily due to the different versions of
Hadoop we support with munge (similar to the 0.1 release).  Since 0.1, 
we've made A TON of progress on overall performance, optimizing memory 
use, split vertex/edge inputs, easy interoperability with Apache Hive, 
and a bunch of other areas.  In many ways, this is an almost totally 
different codebase.  Thanks everyone for your hard work!

Apache Giraph has been running in production at Facebook (against 
Facebook's Corona implementation of Hadoop - 
https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) 
since around last December.  It has proven to be very scalable, 
performant, and enables a bunch of new applications.  Based on the 
drastic improvements and the use of Giraph in production, it seems 
appropriate to bump up our version to 1.0.

While anyone can vote, the ASF requires majority approval from the PMC
-- i.e., at least three PMC members must vote affirmatively for release,
and there must be more positive than negative votes. Releases may not be
vetoed. Before voting +1, PMC members are required to download the signed
source code package, compile it as provided, and test the resulting
executable on their own platform, and also to verify that the
package meets the requirements of the ASF policy on releases.
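
As a rough illustration of the tally rule in the paragraph above, here is a
minimal sketch. It assumes both counts contain binding PMC votes only, and the
function name release_vote_passes is hypothetical, not part of any ASF tooling.

```python
def release_vote_passes(plus_ones: int, minus_ones: int) -> bool:
    """Return True if a release vote passes under the rule quoted above.

    Assumption: both arguments count binding PMC votes only.
    """
    # At least three affirmative votes, and more positive than negative.
    return plus_ones >= 3 and plus_ones > minus_ones


if __name__ == "__main__":
    print(release_vote_passes(3, 0))  # three +1, no -1
    print(release_vote_passes(2, 0))  # too few +1 votes
    print(release_vote_passes(3, 3))  # positives do not outnumber negatives
```

Because releases may not be vetoed, a single -1 does not block the vote in
this sketch; it only counts against the majority.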

Please test this against many other Hadoop versions and let us know how 
this goes!

Release notes:
http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html

Release artifacts:
http://people.apache.org/~aching/giraph-1.0-RC0/

Corresponding git tag:
https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0

Signing keys:
http://people.apache.org/keys/group/giraph.asc

The vote runs for 72 hours, until Monday 4pm PST.

Thanks everyone for your patience with this release!

Avery

Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Roman Shaposhnik <rv...@apache.org>.
On Sat, Apr 13, 2013 at 6:46 AM, Avery Ching <ac...@apache.org> wrote:
> That's great Sebastian.  I would also recommend taking a look at the
> PageRankBenchmark for a performance comparison.  It has had a lot of speed
> improvements and should be a bunch faster than PageRankVertex.  Even that,
> though, is not totally optimized.  Hopefully we'll be adding a "how to
> optimize performance" guide in the near future.  Should we delay the release
> or simply ship a 1.1, say in the next month, with this fix and
> supporting YARN's 2.0.4?  I'd like to get on a more normal release cycle
> rather than once a year =).

Excellent point. The release process that I have seen work very well in
the past is that you get on a regular date-driven release schedule (let's
say quarterly) but you still allow for patch releases with very limited
scope from time to time. Patch releases are NOT date driven -- they
go out when all the patches required are committed to trunk and back
ported to the appropriate branch.

So, would the following work for us: we go ahead with 1.0 release as planned
(unless showstoppers get identified in the next 72 hours). But we all agree
to 1.0.1 with scope strictly limited to YARN stabilization on top of
Hadoop 2.0.4.
If we get all that we need for 1.0.1 in a week, we'll release in a week; if
it takes us 3 weeks, we'll release in 3 weeks, and so on.


Thanks,
Roman.

Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Claudio Martella <cl...@gmail.com>.
In general, my understanding of an RC is that we should not add new features
or improvements. I agree that we cannot fix every open bug, but the least
we can do is get in the issues that already have a working patch,
particularly given that we're releasing a 1.0.


On Sun, Apr 14, 2013 at 6:18 PM, Avery Ching <ac...@apache.org> wrote:

> Hi Sebastian,
>
> Thanks for the patch.  I'll try to take a look at it.
>
> The only reason I bring the optimizations up is that a lot of folks tend
> to compare PageRank performance.  The optimizations I'm referring to are
> Giraph ones, not algorithmic ones.  We use ints and floats for ids and
> messages, respectively, instead of longs and doubles (1/2 network traffic),
> and IntNullArrayEdges vertex edges (efficient array backed edges) instead of
> ByteArrayEdges.  You can see
> https://issues.apache.org/jira/browse/giraph-543 for more details.
>
> Anyway, given that we are going to ship a 1.0.1 release in a few weeks for
> a variety of reasons, should this really hold up the current release?  I
> would prefer to not cut any more RCs unless things are totally broken (e.g.
> profiles not compiling, major Giraph bugs, etc.).  There are still a lot of
> outstanding issues in JIRA; we can't fix them all for the 1.0 release.
>
> Let me know what you think.
>
> Avery
>
>
> On 4/13/13 10:46 AM, Sebastian Schelter wrote:
>
>> Hi Avery,
>>
>> I found the bug and I can provide a patch today or tomorrow, so
>> hopefully we can include that in the release (to not knowingly ship
>> bugged code). Furthermore I improved the code to protect against
>> rounding errors.
>>
>> I don't really get what you mean by the missing optimization in
>> comparison to the benchmark PageRank implementation.
>>
>> The implementation in o.a.g.examples.PageRankVertex aims to be a robust
>> real-world implementation. As an optimization, it dismisses edge weights
>> and reuses objects where possible. Furthermore it is able to handle
>> dangling vertices that are present in almost every real-world network
>> and it automatically detects the number of supersteps to run. With the
>> patch, it should also provide improved numerical stability.
>>
>> If the runtimes don't look good enough when compared to the benchmark
>> implementation, this might also be caused by the dataset which has a
>> skewed degree distribution (like most real-world networks). The
>> benchmark uses a uniform degree distribution AFAIK.
>>
>> Best,
>> Sebastian
>>
>> On 13.04.2013 15:46, Avery Ching wrote:
>>
>>> That's great Sebastian.  I would also recommend taking a look at the
>>> PageRankBenchmark for a performance comparison.  It has had a lot of
>>> speed improvements and should be a bunch faster than PageRankVertex.
>>> Even that, though, is not totally optimized.  Hopefully we'll be adding a
>>> "how to optimize performance" guide in the near future.  Should we delay
>>> the release or simply ship a 1.1, say in the next month, with this
>>> fix and supporting YARN's 2.0.4?  I'd like to get on a more normal
>>> release cycle rather than once a year =).
>>>
>>> Avery
>>>
>>> On 4/13/13 3:02 AM, Sebastian Schelter wrote:
>>>
>>>> Hi there,
>>>>
>>>> I've got some good and bad news. I tested PageRankVertex (not the Benchmark
>>>> but the example implementation o.a.g.examples.PageRankVertex) from trunk
>>>> compiled for Hadoop 1.0 on a cluster of 26 machines with 208 cores.
>>>>
>>>> I used the Webbase2001 dataset [1] which has 115M vertices and more than
>>>> 1B edges and got some awesome running times: the average superstep takes 15
>>>> seconds (!!!). Awesome work, I have to say!
>>>>
>>>> Unfortunately, there seems to be an issue with the convergence
>>>> detection, as it didn't get the correct convergence behavior. I'd like
>>>> to have a look into that this week, so we can ship a performant PageRank
>>>> implementation which automatically runs an appropriate number of
>>>> supersteps. Hope this doesn't delay the release too much.
>>>>
>>>> Best,
>>>> Sebastian
>>>>
>>>>
>>>> [1] http://law.di.unimi.it/webdata/webbase-2001/
>


-- 
   Claudio Martella
   claudio.martella@gmail.com

Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Claudio Martella <cl...@gmail.com>.
I don't understand Gianmarco's argument. Do you claim that people use
Giraph only with more vertices than Integer.MAX_VALUE?


On Mon, Apr 15, 2013 at 12:28 AM, Avery Ching <ac...@apache.org> wrote:

> I generally agree and understand that this is typically true, but
> many other benchmarks are doing this to show off performance.  Also, if you
> have the FB graph of a billion users, it could theoretically fit into a
> 32-bit integer.
>
> Avery
>
>
> On 4/14/13 2:41 PM, Gianmarco De Francisci Morales wrote:
>
>> Hi,
>>
>> only one quick comment on optimizations and using ints as ids.
>> In my opinion, if you can use an int as an id for your dataset, probably
>> you don't need Giraph for your problem.
>> Just my 2c
>>
>> Cheers,
>>
>> --
>> Gianmarco
>>
>>
>> On Sun, Apr 14, 2013 at 11:26 PM, Sebastian Schelter <ss...@apache.org>
>> wrote:
>>
>>> Thank you, Avery, wish I had found the bug earlier.
>>> On 14.04.2013 23:25, "Avery Ching" <ac...@apache.org> wrote:
>>>
>>>> Thanks for your input Sebastian.  Given the choice between removing
>>>> PageRankVertex and adding the fix, I've added your fix and will cut RC2 a
>>>> bit later today.  I really hope this is the last RC.
>>>>
>>>> Avery
>>>>
>>>> On 4/14/13 9:34 AM, Sebastian Schelter wrote:
>>>>
>>>>> Hi Avery,
>>>>>
>>>>> I see your concerns. The benchmarking question is difficult; we had very
>>>>> bad experiences with Mahout in that regard. E.g., we once had an
>>>>> M/R-based PageRank implementation in Mahout that used our integer-based
>>>>> vectors and removed it as we got public complaints that you can't fit
>>>>> the whole web into the range of an integer. Personally, I'd also refrain
>>>>> from using floats instead of doubles for benchmarks, as this simply
>>>>> means you give up on accuracy.
>>>>>
>>>>> Regarding benchmarks, I guess the best thing we could do is publish our
>>>>> own numbers. The current runtimes I've seen are already very good:
>>>>> Giraph beat a very optimized Stratosphere implementation that we did for
>>>>> a recent paper by approx. 25%.
>>>>>
>>>>> To conclude, I do in no way want to hold up the current release. I'm
>>>>> perfectly fine with not including the patch and optimizing the
>>>>> implementation for a 1.0.1 release, but then we should remove the
>>>>> current examples.PageRankVertex from the 1.0 release, as the convergence
>>>>> detection is broken and we should not knowingly ship bugged code.
>>>>>
>>>>> Best,
>>>>> Sebastian
>


-- 
   Claudio Martella
   claudio.martella@gmail.com

Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Avery Ching <ac...@apache.org>.
I generally agree and understand that this is typically true, but
many other benchmarks are doing this to show off performance.  Also, if
you have the FB graph of a billion users, it could theoretically fit
into a 32-bit integer.
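
To make the arithmetic behind this point concrete, here is a minimal sketch. It
assumes vertex ids are assigned densely starting at 0 and that a message is just
an id plus a value with no serialization overhead; the helper names
fits_in_int32 and payload_bytes are hypothetical and say nothing about how
Giraph actually encodes data on the wire.

```python
import struct

# Largest signed 32-bit value, the same number as Java's Integer.MAX_VALUE.
INT32_MAX = 2**31 - 1


def fits_in_int32(vertex_count: int) -> bool:
    """True if ids 0..vertex_count-1 all fit in a signed 32-bit integer."""
    return vertex_count - 1 <= INT32_MAX


def payload_bytes(id_format: str, value_format: str) -> int:
    """Size in bytes of one (id, value) pair using standard struct sizes.

    Format codes: "i" = 32-bit int, "q" = 64-bit long,
    "f" = 32-bit float, "d" = 64-bit double.
    """
    return struct.calcsize(">" + id_format + value_format)


if __name__ == "__main__":
    print(fits_in_int32(1_000_000_000))  # a graph with a billion vertices
    print(payload_bytes("i", "f"))       # int id + float value
    print(payload_bytes("q", "d"))       # long id + double value
```

Under these assumptions an int/float pair takes 8 bytes and a long/double pair
takes 16, which is where the "1/2 network traffic" figure earlier in the
thread would come from.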

Avery

On 4/14/13 2:41 PM, Gianmarco De Francisci Morales wrote:
> Hi,
>
> only one quick comment on optimizations and using ints as ids.
> In my opinion, if you can use an int as an id for your dataset, probably
> you don't need Giraph for your problem.
> Just my 2c
>
> Cheers,
>
> --
> Gianmarco
>
>
> On Sun, Apr 14, 2013 at 11:26 PM, Sebastian Schelter <ss...@apache.org> wrote:
>
>> Thank you, Avery, wish I had found the bug earlier.
>> Am 14.04.2013 23:25 schrieb "Avery Ching" <ac...@apache.org>:
>>
>>> Thanks for your input Sebastian.  Given the choice to removing
>>> PageRankVertex or adding the fix, I've added your fix and will cut RC2 a
>>> bit later today.  I really hope this is the last RC.
>>>
>>> Avery
>>>
>>> On 4/14/13 9:34 AM, Sebastian Schelter wrote:
>>>
>>>> Hi Avery,
>>>>
>>>> I see your concerns. The benchmarking question is difficult, we had very
>>>> bad experiences with Mahout in that regards. E.g., we once had a
>>>> M/R-based PageRank implementation in Mahout that uses our integer-based
>>>> vectors and removed it as we got public complaints that you can't fit
>>>> the whole web into the range of an integer. Personally, I'd also refrain
>>>> from using floats instead of doubles for benchmarks, as this simply
>>>> means you give up on accuracy.
>>>>
>>>> Regarding benchmarks, I guess the best thing we could do is publish our
>>>> own numbers. The current runtimes I've seen are already very good,
>>>> Giraph beat a very optimized Stratosphere implementation that we did for
>>>> a recent paper by approx. 25%.
>>>>
>>>> To conclude, I do in no way want to hold up the current release. I'm
>>>> perfectly fine with not including the patch and optimizing the
>>>> implementation for a 1.0.1 release, but then we should remove the
>>>> current examples.PageRankVertex from the 1.0 release, as the convergence
>>>> detection is broken and we should not knowingly ship bugged code.
>>>>
>>>> Best,
>>>> Sebastian
>>>>
>>>>
>>>> On 14.04.2013 18:18, Avery Ching wrote:
>>>>
>>>>> Hi Sebastian,
>>>>>
>>>>> Thanks for the patch.  I'll try to take a look at it.
>>>>>
>>>>> The only reason I bring the optimizations up is that a lot of folks
>> tend
>>>>> to compare PageRank performance.  The optimizations I'm referring to
>> are
>>>>> Giraph ones, not algorithmic ones.  We use ints, floats for ids,
>>>>> messages, respectively instead longs, doubles (1/2 network traffic) and
>>>>> IntNullArrayEdges vertex edges (efficient array backed edges) instead
>> of
>>>>> ByteArrayEdges.  You can see
>>>>> https://issues.apache.org/**jira/browse/giraph-543<
>> https://issues.apache.org/jira/browse/giraph-543>for more details.
>>>>> Anyway, given that we are going to ship a 1.0.1 release in a few weeks
>>>>> for a variety of reasons, should this really hold up the current
>>>>> release?  I would prefer to not cut anymore RCs unless things are
>>>>> totally broken (i.e. profiles not compiling, major Giraph bugs, etc.).
>>>>> There are still a lot of outstanding issues in JIRA, we can't fix them
>>>>> all for the 1.0 release.
>>>>>
>>>>> Let me know what you think.
>>>>>
>>>>> Avery
>>>>>
>>>>> On 4/13/13 10:46 AM, Sebastian Schelter wrote:
>>>>>
>>>>>> Hi Avery,
>>>>>>
>>>>>> I found the bug and can I provide a patch today or tomorrow, so
>>>>>> hopefully we can include that in the release (to not knowingly ship
>>>>>> bugged code). Furthermore I improved the code to protect against
>>>>>> rounding errors.
>>>>>>
>>>>>> I don't really get what you mean with the missing optimization in
>>>>>> comparison to the benchmark PageRank implementation.
>>>>>>
>>>>>> The implementation in o.a.g.examples.PageRankVertex aims to be a
>> robust
>>>>>> real-world implementation. As optimization, it dismisses edge weights
>>>>>> and reuses objects where possible. Furthermore it is able to handle
>>>>>> dangling vertices that are present in almost every real-world network
>>>>>> and it automatically detects the number of supersteps to run. With the
>>>>>> patch, it should also provide improved numerical stability.
>>>>>>
>>>>>> If the runtimes doesn't look good enough when compared to the
>> benchmark
>>>>>> implementation, this might also be caused by the dataset which has a
>>>>>> skewed degree distribution (like most real-world networks). The
>>>>>> benchmark uses a uniform degree distribution AFAIK.
>>>>>>
>>>>>> Best,
>>>>>> Sebastian
>>>>>>


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Gianmarco De Francisci Morales <gd...@apache.org>.
Hi,

Just one quick comment on the optimizations and on using ints as ids.
In my opinion, if you can use an int as an id for your dataset, you
probably don't need Giraph for your problem.
Just my 2c.

Cheers,

--
Gianmarco
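
As a rough illustration of the id and message width trade-off raised in this
thread (ints and floats versus longs and doubles), the sketch below packs one
message both ways and checks which vertex counts fit in a signed 32-bit id.
The byte layouts, the helper names, and the 5-billion-vertex figure are
assumptions for illustration only; they are not Giraph's actual serialization.

```python
import struct

# Hypothetical wire layouts for one PageRank message: vertex id + rank value.
# Big-endian with no padding; Giraph's real serialization may differ.
COMPACT = struct.Struct(">if")  # 32-bit int id, 32-bit float value
WIDE = struct.Struct(">qd")     # 64-bit long id, 64-bit double value

INT_MAX = 2**31 - 1             # largest value of a signed 32-bit int


def fits_in_int(num_vertices: int) -> bool:
    """True if ids 0..num_vertices-1 all fit in a signed 32-bit int."""
    return num_vertices - 1 <= INT_MAX


def as_float32(value: float) -> float:
    """Round-trip a value through a 32-bit float to expose the precision loss."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


print(COMPACT.size, WIDE.size)     # 8 16
print(fits_in_int(115_000_000))    # True  (vertex count quoted for Webbase2001)
print(fits_in_int(5_000_000_000))  # False (hypothetical larger graph)
print(as_float32(0.1) == 0.1)      # False
```

Under these assumptions the compact layout is half the size of the wide one,
the 115M-vertex graph mentioned in the thread fits in int ids, and a graph of
several billion vertices would not. The last line shows the accuracy cost of
floats that was also raised in the discussion.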


On Sun, Apr 14, 2013 at 11:26 PM, Sebastian Schelter <ss...@apache.org> wrote:

> Thank you, Avery, wish I had found the bug earlier.
> On 14.04.2013 23:25, "Avery Ching" <ac...@apache.org> wrote:
>
> > Thanks for your input, Sebastian.  Given the choice between removing
> > PageRankVertex and adding the fix, I've added your fix and will cut RC2 a
> > bit later today.  I really hope this is the last RC.
> >
> > Avery
> >
> > On 4/14/13 9:34 AM, Sebastian Schelter wrote:
> >
> >> Hi Avery,
> >>
> >> I see your concerns. The benchmarking question is difficult; we had very
> >> bad experiences with Mahout in that regard. E.g., we once had an
> >> M/R-based PageRank implementation in Mahout that used our integer-based
> >> vectors, and we removed it after we got public complaints that you can't
> >> fit the whole web into the range of an integer. Personally, I'd also
> >> refrain from using floats instead of doubles for benchmarks, as this
> >> simply means you give up on accuracy.
> >>
> >> Regarding benchmarks, I guess the best thing we could do is publish our
> >> own numbers. The current runtimes I've seen are already very good:
> >> Giraph beat a very optimized Stratosphere implementation that we did for
> >> a recent paper by approx. 25%.
> >>
> >> To conclude, I in no way want to hold up the current release. I'm
> >> perfectly fine with not including the patch and optimizing the
> >> implementation for a 1.0.1 release, but then we should remove the
> >> current examples.PageRankVertex from the 1.0 release, as the convergence
> >> detection is broken and we should not knowingly ship buggy code.
> >>
> >> Best,
> >> Sebastian
> >>
> >>
> >> On 14.04.2013 18:18, Avery Ching wrote:
> >>
> >>> Hi Sebastian,
> >>>
> >>> Thanks for the patch.  I'll try to take a look at it.
> >>>
> >>> The only reason I bring the optimizations up is that a lot of folks tend
> >>> to compare PageRank performance.  The optimizations I'm referring to are
> >>> Giraph ones, not algorithmic ones.  We use ints and floats for ids and
> >>> messages, respectively, instead of longs and doubles (1/2 the network
> >>> traffic), and IntNullArrayEdges vertex edges (efficient array-backed
> >>> edges) instead of ByteArrayEdges.  You can see
> >>> https://issues.apache.org/jira/browse/giraph-543 for more details.
> >>>
> >>> Anyway, given that we are going to ship a 1.0.1 release in a few weeks
> >>> for a variety of reasons, should this really hold up the current
> >>> release?  I would prefer not to cut any more RCs unless things are
> >>> totally broken (e.g., profiles not compiling or major Giraph bugs).
> >>> There are still a lot of outstanding issues in JIRA; we can't fix them
> >>> all for the 1.0 release.
> >>>
> >>> Let me know what you think.
> >>>
> >>> Avery
> >>>

Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Sebastian Schelter <ss...@apache.org>.
Thank you, Avery, wish I had found the bug earlier.
On 14.04.2013 23:25, "Avery Ching" <ac...@apache.org> wrote:

> Thanks for your input, Sebastian.  Given the choice between removing
> PageRankVertex and adding the fix, I've added your fix and will cut RC2 a
> bit later today.  I really hope this is the last RC.
>
> Avery

Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Avery Ching <ac...@apache.org>.
Thanks for your input, Sebastian.  Given the choice between removing
PageRankVertex and adding the fix, I've added your fix and will cut RC2 a
bit later today.  I really hope this is the last RC.

Avery
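
The fix itself is not shown in this thread. As a generic, hedged sketch of
what convergence detection with dangling-vertex handling can look like in
PageRank, the following runs power iteration until the L1 change between
successive rank vectors drops below a tolerance. It is plain single-machine
Python, not Giraph's vertex API; the function name, damping factor,
tolerance, and superstep cap are all assumptions.

```python
def pagerank(out_edges, damping=0.85, tol=1e-6, max_supersteps=100):
    """Power-iteration PageRank (illustrative sketch, not Giraph code).

    out_edges maps every vertex to the list of vertices it links to;
    a vertex with an empty list is a dangling vertex.
    Returns (ranks, number_of_supersteps_run).
    """
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    superstep = 0
    for superstep in range(1, max_supersteps + 1):
        # Rank held by dangling vertices is spread evenly over all vertices.
        dangling = sum(rank[v] for v, targets in out_edges.items() if not targets)
        incoming = {v: 0.0 for v in out_edges}
        for v, targets in out_edges.items():
            for t in targets:
                incoming[t] += rank[v] / len(targets)
        new_rank = {
            v: (1.0 - damping) / n + damping * (incoming[v] + dangling / n)
            for v in out_edges
        }
        # The L1 distance between successive rank vectors decides when to stop.
        delta = sum(abs(new_rank[v] - rank[v]) for v in out_edges)
        rank = new_rank
        if delta < tol:
            break
    return rank, superstep
```

On the three-vertex chain a -> b -> c, where c has no out-edges, calling
`pagerank({"a": ["b"], "b": ["c"], "c": []})` stops well before the
superstep cap and returns ranks that sum to 1 up to rounding.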

On 4/14/13 9:34 AM, Sebastian Schelter wrote:
> Hi Avery,
>
> I see your concerns. The benchmarking question is difficult; we had very
> bad experiences with Mahout in that regard. E.g., we once had an
> M/R-based PageRank implementation in Mahout that used our integer-based
> vectors, and we removed it after we got public complaints that you can't
> fit the whole web into the range of an integer. Personally, I'd also
> refrain from using floats instead of doubles for benchmarks, as this
> simply means you give up on accuracy.
>
> Regarding benchmarks, I guess the best thing we could do is publish our
> own numbers. The current runtimes I've seen are already very good:
> Giraph beat a very optimized Stratosphere implementation that we did for
> a recent paper by approx. 25%.
>
> To conclude, I in no way want to hold up the current release. I'm
> perfectly fine with not including the patch and optimizing the
> implementation for a 1.0.1 release, but then we should remove the
> current examples.PageRankVertex from the 1.0 release, as the convergence
> detection is broken and we should not knowingly ship buggy code.
>
> Best,
> Sebastian
>
>
> On 14.04.2013 18:18, Avery Ching wrote:
>> Hi Sebastian,
>>
>> Thanks for the patch.  I'll try to take a look at it.
>>
>> The only reason I bring the optimizations up is that a lot of folks tend
>> to compare PageRank performance.  The optimizations I'm referring to are
>> Giraph ones, not algorithmic ones.  We use ints and floats for ids and
>> messages, respectively, instead of longs and doubles (1/2 the network
>> traffic), and IntNullArrayEdges vertex edges (efficient array-backed
>> edges) instead of ByteArrayEdges.  You can see
>> https://issues.apache.org/jira/browse/giraph-543 for more details.
>>
>> Anyway, given that we are going to ship a 1.0.1 release in a few weeks
>> for a variety of reasons, should this really hold up the current
>> release?  I would prefer not to cut any more RCs unless things are
>> totally broken (e.g., profiles not compiling or major Giraph bugs).
>> There are still a lot of outstanding issues in JIRA; we can't fix them
>> all for the 1.0 release.
>>
>> Let me know what you think.
>>
>> Avery
>>
>> On 4/13/13 10:46 AM, Sebastian Schelter wrote:
>>> Hi Avery,
>>>
>>> I found the bug and I can provide a patch today or tomorrow, so
>>> hopefully we can include that in the release (to not knowingly ship
>>> buggy code). Furthermore, I improved the code to protect against
>>> rounding errors.
>>>
>>> I don't really get what you mean by the missing optimization in
>>> comparison to the benchmark PageRank implementation.
>>>
>>> The implementation in o.a.g.examples.PageRankVertex aims to be a robust
>>> real-world implementation. As an optimization, it disregards edge weights
>>> and reuses objects where possible. Furthermore, it is able to handle
>>> dangling vertices, which are present in almost every real-world network,
>>> and it automatically detects the number of supersteps to run. With the
>>> patch, it should also provide improved numerical stability.
>>>
>>> If the runtimes don't look good enough when compared to the benchmark
>>> implementation, this might also be caused by the dataset, which has a
>>> skewed degree distribution (like most real-world networks). The
>>> benchmark uses a uniform degree distribution AFAIK.
>>>
>>> Best,
>>> Sebastian
>>>
>>> On 13.04.2013 15:46, Avery Ching wrote:
>>>> That's great, Sebastian.  I would also recommend taking a look at the
>>>> PageRankBenchmark for a performance comparison.  It has had a lot of
>>>> speed improvements and should be a good deal faster than PageRankVertex.
>>>> Even that, though, is not totally optimized.  Hopefully we'll be adding a
>>>> "how to optimize performance" guide in the near future.  Should we delay
>>>> the release, or simply ship a 1.1, say in the next month, with this
>>>> fix and support for YARN's 2.0.4?  I'd like to get on a more normal
>>>> release cycle rather than once a year =).
>>>>
>>>> Avery
>>>>
>>>> On 4/13/13 3:02 AM, Sebastian Schelter wrote:
>>>>> Hi there,
>>>>>
>>>>> I've got some good and bad news. I tested PageRankVertex (not the
>>>>> Benchmark but the example implementation o.a.g.examples.PageRankVertex)
>>>>> from trunk, compiled for Hadoop 1.0, on a cluster of 26 machines with
>>>>> 208 cores.
>>>>>
>>>>> I used the Webbase2001 dataset [1], which has 115M vertices and more
>>>>> than 1B edges, and got some awesome running times: the average superstep
>>>>> takes 15 seconds (!!!). Awesome work, I have to say!
>>>>>
>>>>> Unfortunately, there seems to be an issue with the convergence
>>>>> detection, as it didn't show the correct convergence behavior. I'd like
>>>>> to have a look into that this week, so we can ship a performant PageRank
>>>>> implementation which automatically runs an appropriate number of
>>>>> supersteps. Hope this doesn't delay the release too much.
>>>>>
>>>>> Best,
>>>>> Sebastian
>>>>>
>>>>>
>>>>> [1] http://law.di.unimi.it/webdata/webbase-2001/
>>>>>
>>>>>
>>>>> On 13.04.2013 07:39, Avery Ching wrote:
>>>>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>>>>> that addresses the following issues.
>>>>>>
>>>>>> * Got rid of .git repo in tarball
>>>>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>>>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>>>>>   of warnings
>>>>>> * Pushed GIRAPH-627 to support the yarn profile better
>>>>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>>>>
>>>>>> Release notes:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>>>>
>>>>>> Release artifacts:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>>>>
>>>>>> Corresponding git tag:
>>>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Signing keys:
>>>>>> http://people.apache.org/keys/group/giraph.asc
>>>>>>
>>>>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Avery
>>>>>>
>>>>>> Original message below regarding rc0:
>>>>>>
>>>>>> -------------------------------
>>>>>>
>>>>>> Fellow Giraphers,
>>>>>>
>>>>>> We have a our first release candidate since graduating from
>>>>>> incubation.
>>>>>>     This is a source release, primarily due to the different
>>>>>> versions of
>>>>>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>>>>>> we've made A TON of progress on overall performance, optimizing memory
>>>>>> use, split vertex/edge inputs, easy interoperability with Apache Hive,
>>>>>> and a bunch of other areas.  In many ways, this is an almost totally
>>>>>> different codebase.  Thanks everyone for your hard work!
>>>>>>
>>>>>> Apache Giraph has been running in production at Facebook (against
>>>>>> Facebook's Corona implementation of Hadoop -
>>>>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>>>>>> since around last December.  It has proven to be very scalable,
>>>>>> performant, and enables a bunch of new applications.  Based on the
>>>>>> drastic improvements and the use of Giraph in production, it seems
>>>>>> appropriate to bump up our version to 1.0.
>>>>>>
>>>>>> While anyone can vote, the ASF requires majority approval from the PMC
>>>>>> -- i.e., at least three PMC members must vote affirmatively for
>>>>>> release,
>>>>>> and there must be more positive than negative votes. Releases may
>>>>>> not be
>>>>>> vetoed. Before voting +1 PMC members are required to download the
>>>>>> signed
>>>>>> source code package, compile it as provided, and test the resulting
>>>>>> executable on their own platform, along with also verifying that the
>>>>>> package meets the requirements of the ASF policy on releases.
>>>>>>
>>>>>> Please test this against many other Hadoop versions and let us know
>>>>>> how
>>>>>> this goes!
>>>>>>
>>>>>> Release notes:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>>>>
>>>>>> Release artifacts:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>>>>
>>>>>> Corresponding git tag:
>>>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Signing keys:
>>>>>> http://people.apache.org/keys/group/giraph.asc
>>>>>>
>>>>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>>>>
>>>>>> Thanks everyone for your patience with this release!
>>>>>>
>>>>>> Avery


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Avery Ching <ac...@apache.org>.
Thanks for your input Sebastian.  Given the choice between removing 
PageRankVertex and adding the fix, I've added your fix and will cut RC2 a 
bit later today.  I really hope this is the last RC.

Avery

On 4/14/13 9:34 AM, Sebastian Schelter wrote:
> Hi Avery,
>
> I see your concerns. The benchmarking question is difficult, we had very
> bad experiences with Mahout in that regards. E.g., we once had a
> M/R-based PageRank implementation in Mahout that uses our integer-based
> vectors and removed it as we got public complaints that you can't fit
> the whole web into the range of an integer. Personally, I'd also refrain
> from using floats instead of doubles for benchmarks, as this simply
> means you give up on accuracy.
>
> Regarding benchmarks, I guess the best thing we could do is publish our
> own numbers. The current runtimes I've seen are already very good,
> Giraph beat a very optimized Stratosphere implementation that we did for
> a recent paper by approx. 25%.
>
> To conclude, I do in no way want to hold up the current release. I'm
> perfectly fine with not including the patch and optimizing the
> implementation for a 1.0.1 release, but then we should remove the
> current examples.PageRankVertex from the 1.0 release, as the convergence
> detection is broken and we should not knowingly ship bugged code.
>
> Best,
> Sebastian
>
>
> On 14.04.2013 18:18, Avery Ching wrote:
>> Hi Sebastian,
>>
>> Thanks for the patch.  I'll try to take a look at it.
>>
>> The only reason I bring the optimizations up is that a lot of folks tend
>> to compare PageRank performance.  The optimizations I'm referring to are
>> Giraph ones, not algorithmic ones.  We use ints, floats for ids,
>> messages, respectively instead longs, doubles (1/2 network traffic) and
>> IntNullArrayEdges vertex edges (efficient array backed edges) instead of
>> ByteArrayEdges.  You can see
>> https://issues.apache.org/jira/browse/giraph-543 for more details.
>>
>> Anyway, given that we are going to ship a 1.0.1 release in a few weeks
>> for a variety of reasons, should this really hold up the current
>> release?  I would prefer to not cut anymore RCs unless things are
>> totally broken (i.e. profiles not compiling, major Giraph bugs, etc.).
>> There are still a lot of outstanding issues in JIRA, we can't fix them
>> all for the 1.0 release.
>>
>> Let me know what you think.
>>
>> Avery
>>
>> On 4/13/13 10:46 AM, Sebastian Schelter wrote:
>>> Hi Avery,
>>>
>>> I found the bug and can I provide a patch today or tomorrow, so
>>> hopefully we can include that in the release (to not knowingly ship
>>> bugged code). Furthermore I improved the code to protect against
>>> rounding errors.
>>>
>>> I don't really get what you mean with the missing optimization in
>>> comparison to the benchmark PageRank implementation.
>>>
>>> The implementation in o.a.g.examples.PageRankVertex aims to be a robust
>>> real-world implementation. As optimization, it dismisses edge weights
>>> and reuses objects where possible. Furthermore it is able to handle
>>> dangling vertices that are present in almost every real-world network
>>> and it automatically detects the number of supersteps to run. With the
>>> patch, it should also provide improved numerical stability.
>>>
>>> If the runtimes doesn't look good enough when compared to the benchmark
>>> implementation, this might also be caused by the dataset which has a
>>> skewed degree distribution (like most real-world networks). The
>>> benchmark uses a uniform degree distribution AFAIK.
>>>
>>> Best,
>>> Sebastian
>>>
>>> On 13.04.2013 15:46, Avery Ching wrote:
>>>> That's great Sebastian.  I would also recommend taking a look at the
>>>> PageRankBenchmark for a performance comparison.  It has been a lot of
>>>> speed improvements that should be a bunch faster than PageRankVertex.
>>>> Even that though, is not totally optimized.  Hopefully we'll be adding a
>>>> "how to optimize performance" guide in the near future.  Should we delay
>>>> the release or simply just ship a 1.1, say in the next month with this
>>>> fix and supporting YARN's 2.0.4?  I'd like to get on a more normal
>>>> release cycle rather than once a year =).
>>>>
>>>> Avery
>>>>
>>>> On 4/13/13 3:02 AM, Sebastian Schelter wrote:
>>>>> Hi there,
>>>>>
>>>>> I got some good and bad news, I tested PageRankVertex (not the
>>>>> Benchmark
>>>>> but the example implementation o.a.g.examples.PageRankVertex) from
>>>>> trunk
>>>>> compiled for Hadoop 1.0 on a cluster of 26 machines with 208 cores.
>>>>>
>>>>> I used the Webbase2001 dataset [1] which has 115M vertices and more
>>>>> than
>>>>> 1B edges and got some awesome running times, average superstep takes 15
>>>>> seconds (!!!). Awesome work, I have to say!
>>>>>
>>>>> Unfortunately, there seems to be an issue with the convergence
>>>>> detection, as it didn't get the correct convergence behavior. I'd like
>>>>> to have a look into that this week, so we can ship a performant
>>>>> PageRank
>>>>> implementation which automatically runs an appropriate number of
>>>>> supersteps. Hope this doesn't delay the release too much.
>>>>>
>>>>> Best,
>>>>> Sebastian
>>>>>
>>>>>
>>>>> [1] http://law.di.unimi.it/webdata/webbase-2001/
>>>>>
>>>>>
>>>>> On 13.04.2013 07:39, Avery Ching wrote:
>>>>>> Thanks to the quick feedback from Roman and Lewis, we have cut a
>>>>>> new RC1
>>>>>> that addresses the following issues.
>>>>>>
>>>>>> * Got rid of .git repo in tarball
>>>>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>>>>> * Used gnutar in OSX rather than tar to generate the tarball and
>>>>>> get rid
>>>>>> of warnings
>>>>>> * Pushed GIRAPH-627 to support the yarn profile better
>>>>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>>>>
>>>>>> Release notes:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>>>>
>>>>>> Release artifacts:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>>>>
>>>>>> Corresponding git tag:
>>>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Signing keys:
>>>>>> http://people.apache.org/keys/group/giraph.asc
>>>>>>
>>>>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Avery
>>>>>>
>>>>>> Original message below regarding rc0:
>>>>>>
>>>>>> -------------------------------
>>>>>>
>>>>>> Fellow Giraphers,
>>>>>>
>>>>>> We have a our first release candidate since graduating from
>>>>>> incubation.
>>>>>>     This is a source release, primarily due to the different
>>>>>> versions of
>>>>>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>>>>>> we've made A TON of progress on overall performance, optimizing memory
>>>>>> use, split vertex/edge inputs, easy interoperability with Apache Hive,
>>>>>> and a bunch of other areas.  In many ways, this is an almost totally
>>>>>> different codebase.  Thanks everyone for your hard work!
>>>>>>
>>>>>> Apache Giraph has been running in production at Facebook (against
>>>>>> Facebook's Corona implementation of Hadoop -
>>>>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>>>>>> since around last December.  It has proven to be very scalable,
>>>>>> performant, and enables a bunch of new applications.  Based on the
>>>>>> drastic improvements and the use of Giraph in production, it seems
>>>>>> appropriate to bump up our version to 1.0.
>>>>>>
>>>>>> While anyone can vote, the ASF requires majority approval from the PMC
>>>>>> -- i.e., at least three PMC members must vote affirmatively for
>>>>>> release,
>>>>>> and there must be more positive than negative votes. Releases may
>>>>>> not be
>>>>>> vetoed. Before voting +1 PMC members are required to download the
>>>>>> signed
>>>>>> source code package, compile it as provided, and test the resulting
>>>>>> executable on their own platform, along with also verifying that the
>>>>>> package meets the requirements of the ASF policy on releases.
>>>>>>
>>>>>> Please test this against many other Hadoop versions and let us know
>>>>>> how
>>>>>> this goes!
>>>>>>
>>>>>> Release notes:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>>>>
>>>>>> Release artifacts:
>>>>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>>>>
>>>>>> Corresponding git tag:
>>>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Signing keys:
>>>>>> http://people.apache.org/keys/group/giraph.asc
>>>>>>
>>>>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>>>>
>>>>>> Thanks everyone for your patience with this release!
>>>>>>
>>>>>> Avery


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Sebastian Schelter <ss...@apache.org>.
Hi Avery,

I see your concerns. The benchmarking question is difficult; we had very
bad experiences with Mahout in that regard. E.g., we once had an
M/R-based PageRank implementation in Mahout that used our integer-based
vectors, and we removed it after getting public complaints that you can't
fit the whole web into the range of an integer. Personally, I'd also refrain
from using floats instead of doubles for benchmarks, as this simply
means you give up on accuracy.
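
A minimal sketch of the accuracy point above, not taken from Giraph or
Mahout: the class and method names are hypothetical, and the values are
chosen only to make the effect visible. Once a running float total reaches
2^24, adding 1 to it no longer changes the value, while the same addition
in double is still exact.

```java
public class FloatAccuracySketch {
    /** Adds an increment to a running total using 32-bit float arithmetic. */
    public static float addInFloat(float total, float increment) {
        float sum = total + increment; // result is rounded to float precision
        return sum;
    }

    /** Adds an increment to a running total using 64-bit double arithmetic. */
    public static double addInDouble(double total, double increment) {
        return total + increment;
    }

    public static void main(String[] args) {
        float bigF = 16_777_216f;  // 2^24: float has a 24-bit significand
        double bigD = 16_777_216d; // double has a 53-bit significand

        // The float total swallows the increment; the double total does not.
        System.out.println("float unchanged:  " + (addInFloat(bigF, 1f) == bigF));
        System.out.println("double unchanged: " + (addInDouble(bigD, 1d) == bigD));
    }
}
```

Whether this matters for a given PageRank run depends on the graph size and
on how the ranks are normalized, so treat it as an illustration of the
trade-off rather than a measurement.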

Regarding benchmarks, I guess the best thing we could do is publish our
own numbers. The current runtimes I've seen are already very good:
Giraph beat a very optimized Stratosphere implementation that we did for
a recent paper by approx. 25%.

To conclude, I in no way want to hold up the current release. I'm
perfectly fine with not including the patch and optimizing the
implementation for a 1.0.1 release, but then we should remove the
current examples.PageRankVertex from the 1.0 release, as the convergence
detection is broken and we should not knowingly ship buggy code.

Best,
Sebastian


On 14.04.2013 18:18, Avery Ching wrote:
> Hi Sebastian,
> 
> Thanks for the patch.  I'll try to take a look at it.
> 
> The only reason I bring the optimizations up is that a lot of folks tend
> to compare PageRank performance.  The optimizations I'm referring to are
> Giraph ones, not algorithmic ones.  We use ints, floats for ids,
> messages, respectively instead longs, doubles (1/2 network traffic) and
> IntNullArrayEdges vertex edges (efficient array backed edges) instead of
> ByteArrayEdges.  You can see
> https://issues.apache.org/jira/browse/giraph-543 for more details.
> 
> Anyway, given that we are going to ship a 1.0.1 release in a few weeks
> for a variety of reasons, should this really hold up the current
> release?  I would prefer to not cut anymore RCs unless things are
> totally broken (i.e. profiles not compiling, major Giraph bugs, etc.). 
> There are still a lot of outstanding issues in JIRA, we can't fix them
> all for the 1.0 release.
> 
> Let me know what you think.
> 
> Avery
> 
> On 4/13/13 10:46 AM, Sebastian Schelter wrote:
>> Hi Avery,
>>
>> I found the bug and can I provide a patch today or tomorrow, so
>> hopefully we can include that in the release (to not knowingly ship
>> bugged code). Furthermore I improved the code to protect against
>> rounding errors.
>>
>> I don't really get what you mean with the missing optimization in
>> comparison to the benchmark PageRank implementation.
>>
>> The implementation in o.a.g.examples.PageRankVertex aims to be a robust
>> real-world implementation. As optimization, it dismisses edge weights
>> and reuses objects where possible. Furthermore it is able to handle
>> dangling vertices that are present in almost every real-world network
>> and it automatically detects the number of supersteps to run. With the
>> patch, it should also provide improved numerical stability.
>>
>> If the runtimes doesn't look good enough when compared to the benchmark
>> implementation, this might also be caused by the dataset which has a
>> skewed degree distribution (like most real-world networks). The
>> benchmark uses a uniform degree distribution AFAIK.
>>
>> Best,
>> Sebastian
>>
>> On 13.04.2013 15:46, Avery Ching wrote:
>>> That's great Sebastian.  I would also recommend taking a look at the
>>> PageRankBenchmark for a performance comparison.  It has been a lot of
>>> speed improvements that should be a bunch faster than PageRankVertex.
>>> Even that though, is not totally optimized.  Hopefully we'll be adding a
>>> "how to optimize performance" guide in the near future.  Should we delay
>>> the release or simply just ship a 1.1, say in the next month with this
>>> fix and supporting YARN's 2.0.4?  I'd like to get on a more normal
>>> release cycle rather than once a year =).
>>>
>>> Avery
>>>
>>> On 4/13/13 3:02 AM, Sebastian Schelter wrote:
>>>> Hi there,
>>>>
>>>> I got some good and bad news, I tested PageRankVertex (not the
>>>> Benchmark
>>>> but the example implementation o.a.g.examples.PageRankVertex) from
>>>> trunk
>>>> compiled for Hadoop 1.0 on a cluster of 26 machines with 208 cores.
>>>>
>>>> I used the Webbase2001 dataset [1] which has 115M vertices and more
>>>> than
>>>> 1B edges and got some awesome running times, average superstep takes 15
>>>> seconds (!!!). Awesome work, I have to say!
>>>>
>>>> Unfortunately, there seems to be an issue with the convergence
>>>> detection, as it didn't get the correct convergence behavior. I'd like
>>>> to have a look into that this week, so we can ship a performant
>>>> PageRank
>>>> implementation which automatically runs an appropriate number of
>>>> supersteps. Hope this doesn't delay the release too much.
>>>>
>>>> Best,
>>>> Sebastian
>>>>
>>>>
>>>> [1] http://law.di.unimi.it/webdata/webbase-2001/
>>>>
>>>>
>>>> On 13.04.2013 07:39, Avery Ching wrote:
>>>>> Thanks to the quick feedback from Roman and Lewis, we have cut a
>>>>> new RC1
>>>>> that addresses the following issues.
>>>>>
>>>>> * Got rid of .git repo in tarball
>>>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>>>> * Used gnutar in OSX rather than tar to generate the tarball and
>>>>> get rid
>>>>> of warnings
>>>>> * Pushed GIRAPH-627 to support the yarn profile better
>>>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>>>
>>>>> Release notes:
>>>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>>>
>>>>> Release artifacts:
>>>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>>>
>>>>> Corresponding git tag:
>>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Signing keys:
>>>>> http://people.apache.org/keys/group/giraph.asc
>>>>>
>>>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Avery
>>>>>
>>>>> Original message below regarding rc0:
>>>>>
>>>>> -------------------------------
>>>>>
>>>>> Fellow Giraphers,
>>>>>
>>>>> We have a our first release candidate since graduating from
>>>>> incubation.
>>>>>    This is a source release, primarily due to the different
>>>>> versions of
>>>>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>>>>> we've made A TON of progress on overall performance, optimizing memory
>>>>> use, split vertex/edge inputs, easy interoperability with Apache Hive,
>>>>> and a bunch of other areas.  In many ways, this is an almost totally
>>>>> different codebase.  Thanks everyone for your hard work!
>>>>>
>>>>> Apache Giraph has been running in production at Facebook (against
>>>>> Facebook's Corona implementation of Hadoop -
>>>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>>>>> since around last December.  It has proven to be very scalable,
>>>>> performant, and enables a bunch of new applications.  Based on the
>>>>> drastic improvements and the use of Giraph in production, it seems
>>>>> appropriate to bump up our version to 1.0.
>>>>>
>>>>> While anyone can vote, the ASF requires majority approval from the PMC
>>>>> -- i.e., at least three PMC members must vote affirmatively for
>>>>> release,
>>>>> and there must be more positive than negative votes. Releases may
>>>>> not be
>>>>> vetoed. Before voting +1 PMC members are required to download the
>>>>> signed
>>>>> source code package, compile it as provided, and test the resulting
>>>>> executable on their own platform, along with also verifying that the
>>>>> package meets the requirements of the ASF policy on releases.
>>>>>
>>>>> Please test this against many other Hadoop versions and let us know
>>>>> how
>>>>> this goes!
>>>>>
>>>>> Release notes:
>>>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>>>
>>>>> Release artifacts:
>>>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>>>
>>>>> Corresponding git tag:
>>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Signing keys:
>>>>> http://people.apache.org/keys/group/giraph.asc
>>>>>
>>>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>>>
>>>>> Thanks everyone for your patience with this release!
>>>>>
>>>>> Avery
> 


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Avery Ching <ac...@apache.org>.
Hi Sebastian,

Thanks for the patch.  I'll try to take a look at it.

The only reason I bring the optimizations up is that a lot of folks tend 
to compare PageRank performance.  The optimizations I'm referring to are 
Giraph ones, not algorithmic ones.  We use ints and floats for ids and 
messages, respectively, instead of longs and doubles (1/2 the network 
traffic), and IntNullArrayEdges for vertex edges (efficient array-backed 
edges) instead of ByteArrayEdges.  You can see 
https://issues.apache.org/jira/browse/giraph-543 for more details.
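
A rough sketch of where the "1/2 the network traffic" figure comes from,
assuming each message carries just a target vertex id and a value. This is
not Giraph's actual wire format; the class and method names are
hypothetical, and only the primitive sizes written by java.io's
DataOutputStream are being counted.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class MessageSizeSketch {
    /** Serialized size of one (int id, float value) pair. */
    public static int compactSize() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(42);      // 4-byte vertex id
            out.writeFloat(0.15f); // 4-byte message value
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Serialized size of one (long id, double value) pair. */
    public static int wideSize() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(42L);    // 8-byte vertex id
            out.writeDouble(0.15); // 8-byte message value
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println("int/float:   " + compactSize() + " bytes");
        System.out.println("long/double: " + wideSize() + " bytes");
    }
}
```

Under these assumptions the int/float pair takes 8 bytes and the
long/double pair takes 16, which is the factor of two mentioned above. Real
traffic also includes framing and any combiner effects, so the observed
savings may differ.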

Anyway, given that we are going to ship a 1.0.1 release in a few weeks 
for a variety of reasons, should this really hold up the current 
release?  I would prefer not to cut any more RCs unless things are 
totally broken (e.g., profiles not compiling, major Giraph bugs, etc.).  
There are still a lot of outstanding issues in JIRA; we can't fix them 
all for the 1.0 release.

Let me know what you think.

Avery

On 4/13/13 10:46 AM, Sebastian Schelter wrote:
> Hi Avery,
>
> I found the bug and can I provide a patch today or tomorrow, so
> hopefully we can include that in the release (to not knowingly ship
> bugged code). Furthermore I improved the code to protect against
> rounding errors.
>
> I don't really get what you mean with the missing optimization in
> comparison to the benchmark PageRank implementation.
>
> The implementation in o.a.g.examples.PageRankVertex aims to be a robust
> real-world implementation. As optimization, it dismisses edge weights
> and reuses objects where possible. Furthermore it is able to handle
> dangling vertices that are present in almost every real-world network
> and it automatically detects the number of supersteps to run. With the
> patch, it should also provide improved numerical stability.
>
> If the runtimes doesn't look good enough when compared to the benchmark
> implementation, this might also be caused by the dataset which has a
> skewed degree distribution (like most real-world networks). The
> benchmark uses a uniform degree distribution AFAIK.
>
> Best,
> Sebastian
>
> On 13.04.2013 15:46, Avery Ching wrote:
>> That's great Sebastian.  I would also recommend taking a look at the
>> PageRankBenchmark for a performance comparison.  It has been a lot of
>> speed improvements that should be a bunch faster than PageRankVertex.
>> Even that though, is not totally optimized.  Hopefully we'll be adding a
>> "how to optimize performance" guide in the near future.  Should we delay
>> the release or simply just ship a 1.1, say in the next month with this
>> fix and supporting YARN's 2.0.4?  I'd like to get on a more normal
>> release cycle rather than once a year =).
>>
>> Avery
>>
>> On 4/13/13 3:02 AM, Sebastian Schelter wrote:
>>> Hi there,
>>>
>>> I got some good and bad news, I tested PageRankVertex (not the Benchmark
>>> but the example implementation o.a.g.examples.PageRankVertex) from trunk
>>> compiled for Hadoop 1.0 on a cluster of 26 machines with 208 cores.
>>>
>>> I used the Webbase2001 dataset [1] which has 115M vertices and more than
>>> 1B edges and got some awesome running times, average superstep takes 15
>>> seconds (!!!). Awesome work, I have to say!
>>>
>>> Unfortunately, there seems to be an issue with the convergence
>>> detection, as it didn't get the correct convergence behavior. I'd like
>>> to have a look into that this week, so we can ship a performant PageRank
>>> implementation which automatically runs an appropriate number of
>>> supersteps. Hope this doesn't delay the release too much.
>>>
>>> Best,
>>> Sebastian
>>>
>>>
>>> [1] http://law.di.unimi.it/webdata/webbase-2001/
>>>
>>>
>>> On 13.04.2013 07:39, Avery Ching wrote:
>>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>>> that addresses the following issues.
>>>>
>>>> * Got rid of .git repo in tarball
>>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>>> of warnings
>>>> * Pushed GIRAPH-627 to support the yarn profile better
>>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>>
>>>> Release notes:
>>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>>
>>>> Release artifacts:
>>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>>
>>>> Corresponding git tag:
>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>>
>>>>
>>>>
>>>> Signing keys:
>>>> http://people.apache.org/keys/group/giraph.asc
>>>>
>>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>>
>>>> Thanks,
>>>>
>>>> Avery
>>>>
>>>> Original message below regarding rc0:
>>>>
>>>> -------------------------------
>>>>
>>>> Fellow Giraphers,
>>>>
>>>> We have a our first release candidate since graduating from incubation.
>>>>    This is a source release, primarily due to the different versions of
>>>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>>>> we've made A TON of progress on overall performance, optimizing memory
>>>> use, split vertex/edge inputs, easy interoperability with Apache Hive,
>>>> and a bunch of other areas.  In many ways, this is an almost totally
>>>> different codebase.  Thanks everyone for your hard work!
>>>>
>>>> Apache Giraph has been running in production at Facebook (against
>>>> Facebook's Corona implementation of Hadoop -
>>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>>>> since around last December.  It has proven to be very scalable,
>>>> performant, and enables a bunch of new applications.  Based on the
>>>> drastic improvements and the use of Giraph in production, it seems
>>>> appropriate to bump up our version to 1.0.
>>>>
>>>> While anyone can vote, the ASF requires majority approval from the PMC
>>>> -- i.e., at least three PMC members must vote affirmatively for release,
>>>> and there must be more positive than negative votes. Releases may not be
>>>> vetoed. Before voting +1 PMC members are required to download the signed
>>>> source code package, compile it as provided, and test the resulting
>>>> executable on their own platform, along with also verifying that the
>>>> package meets the requirements of the ASF policy on releases.
>>>>
>>>> Please test this against many other Hadoop versions and let us know how
>>>> this goes!
>>>>
>>>> Release notes:
>>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>>
>>>> Release artifacts:
>>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>>
>>>> Corresponding git tag:
>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>>
>>>>
>>>>
>>>> Signing keys:
>>>> http://people.apache.org/keys/group/giraph.asc
>>>>
>>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>>
>>>> Thanks everyone for your patience with this release!
>>>>
>>>> Avery


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Avery Ching <ac...@apache.org>.
Hi Sebastian,

Thanks for the patch.  I'll try to take a look at it.

The only reason I bring the optimizations up is that a lot of folks tend 
to compare PageRank performance.  The optimizations I'm referring to are 
Giraph ones, not algorithmic ones.  We use ints and floats for ids and 
messages, respectively, instead of longs and doubles (1/2 the network 
traffic), and IntNullArrayEdges vertex edges (efficient array-backed 
edges) instead of ByteArrayEdges.  You can see 
https://issues.apache.org/jira/browse/giraph-543 for more details.
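
As a rough illustration of where the 1/2 figure comes from, here is a 
back-of-the-envelope sketch.  It is not Giraph code: the class and 
method names are made up for this example, and it only counts the raw 
id and value bytes of a message, ignoring serialization framing, 
batching and anything else that goes over the wire.

```java
// Hypothetical helper, not part of Giraph: compares the raw payload of a
// (target id, message value) pair for wide and narrow primitive types.
final class MessageSizeSketch {

    // Raw bytes of one message: target vertex id plus message value.
    static int bytesPerMessage(int idBytes, int valueBytes) {
        return idBytes + valueBytes;
    }

    // Raw bytes sent in one superstep if every edge carries one message.
    static long totalBytes(long edges, int bytesPerMessage) {
        return edges * bytesPerMessage;
    }

    public static void main(String[] args) {
        int wide = bytesPerMessage(Long.BYTES, Double.BYTES);     // 8 + 8 = 16
        int narrow = bytesPerMessage(Integer.BYTES, Float.BYTES); // 4 + 4 = 8
        long edges = 1_000_000_000L; // assumed: one message per edge, 1B edges

        System.out.println("long/double: " + totalBytes(edges, wide) + " bytes");
        System.out.println("int/float:   " + totalBytes(edges, narrow) + " bytes");
    }
}
```

The narrow types only work when vertex ids fit in an int and float 
precision is acceptable for the ranks, which is a trade-off each 
application has to check for itself.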

Anyway, given that we are going to ship a 1.0.1 release in a few weeks 
for a variety of reasons, should this really hold up the current 
release?  I would prefer not to cut any more RCs unless things are 
totally broken (e.g. profiles not compiling, major Giraph bugs, etc.).  
There are still a lot of outstanding issues in JIRA; we can't fix them 
all for the 1.0 release.

Let me know what you think.

Avery

On 4/13/13 10:46 AM, Sebastian Schelter wrote:
> Hi Avery,
>
> I found the bug and can I provide a patch today or tomorrow, so
> hopefully we can include that in the release (to not knowingly ship
> bugged code). Furthermore I improved the code to protect against
> rounding errors.
>
> I don't really get what you mean with the missing optimization in
> comparison to the benchmark PageRank implementation.
>
> The implementation in o.a.g.examples.PageRankVertex aims to be a robust
> real-world implementation. As optimization, it dismisses edge weights
> and reuses objects where possible. Furthermore it is able to handle
> dangling vertices that are present in almost every real-world network
> and it automatically detects the number of supersteps to run. With the
> patch, it should also provide improved numerical stability.
>
> If the runtimes doesn't look good enough when compared to the benchmark
> implementation, this might also be caused by the dataset which has a
> skewed degree distribution (like most real-world networks). The
> benchmark uses a uniform degree distribution AFAIK.
>
> Best,
> Sebastian
>
> On 13.04.2013 15:46, Avery Ching wrote:
>> That's great Sebastian.  I would also recommend taking a look at the
>> PageRankBenchmark for a performance comparison.  It has been a lot of
>> speed improvements that should be a bunch faster than PageRankVertex.
>> Even that though, is not totally optimized.  Hopefully we'll be adding a
>> "how to optimize performance" guide in the near future.  Should we delay
>> the release or simply just ship a 1.1, say in the next month with this
>> fix and supporting YARN's 2.0.4?  I'd like to get on a more normal
>> release cycle rather than once a year =).
>>
>> Avery
>>
>> On 4/13/13 3:02 AM, Sebastian Schelter wrote:
>>> Hi there,
>>>
>>> I got some good and bad news, I tested PageRankVertex (not the Benchmark
>>> but the example implementation o.a.g.examples.PageRankVertex) from trunk
>>> compiled for Hadoop 1.0 on a cluster of 26 machines with 208 cores.
>>>
>>> I used the Webbase2001 dataset [1] which has 115M vertices and more than
>>> 1B edges and got some awesome running times, average superstep takes 15
>>> seconds (!!!). Awesome work, I have to say!
>>>
>>> Unfortunately, there seems to be an issue with the convergence
>>> detection, as it didn't get the correct convergence behavior. I'd like
>>> to have a look into that this week, so we can ship a performant PageRank
>>> implementation which automatically runs an appropriate number of
>>> supersteps. Hope this doesn't delay the release too much.
>>>
>>> Best,
>>> Sebastian
>>>
>>>
>>> [1] http://law.di.unimi.it/webdata/webbase-2001/
>>>
>>>
>>> On 13.04.2013 07:39, Avery Ching wrote:
>>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>>> that addresses the following issues.
>>>>
>>>> * Got rid of .git repo in tarball
>>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>>> of warnings
>>>> * Pushed GIRAPH-627 to support the yarn profile better
>>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>>
>>>> Release notes:
>>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>>
>>>> Release artifacts:
>>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>>
>>>> Corresponding git tag:
>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>>
>>>>
>>>>
>>>> Signing keys:
>>>> http://people.apache.org/keys/group/giraph.asc
>>>>
>>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>>
>>>> Thanks,
>>>>
>>>> Avery
>>>>
>>>> Original message below regarding rc0:
>>>>
>>>> -------------------------------
>>>>
>>>> Fellow Giraphers,
>>>>
>>>> We have a our first release candidate since graduating from incubation.
>>>>    This is a source release, primarily due to the different versions of
>>>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>>>> we've made A TON of progress on overall performance, optimizing memory
>>>> use, split vertex/edge inputs, easy interoperability with Apache Hive,
>>>> and a bunch of other areas.  In many ways, this is an almost totally
>>>> different codebase.  Thanks everyone for your hard work!
>>>>
>>>> Apache Giraph has been running in production at Facebook (against
>>>> Facebook's Corona implementation of Hadoop -
>>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>>>> since around last December.  It has proven to be very scalable,
>>>> performant, and enables a bunch of new applications.  Based on the
>>>> drastic improvements and the use of Giraph in production, it seems
>>>> appropriate to bump up our version to 1.0.
>>>>
>>>> While anyone can vote, the ASF requires majority approval from the PMC
>>>> -- i.e., at least three PMC members must vote affirmatively for release,
>>>> and there must be more positive than negative votes. Releases may not be
>>>> vetoed. Before voting +1 PMC members are required to download the signed
>>>> source code package, compile it as provided, and test the resulting
>>>> executable on their own platform, along with also verifying that the
>>>> package meets the requirements of the ASF policy on releases.
>>>>
>>>> Please test this against many other Hadoop versions and let us know how
>>>> this goes!
>>>>
>>>> Release notes:
>>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>>
>>>> Release artifacts:
>>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>>
>>>> Corresponding git tag:
>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>>
>>>>
>>>>
>>>> Signing keys:
>>>> http://people.apache.org/keys/group/giraph.asc
>>>>
>>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>>
>>>> Thanks everyone for your patience with this release!
>>>>
>>>> Avery


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Sebastian Schelter <ss...@apache.org>.
Hi Avery,

I found the bug and can provide a patch today or tomorrow, so
hopefully we can include that in the release (to not knowingly ship
buggy code). Furthermore, I improved the code to protect against
rounding errors.

I don't really get what you mean by the missing optimization in
comparison to the benchmark PageRank implementation.

The implementation in o.a.g.examples.PageRankVertex aims to be a robust
real-world implementation. As an optimization, it ignores edge weights
and reuses objects where possible. Furthermore, it is able to handle
dangling vertices, which are present in almost every real-world network,
and it automatically detects the number of supersteps to run. With the
patch, it should also provide improved numerical stability.
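
To make the two behaviours above concrete (dangling-vertex handling and
automatic stopping), here is a minimal single-machine sketch. It is not
the code in o.a.g.examples.PageRankVertex and does not use the Giraph
API; the class name, the damping parameter and the L1-based stopping
rule are assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical, single-machine illustration; not the Giraph implementation.
final class PageRankSketch {

    // outEdges[v] lists the targets of v; an empty array marks a dangling vertex.
    static double[] pageRank(int[][] outEdges, double damping,
                             double tolerance, int maxIterations) {
        int n = outEdges.length;
        double[] rank = new double[n];
        double[] next = new double[n]; // reused every iteration, no reallocation
        Arrays.fill(rank, 1.0 / n);

        for (int iter = 0; iter < maxIterations; iter++) {
            Arrays.fill(next, 0.0);
            double danglingMass = 0.0;

            for (int v = 0; v < n; v++) {
                if (outEdges[v].length == 0) {
                    // No out-edges: collect the rank so it is not lost.
                    danglingMass += rank[v];
                } else {
                    double share = rank[v] / outEdges[v].length;
                    for (int target : outEdges[v]) {
                        next[target] += share;
                    }
                }
            }

            // Spread the dangling mass uniformly and measure the L1 change.
            double delta = 0.0;
            for (int v = 0; v < n; v++) {
                next[v] = (1.0 - damping) / n
                        + damping * (next[v] + danglingMass / n);
                delta += Math.abs(next[v] - rank[v]);
            }

            double[] tmp = rank; // swap buffers instead of allocating
            rank = next;
            next = tmp;

            if (delta < tolerance) {
                break; // converged: stop without a fixed iteration count
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        int[][] graph = { {1}, {2}, {} }; // vertex 2 is dangling
        System.out.println(Arrays.toString(pageRank(graph, 0.85, 1e-10, 1000)));
    }
}
```

Redistributing the dangling mass keeps the ranks summing to 1, which is
what makes a change-based stopping rule meaningful; in a distributed
setting the two sums would have to be collected across workers rather
than in local variables.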

If the runtimes don't look good enough when compared to the benchmark
implementation, this might also be caused by the dataset, which has a
skewed degree distribution (like most real-world networks). The
benchmark uses a uniform degree distribution AFAIK.

Best,
Sebastian

On 13.04.2013 15:46, Avery Ching wrote:
> That's great Sebastian.  I would also recommend taking a look at the
> PageRankBenchmark for a performance comparison.  It has been a lot of
> speed improvements that should be a bunch faster than PageRankVertex. 
> Even that though, is not totally optimized.  Hopefully we'll be adding a
> "how to optimize performance" guide in the near future.  Should we delay
> the release or simply just ship a 1.1, say in the next month with this
> fix and supporting YARN's 2.0.4?  I'd like to get on a more normal
> release cycle rather than once a year =).
> 
> Avery
> 
> On 4/13/13 3:02 AM, Sebastian Schelter wrote:
>> Hi there,
>>
>> I got some good and bad news, I tested PageRankVertex (not the Benchmark
>> but the example implementation o.a.g.examples.PageRankVertex) from trunk
>> compiled for Hadoop 1.0 on a cluster of 26 machines with 208 cores.
>>
>> I used the Webbase2001 dataset [1] which has 115M vertices and more than
>> 1B edges and got some awesome running times, average superstep takes 15
>> seconds (!!!). Awesome work, I have to say!
>>
>> Unfortunately, there seems to be an issue with the convergence
>> detection, as it didn't get the correct convergence behavior. I'd like
>> to have a look into that this week, so we can ship a performant PageRank
>> implementation which automatically runs an appropriate number of
>> supersteps. Hope this doesn't delay the release too much.
>>
>> Best,
>> Sebastian
>>
>>
>> [1] http://law.di.unimi.it/webdata/webbase-2001/
>>
>>
>> On 13.04.2013 07:39, Avery Ching wrote:
>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>> that addresses the following issues.
>>>
>>> * Got rid of .git repo in tarball
>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>> of warnings
>>> * Pushed GIRAPH-627 to support the yarn profile better
>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>
>>> Corresponding git tag:
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>
>>>
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>
>>> Thanks,
>>>
>>> Avery
>>>
>>> Original message below regarding rc0:
>>>
>>> -------------------------------
>>>
>>> Fellow Giraphers,
>>>
>>> We have a our first release candidate since graduating from incubation.
>>>   This is a source release, primarily due to the different versions of
>>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>>> we've made A TON of progress on overall performance, optimizing memory
>>> use, split vertex/edge inputs, easy interoperability with Apache Hive,
>>> and a bunch of other areas.  In many ways, this is an almost totally
>>> different codebase.  Thanks everyone for your hard work!
>>>
>>> Apache Giraph has been running in production at Facebook (against
>>> Facebook's Corona implementation of Hadoop -
>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>>> since around last December.  It has proven to be very scalable,
>>> performant, and enables a bunch of new applications.  Based on the
>>> drastic improvements and the use of Giraph in production, it seems
>>> appropriate to bump up our version to 1.0.
>>>
>>> While anyone can vote, the ASF requires majority approval from the PMC
>>> -- i.e., at least three PMC members must vote affirmatively for release,
>>> and there must be more positive than negative votes. Releases may not be
>>> vetoed. Before voting +1 PMC members are required to download the signed
>>> source code package, compile it as provided, and test the resulting
>>> executable on their own platform, along with also verifying that the
>>> package meets the requirements of the ASF policy on releases.
>>>
>>> Please test this against many other Hadoop versions and let us know how
>>> this goes!
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>
>>> Corresponding git tag:
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>
>>>
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>
>>> Thanks everyone for your patience with this release!
>>>
>>> Avery
> 


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Roman Shaposhnik <rv...@apache.org>.
On Sat, Apr 13, 2013 at 6:46 AM, Avery Ching <ac...@apache.org> wrote:
> That's great Sebastian.  I would also recommend taking a look at the
> PageRankBenchmark for a performance comparison.  It has been a lot of speed
> improvements that should be a bunch faster than PageRankVertex.  Even that
> though, is not totally optimized.  Hopefully we'll be adding a "how to
> optimize performance" guide in the near future.  Should we delay the release
> or simply just ship a 1.1, say in the next month with this fix and
> supporting YARN's 2.0.4?  I'd like to get on a more normal release cycle
> rather than once a year =).

Excellent point. The release process that I have seen work very well in
the past is that you get on a regular date-driven release schedule (let's
say quarterly) but you still allow for patch releases with very limited
scope from time to time. Patch releases are NOT date driven -- they
go out when all the required patches are committed to trunk and
backported to the appropriate branch.

So, would the following work for us: we go ahead with the 1.0 release as
planned (unless showstoppers get identified in the next 72 hours), but we
all agree to a 1.0.1 with scope strictly limited to YARN stabilization on
top of Hadoop 2.0.4?
If we get all that we need for 1.0.1 in a week, we'll release in a week;
if it takes us 3 weeks, we'll release in 3 weeks, and so on.


Thanks,
Roman.

Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Avery Ching <ac...@apache.org>.
That's great, Sebastian.  I would also recommend taking a look at the 
PageRankBenchmark for a performance comparison.  It has had a lot of 
speed improvements and should be a bunch faster than PageRankVertex.  
Even that, though, is not totally optimized.  Hopefully we'll be adding a 
"how to optimize performance" guide in the near future.  Should we delay 
the release or simply ship a 1.1, say in the next month, with this 
fix and support for YARN on Hadoop 2.0.4?  I'd like to get on a more 
normal release cycle rather than once a year =).

Avery

On 4/13/13 3:02 AM, Sebastian Schelter wrote:
> Hi there,
>
> I got some good and bad news, I tested PageRankVertex (not the Benchmark
> but the example implementation o.a.g.examples.PageRankVertex) from trunk
> compiled for Hadoop 1.0 on a cluster of 26 machines with 208 cores.
>
> I used the Webbase2001 dataset [1] which has 115M vertices and more than
> 1B edges and got some awesome running times, average superstep takes 15
> seconds (!!!). Awesome work, I have to say!
>
> Unfortunately, there seems to be an issue with the convergence
> detection, as it didn't get the correct convergence behavior. I'd like
> to have a look into that this week, so we can ship a performant PageRank
> implementation which automatically runs an appropriate number of
> supersteps. Hope this doesn't delay the release too much.
>
> Best,
> Sebastian
>
>
> [1] http://law.di.unimi.it/webdata/webbase-2001/
>
>
> On 13.04.2013 07:39, Avery Ching wrote:
>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>> that addresses the following issues.
>>
>> * Got rid of .git repo in tarball
>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>> of warnings
>> * Pushed GIRAPH-627 to support the yarn profile better
>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 11pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> Original message below regarding rc0:
>>
>> -------------------------------
>>
>> Fellow Giraphers,
>>
>> We have a our first release candidate since graduating from incubation.
>>   This is a source release, primarily due to the different versions of
>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>> we've made A TON of progress on overall performance, optimizing memory
>> use, split vertex/edge inputs, easy interoperability with Apache Hive,
>> and a bunch of other areas.  In many ways, this is an almost totally
>> different codebase.  Thanks everyone for your hard work!
>>
>> Apache Giraph has been running in production at Facebook (against
>> Facebook's Corona implementation of Hadoop -
>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>> since around last December.  It has proven to be very scalable,
>> performant, and enables a bunch of new applications.  Based on the
>> drastic improvements and the use of Giraph in production, it seems
>> appropriate to bump up our version to 1.0.
>>
>> While anyone can vote, the ASF requires majority approval from the PMC
>> -- i.e., at least three PMC members must vote affirmatively for release,
>> and there must be more positive than negative votes. Releases may not be
>> vetoed. Before voting +1 PMC members are required to download the signed
>> source code package, compile it as provided, and test the resulting
>> executable on their own platform, along with also verifying that the
>> package meets the requirements of the ASF policy on releases.
>>
>> Please test this against many other Hadoop versions and let us know how
>> this goes!
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 4pm PST.
>>
>> Thanks everyone for your patience with this release!
>>
>> Avery


Re: [VOTE][CHANGED] Release Giraph 1.0 (rc1)

Posted by Sebastian Schelter <ss...@apache.org>.
Hi there,

I have some good and some bad news. I tested PageRankVertex (not the
benchmark, but the example implementation o.a.g.examples.PageRankVertex)
from trunk, compiled for Hadoop 1.0, on a cluster of 26 machines with 208 cores.

I used the Webbase2001 dataset [1], which has 115M vertices and more than
1B edges, and got some awesome running times: the average superstep takes 15
seconds (!!!). Awesome work, I have to say!

Unfortunately, there seems to be an issue with the convergence
detection, as the run did not show the correct convergence behavior. I'd like
to look into that this week, so we can ship a performant PageRank
implementation which automatically runs an appropriate number of
supersteps. I hope this doesn't delay the release too much.
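
To make the idea of convergence detection concrete, here is a minimal
single-machine sketch in plain Java. It is not the code in
o.a.g.examples.PageRankVertex and it uses no Giraph API; the class name,
method names, damping factor and tolerance are all assumptions chosen for
illustration. It stops iterating once the summed absolute rank change
between two supersteps drops below a tolerance.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the Giraph example implementation. */
class PageRankConvergenceSketch {

  /** Sum of absolute per-vertex rank changes between two supersteps. */
  static double l1Delta(double[] previous, double[] current) {
    double sum = 0.0;
    for (int i = 0; i < previous.length; i++) {
      sum += Math.abs(current[i] - previous[i]);
    }
    return sum;
  }

  /** One PageRank iteration over an adjacency list of out-edges. */
  static double[] step(int[][] outEdges, double[] ranks, double damping) {
    int n = ranks.length;
    double[] next = new double[n];
    Arrays.fill(next, (1.0 - damping) / n);
    for (int v = 0; v < n; v++) {
      if (outEdges[v].length == 0) {
        // Dangling vertex: spread its rank evenly over all vertices.
        for (int u = 0; u < n; u++) {
          next[u] += damping * ranks[v] / n;
        }
      } else {
        for (int target : outEdges[v]) {
          next[target] += damping * ranks[v] / outEdges[v].length;
        }
      }
    }
    return next;
  }

  /**
   * Iterates in place until the L1 change falls below the tolerance or
   * maxSupersteps is reached. Returns the number of supersteps executed.
   */
  static int run(int[][] outEdges, double[] ranks, double damping,
                 double tolerance, int maxSupersteps) {
    for (int superstep = 1; superstep <= maxSupersteps; superstep++) {
      double[] next = step(outEdges, ranks, damping);
      double delta = l1Delta(ranks, next);
      System.arraycopy(next, 0, ranks, 0, ranks.length);
      if (delta < tolerance) {
        return superstep;
      }
    }
    return maxSupersteps;
  }
}
```

In a distributed setting the per-vertex changes would have to be summed
across workers before the check; how that is done in Giraph is outside the
scope of this sketch.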

Best,
Sebastian


[1] http://law.di.unimi.it/webdata/webbase-2001/


On 13.04.2013 07:39, Avery Ching wrote:
> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
> that addresses the following issues.
> 
> * Got rid of .git repo in tarball
> * Fixed issue with not compiling without git repo (GIRAPH-628)
> * Used gnutar in OSX rather than tar to generate the tarball and get rid
> of warnings
> * Pushed GIRAPH-627 to support the yarn profile better
> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
> 
> Release notes:
> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
> 
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0-RC1/
> 
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
> 
> 
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
> 
> The vote runs for 72 hours, until Monday 11pm PST.
> 
> Thanks,
> 
> Avery
> 
> Original message below regarding rc0:
> 
> -------------------------------
> 
> Fellow Giraphers,
> 
> We have our first release candidate since graduating from incubation.
>  This is a source release, primarily due to the different versions of
> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
> we've made A TON of progress on overall performance, optimizing memory
> use, split vertex/edge inputs, easy interoperability with Apache Hive,
> and a bunch of other areas.  In many ways, this is an almost totally
> different codebase.  Thanks everyone for your hard work!
> 
> Apache Giraph has been running in production at Facebook (against
> Facebook's Corona implementation of Hadoop -
> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
> since around last December.  It has proven to be very scalable,
> performant, and enables a bunch of new applications.  Based on the
> drastic improvements and the use of Giraph in production, it seems
> appropriate to bump up our version to 1.0.
> 
> While anyone can vote, the ASF requires majority approval from the PMC
> -- i.e., at least three PMC members must vote affirmatively for release,
> and there must be more positive than negative votes. Releases may not be
> vetoed. Before voting +1 PMC members are required to download the signed
> source code package, compile it as provided, and test the resulting
> executable on their own platform, along with also verifying that the
> package meets the requirements of the ASF policy on releases.
> 
> Please test this against many other Hadoop versions and let us know how
> this goes!
> 
> Release notes:
> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
> 
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0-RC0/
> 
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
> 
> 
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
> 
> The vote runs for 72 hours, until Monday 4pm PST.
> 
> Thanks everyone for your patience with this release!
> 
> Avery


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Eli Reisman <ap...@gmail.com>.
My instinct is to immediately say yes, but I don't think I should commit to
that right now. I would be happy to be part of the discussion about ideas and
proposed methods to replace or remove it. But mentoring might spread me too
thin this summer, as you know ;)


On Sun, Apr 21, 2013 at 2:44 PM, Claudio Martella <
claudio.martella@gmail.com> wrote:

> Also, just to make sure we're on the same page on the munge topic and GSoC,
> I started the bureaucracy for the project, but I won't be able to mentor
> also that project. I plan to mentor the tinkerpop project, in case we get a
> student. In case we get students for both projects, somebody would step in
> for the munge one?
>
>
> On Sun, Apr 21, 2013 at 3:06 AM, Claudio Martella <
> claudio.martella@gmail.com> wrote:
>
> > I am afraid we're not going to get students this round. We really moved
> at
> > the last minute. On this topic, I sent a couple of emails to master
> > students mailing lists to a couple of universities, but if you guys could
> > also forward some recruiting emails to whom you think would be suitable,
> it
> > would probably help. Maybe also Sebastian can help here at TU Berlin?
> >
> >
> > On Sun, Apr 21, 2013 at 2:55 AM, Eli Reisman <apache.mailbox@gmail.com
> >wrote:
> >
> >> That's a good point. It would represent a temporary solution. I'd like to
> >> see us done with munge for all sorts of reasons if it's possible. Might
> be
> >> a
> >> difficult student project to complete cleanly though. Any other thoughts
> >> on
> >> this?
> >>
> >>
> >>
> >> On Sat, Apr 20, 2013 at 3:08 PM, Nitay Joffe <ni...@apache.org> wrote:
> >>
> >> > Just FYI our use of munge does not prevent us from doing binary jar
> >> > release. We can use classifier to release multiple Giraph jars (each
> >> built
> >> > for different Hadoop).
> >> >
> >> > See e.g.
> >> >
> >> >
> >>
> http://maven.apache.org/plugins/maven-deploy-plugin/examples/deploying-with-classifiers.html
> >> >
> >> >
> >>
> http://stackoverflow.com/questions/3092085/building-same-project-in-maven-with-different-artifactid-based-on-jdk-used
> >> >
> >> > So we would have like giraph-1.0.0-hadoop-0.20.203.0.jar,
> >> > giraph-1.0.0-hadoop-2.0.0.jar, etc
> >> >
> >> > Seems to me like we don't know how long it will take to get rid of
> >> munge.
> >> > Perhaps we should do this as pulling in Giraph into Maven projects
> would
> >> > make it a lot easier for others to work with and lessen the barrier to
> >> > entry?
> >> >
> >> > - Nitay
> >> >
> >> > On Apr 16, 2013, at 1:36 AM, Avery Ching <ac...@apache.org> wrote:
> >> >
> >> > > Okay, we're on our 3rd RC now, mainly to incorporate Roman's
> >> suggestion
> >> > about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.
> Roman,
> >> > this is a source code release, we can't really deploy jars to maven
> >> since
> >> > we use munge to support different versions of Hadoop.  Once we get rid
> >> of
> >> > munge, we'll be able to do this in the future.
> >> > >
> >> > > Release notes:
> >> > >
> http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html
> >> > >
> >> > > Release artifacts:
> >> > > http://people.apache.org/~aching/giraph-1.0.0-RC3/
> >> > >
> >> > > Corresponding git tag:
> >> > >
> >> >
> >>
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3
> >> > >
> >> > > Signing keys:
> >> > > http://people.apache.org/keys/group/giraph.asc
> >> > >
> >> > > The vote runs for 72 hours, until Thursday 4pm PST.
> >> > >
> >> > > Thanks for everyone's input!
> >> > >
> >> > > Avery
> >> >
> >> >
> >>
> >
> >
> >
> > --
> >    Claudio Martella
> >    claudio.martella@gmail.com
> >
>
>
>
> --
>    Claudio Martella
>    claudio.martella@gmail.com
>

Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Claudio Martella <cl...@gmail.com>.
Also, just to make sure we're on the same page on the munge topic and GSoC:
I started the bureaucracy for the project, but I won't be able to mentor
that project as well. I plan to mentor the tinkerpop project, in case we get a
student. In case we get students for both projects, would somebody step in
for the munge one?


On Sun, Apr 21, 2013 at 3:06 AM, Claudio Martella <
claudio.martella@gmail.com> wrote:

> I am afraid we're not going to get students this round. We really moved at
> the last minute. On this topic, I sent a couple of emails to master
> students mailing lists to a couple of universities, but if you guys could
> also forward some recruiting emails to whom you think would be suitable, it
> would probably help. Maybe also Sebastian can help here at TU Berlin?
>
>
> On Sun, Apr 21, 2013 at 2:55 AM, Eli Reisman <ap...@gmail.com>wrote:
>
>> That's a good point. It would represent a temporary solution. I'd like to
>> see us done with munge for all sorts of reasons if it's possible. Might be
>> a
>> difficult student project to complete cleanly though. Any other thoughts
>> on
>> this?
>>
>>
>>
>> On Sat, Apr 20, 2013 at 3:08 PM, Nitay Joffe <ni...@apache.org> wrote:
>>
>> > Just FYI our use of munge does not prevent us from doing binary jar
>> > release. We can use classifier to release multiple Giraph jars (each
>> built
>> > for different Hadoop).
>> >
>> > See e.g.
>> >
>> >
>> http://maven.apache.org/plugins/maven-deploy-plugin/examples/deploying-with-classifiers.html
>> >
>> >
>> http://stackoverflow.com/questions/3092085/building-same-project-in-maven-with-different-artifactid-based-on-jdk-used
>> >
>> > So we would have like giraph-1.0.0-hadoop-0.20.203.0.jar,
>> > giraph-1.0.0-hadoop-2.0.0.jar, etc
>> >
>> > Seems to me like we don't know how long it will take to get rid of
>> munge.
>> > Perhaps we should do this as pulling in Giraph into Maven projects would
>> > make it a lot easier for others to work with and lessen the barrier to
>> > entry?
>> >
>> > - Nitay
>> >
>> > On Apr 16, 2013, at 1:36 AM, Avery Ching <ac...@apache.org> wrote:
>> >
>> > > Okay, we're on our 3rd RC now, mainly to incorporate Roman's
>> suggestion
>> > about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0. Roman,
>> > this is a source code release, we can't really deploy jars to maven
>> since
>> > we use munge to support different versions of Hadoop.  Once we get rid
>> of
>> > munge, we'll be able to do this in the future.
>> > >
>> > > Release notes:
>> > > http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html
>> > >
>> > > Release artifacts:
>> > > http://people.apache.org/~aching/giraph-1.0.0-RC3/
>> > >
>> > > Corresponding git tag:
>> > >
>> >
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3
>> > >
>> > > Signing keys:
>> > > http://people.apache.org/keys/group/giraph.asc
>> > >
>> > > The vote runs for 72 hours, until Thursday 4pm PST.
>> > >
>> > > Thanks for everyone's input!
>> > >
>> > > Avery
>> >
>> >
>>
>
>
>
> --
>    Claudio Martella
>    claudio.martella@gmail.com
>



-- 
   Claudio Martella
   claudio.martella@gmail.com

Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Claudio Martella <cl...@gmail.com>.
I am afraid we're not going to get students this round. We really moved at
the last minute. On this topic, I sent a couple of emails to master's
student mailing lists at a couple of universities, but if you guys could
also forward some recruiting emails to whoever you think would be suitable, it
would probably help. Maybe Sebastian can also help here at TU Berlin?


On Sun, Apr 21, 2013 at 2:55 AM, Eli Reisman <ap...@gmail.com>wrote:

> That's a good point. It would represent a temporary solution. I'd like to
> see us done with munge for all sorts of reasons if it's possible. Might be a
> difficult student project to complete cleanly though. Any other thoughts on
> this?
>
>
>
> On Sat, Apr 20, 2013 at 3:08 PM, Nitay Joffe <ni...@apache.org> wrote:
>
> > Just FYI our use of munge does not prevent us from doing binary jar
> > release. We can use classifier to release multiple Giraph jars (each
> built
> > for different Hadoop).
> >
> > See e.g.
> >
> >
> http://maven.apache.org/plugins/maven-deploy-plugin/examples/deploying-with-classifiers.html
> >
> >
> http://stackoverflow.com/questions/3092085/building-same-project-in-maven-with-different-artifactid-based-on-jdk-used
> >
> > So we would have like giraph-1.0.0-hadoop-0.20.203.0.jar,
> > giraph-1.0.0-hadoop-2.0.0.jar, etc
> >
> > Seems to me like we don't know how long it will take to get rid of munge.
> > Perhaps we should do this as pulling in Giraph into Maven projects would
> > make it a lot easier for others to work with and lessen the barrier to
> > entry?
> >
> > - Nitay
> >
> > On Apr 16, 2013, at 1:36 AM, Avery Ching <ac...@apache.org> wrote:
> >
> > > Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
> > about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0. Roman,
> > this is a source code release, we can't really deploy jars to maven since
> > we use munge to support different versions of Hadoop.  Once we get rid of
> > munge, we'll be able to do this in the future.
> > >
> > > Release notes:
> > > http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html
> > >
> > > Release artifacts:
> > > http://people.apache.org/~aching/giraph-1.0.0-RC3/
> > >
> > > Corresponding git tag:
> > >
> >
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3
> > >
> > > Signing keys:
> > > http://people.apache.org/keys/group/giraph.asc
> > >
> > > The vote runs for 72 hours, until Thursday 4pm PST.
> > >
> > > Thanks for everyone's input!
> > >
> > > Avery
> >
> >
>



-- 
   Claudio Martella
   claudio.martella@gmail.com

Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Eli Reisman <ap...@gmail.com>.
That's a good point. It would represent a temporary solution. I'd like to
see us done with munge for all sorts of reasons if it's possible. Might be a
difficult student project to complete cleanly though. Any other thoughts on
this?



On Sat, Apr 20, 2013 at 3:08 PM, Nitay Joffe <ni...@apache.org> wrote:

> Just FYI our use of munge does not prevent us from doing binary jar
> release. We can use classifier to release multiple Giraph jars (each built
> for different Hadoop).
>
> See e.g.
>
> http://maven.apache.org/plugins/maven-deploy-plugin/examples/deploying-with-classifiers.html
>
> http://stackoverflow.com/questions/3092085/building-same-project-in-maven-with-different-artifactid-based-on-jdk-used
>
> So we would have like giraph-1.0.0-hadoop-0.20.203.0.jar,
> giraph-1.0.0-hadoop-2.0.0.jar, etc
>
> Seems to me like we don't know how long it will take to get rid of munge.
> Perhaps we should do this as pulling in Giraph into Maven projects would
> make it a lot easier for others to work with and lessen the barrier to
> entry?
>
> - Nitay
>
> On Apr 16, 2013, at 1:36 AM, Avery Ching <ac...@apache.org> wrote:
>
> > Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0. Roman,
> this is a source code release, we can't really deploy jars to maven since
> we use munge to support different versions of Hadoop.  Once we get rid of
> munge, we'll be able to do this in the future.
> >
> > Release notes:
> > http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html
> >
> > Release artifacts:
> > http://people.apache.org/~aching/giraph-1.0.0-RC3/
> >
> > Corresponding git tag:
> >
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3
> >
> > Signing keys:
> > http://people.apache.org/keys/group/giraph.asc
> >
> > The vote runs for 72 hours, until Thursday 4pm PST.
> >
> > Thanks for everyone's input!
> >
> > Avery
>
>

Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Nitay Joffe <ni...@apache.org>.
Just FYI, our use of munge does not prevent us from doing a binary jar release. We can use classifiers to release multiple Giraph jars (each built for a different Hadoop version).

See e.g.
http://maven.apache.org/plugins/maven-deploy-plugin/examples/deploying-with-classifiers.html
http://stackoverflow.com/questions/3092085/building-same-project-in-maven-with-different-artifactid-based-on-jdk-used

So we would have like giraph-1.0.0-hadoop-0.20.203.0.jar, giraph-1.0.0-hadoop-2.0.0.jar, etc

Seems to me like we don't know how long it will take to get rid of munge.
Perhaps we should do this, since pulling Giraph into Maven projects would make it a lot easier for others to work with and lower the barrier to entry?
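
As a small illustration of the naming scheme above: Maven names a
classified artifact file artifactId-version-classifier.extension. The
sketch below only composes such file names; the class and method names are
hypothetical, and the classifier strings are simply the ones from the
example jar names in this message, not a confirmed list of Giraph build
profiles.

```java
/** Hypothetical helper; illustrates the file-name convention only. */
class ClassifierJarNames {

  /**
   * Builds "artifactId-version-classifier.jar". A null or empty classifier
   * yields the plain "artifactId-version.jar" main artifact name.
   */
  static String jarName(String artifactId, String version, String classifier) {
    StringBuilder name = new StringBuilder(artifactId).append('-').append(version);
    if (classifier != null && !classifier.isEmpty()) {
      name.append('-').append(classifier);
    }
    return name.append(".jar").toString();
  }

  public static void main(String[] args) {
    // One jar per Hadoop flavour, distinguished only by classifier.
    String[] classifiers = {"hadoop-0.20.203.0", "hadoop-2.0.0"};
    for (String classifier : classifiers) {
      System.out.println(jarName("giraph", "1.0.0", classifier));
    }
  }
}
```

A downstream project would then select one of these jars by adding the
matching classifier to its dependency declaration.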

- Nitay

On Apr 16, 2013, at 1:36 AM, Avery Ching <ac...@apache.org> wrote:

> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0. Roman, this is a source code release, we can't really deploy jars to maven since we use munge to support different versions of Hadoop.  Once we get rid of munge, we'll be able to do this in the future.
> 
> Release notes:
> http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html
> 
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0.0-RC3/
> 
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3 
> 
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
> 
> The vote runs for 72 hours, until Thursday 4pm PST.
> 
> Thanks for everyone's input!
> 
> Avery


Re: [VOTE][RESULT] Release Giraph 1.0.0 (rc3)

Posted by Eli Reisman <ap...@gmail.com>.
Awesome!


On Sat, Apr 20, 2013 at 1:24 AM, Avery Ching <ac...@apache.org> wrote:

> Vote result:
>
> Binding:
> +1 Avery
> +1 Eli
> +1 Claudio
> +1 Sebastian
>
> Non-binding:
> +1 Lewis
> +1 Roman
>
> The vote passes!  I will work on getting the tarball out in the open!
>
> Thanks everyone for all your help with this release.
>
> Avery
>
>
> On 4/18/13 9:52 AM, Sebastian Schelter wrote:
>
>> +1 on the release, let's go!
>>
>> On 18.04.2013 18:49, Avery Ching wrote:
>>
>>> Currently the vote on Giraph-1.0.0 (rc3) is
>>>
>>> Binding:
>>> +1 (Avery)
>>> +1 (Eli)
>>>
>>> Non-binding:
>>> +1 (Lewis)
>>> +1 Roman
>>>
>>> We need at least one more PMC +1 to make this release official.
>>> Sebastian?  Claudio?
>>>
>>> Deadline is Thursday (today) at 10:30 PM PST.
>>>
>>> Avery
>>>
>>> On 4/17/13 10:00 PM, Roman Shaposhnik wrote:
>>>
>>>> On Mon, Apr 15, 2013 at 10:36 PM, Avery Ching <ac...@apache.org>
>>>> wrote:
>>>>
>>>>> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
>>>>> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.
>>>>> Roman,
>>>>> this is a source code release, we can't really deploy jars to maven
>>>>> since we
>>>>> use munge to support different versions of Hadoop.  Once we get rid of
>>>>> munge, we'll be able to do this in the future.
>>>>>
>>>> Understood.
>>>>
>>>>> The vote runs for 72 hours, until Thursday 4pm PST.
>>>>>
>>>>> Thanks for everyone's input!
>>>>>
>>>> +1 (non-binding).
>>>>
>>>> Built packages as part of Bigtop:
>>>>      http://bigtop01.cloudera.org:8080/view/Upstream-tests/job/Giraph-1.0.0/
>>>> tested on a 4 node fully distributed, secure
>>>> and unsecure Hadoop 2.0.4-alpha cluster.
>>>>
>>>> Thanks,
>>>> Roman.
>>>>
>>>
>

Re: [VOTE][RESULT] Release Giraph 1.0.0 (rc3)

Posted by Avery Ching <ac...@apache.org>.
Vote result:

Binding:
+1 Avery
+1 Eli
+1 Claudio
+1 Sebastian

Non-binding:
+1 Lewis
+1 Roman

The vote passes!  I will work on getting the tarball out in the open!
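
For reference, the passing rule quoted in the original vote mail (at least
three affirmative PMC votes, and more positive than negative votes) can be
written down as a tiny check. This is only a sketch: the class and method
names are made up, and counting only binding votes on both sides is an
assumption on my part.

```java
/** Hypothetical helper; names are illustrative only. */
class ReleaseVoteTally {

  /**
   * Returns true when there are at least three binding +1 votes and the
   * binding +1 votes outnumber the binding -1 votes. Non-binding votes are
   * deliberately ignored here (an assumption of this sketch).
   */
  static boolean passes(int bindingPlusOnes, int bindingMinusOnes) {
    return bindingPlusOnes >= 3 && bindingPlusOnes > bindingMinusOnes;
  }
}
```

With the four binding +1 votes listed above and no -1 votes the check
succeeds; with only two binding +1 votes, as in the interim tally, it
does not.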

Thanks everyone for all your help with this release.

Avery

On 4/18/13 9:52 AM, Sebastian Schelter wrote:
> +1 on the release, let's go!
>
> On 18.04.2013 18:49, Avery Ching wrote:
>> Currently the vote on Giraph-1.0.0 (rc3) is
>>
>> Binding:
>> +1 (Avery)
>> +1 (Eli)
>>
>> Non-binding:
>> +1 (Lewis)
>> +1 Roman
>>
>> We need at least one more PMC +1 to make this release official.
>> Sebastian?  Claudio?
>>
>> Deadline is Thursday (today) at 10:30 PM PST.
>>
>> Avery
>>
>> On 4/17/13 10:00 PM, Roman Shaposhnik wrote:
>>> On Mon, Apr 15, 2013 at 10:36 PM, Avery Ching <ac...@apache.org> wrote:
>>>> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
>>>> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.
>>>> Roman,
>>>> this is a source code release, we can't really deploy jars to maven
>>>> since we
>>>> use munge to support different versions of Hadoop.  Once we get rid of
>>>> munge, we'll be able to do this in the future.
>>> Understood.
>>>
>>>> The vote runs for 72 hours, until Thursday 4pm PST.
>>>>
>>>> Thanks for everyone's input!
>>> +1 (non-binding).
>>>
>>> Built packages as part of Bigtop:
>>>      
>>> http://bigtop01.cloudera.org:8080/view/Upstream-tests/job/Giraph-1.0.0/
>>> tested on a 4 node fully distributed, secure
>>> and unsecure Hadoop 2.0.4-alpha cluster.
>>>
>>> Thanks,
>>> Roman.


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Sebastian Schelter <ss...@apache.org>.
+1 on the release, let's go!

On 18.04.2013 18:49, Avery Ching wrote:
> Currently the vote on Giraph-1.0.0 (rc3) is
> 
> Binding:
> +1 (Avery)
> +1 (Eli)
> 
> Non-binding:
> +1 (Lewis)
> +1 Roman
> 
> We need at least one more PMC +1 to make this release official.
> Sebastian?  Claudio?
> 
> Deadline is Thursday (today) at 10:30 PM PST.
> 
> Avery
> 
> On 4/17/13 10:00 PM, Roman Shaposhnik wrote:
>> On Mon, Apr 15, 2013 at 10:36 PM, Avery Ching <ac...@apache.org> wrote:
>>> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
>>> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0. 
>>> Roman,
>>> this is a source code release, we can't really deploy jars to maven
>>> since we
>>> use munge to support different versions of Hadoop.  Once we get rid of
>>> munge, we'll be able to do this in the future.
>> Understood.
>>
>>> The vote runs for 72 hours, until Thursday 4pm PST.
>>>
>>> Thanks for everyone's input!
>> +1 (non-binding).
>>
>> Built packages as part of Bigtop:
>>     
>> http://bigtop01.cloudera.org:8080/view/Upstream-tests/job/Giraph-1.0.0/
>> tested on a 4 node fully distributed, secure
>> and unsecure Hadoop 2.0.4-alpha cluster.
>>
>> Thanks,
>> Roman.
> 


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Claudio Martella <cl...@gmail.com>.
I just tested it on our small cluster and it builds cleanly.

+1.


On Thu, Apr 18, 2013 at 6:49 PM, Avery Ching <ac...@apache.org> wrote:

> Currently the vote on Giraph-1.0.0 (rc3) is
>
> Binding:
> +1 (Avery)
> +1 (Eli)
>
> Non-binding:
> +1 (Lewis)
> +1 Roman
>
> We need at least one more PMC +1 to make this release official. Sebastian?
>  Claudio?
>
> Deadline is Thursday (today) at 10:30 PM PST.
>
> Avery
>
>
> On 4/17/13 10:00 PM, Roman Shaposhnik wrote:
>
>> On Mon, Apr 15, 2013 at 10:36 PM, Avery Ching <ac...@apache.org> wrote:
>>
>>> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
>>> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.  Roman,
>>> this is a source code release, we can't really deploy jars to maven
>>> since we
>>> use munge to support different versions of Hadoop.  Once we get rid of
>>> munge, we'll be able to do this in the future.
>>>
>> Understood.
>>
>>> The vote runs for 72 hours, until Thursday 4pm PST.
>>>
>>> Thanks for everyone's input!
>>>
>> +1 (non-binding).
>>
>> Built packages as part of Bigtop:
>>      http://bigtop01.cloudera.org:8080/view/Upstream-tests/job/Giraph-1.0.0/
>> tested on a 4 node fully distributed, secure
>> and unsecure Hadoop 2.0.4-alpha cluster.
>>
>> Thanks,
>> Roman.
>>
>
>


-- 
   Claudio Martella
   claudio.martella@gmail.com

Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Avery Ching <ac...@apache.org>.
Currently the vote on Giraph-1.0.0 (rc3) is

Binding:
+1 (Avery)
+1 (Eli)

Non-binding:
+1 (Lewis)
+1 Roman

We need at least one more PMC +1 to make this release official. 
Sebastian?  Claudio?

Deadline is Thursday (today) at 10:30 PM PST.

Avery

On 4/17/13 10:00 PM, Roman Shaposhnik wrote:
> On Mon, Apr 15, 2013 at 10:36 PM, Avery Ching <ac...@apache.org> wrote:
>> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
>> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.  Roman,
>> this is a source code release, we can't really deploy jars to maven since we
>> use munge to support different versions of Hadoop.  Once we get rid of
>> munge, we'll be able to do this in the future.
> Understood.
>
>> The vote runs for 72 hours, until Thursday 4pm PST.
>>
>> Thanks for everyone's input!
> +1 (non-binding).
>
> Built packages as part of Bigtop:
>      http://bigtop01.cloudera.org:8080/view/Upstream-tests/job/Giraph-1.0.0/
> tested on a 4 node fully distributed, secure
> and unsecure Hadoop 2.0.4-alpha cluster.
>
> Thanks,
> Roman.


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Roman Shaposhnik <rv...@apache.org>.
On Mon, Apr 15, 2013 at 10:36 PM, Avery Ching <ac...@apache.org> wrote:
> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.  Roman,
> this is a source code release, we can't really deploy jars to maven since we
> use munge to support different versions of Hadoop.  Once we get rid of
> munge, we'll be able to do this in the future.

Understood.

> The vote runs for 72 hours, until Thursday 4pm PST.
>
> Thanks for everyone's input!

+1 (non-binding).

Built packages as part of Bigtop:
    http://bigtop01.cloudera.org:8080/view/Upstream-tests/job/Giraph-1.0.0/
tested on a 4 node fully distributed, secure
and unsecure Hadoop 2.0.4-alpha cluster.

Thanks,
Roman.

Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Nitay Joffe <ni...@gmail.com>.
Just FYI, our use of munge does not prevent us from doing a jar release. We can use classifiers to release multiple Giraph jars (each built for a different Hadoop version).

See e.g.
http://maven.apache.org/plugins/maven-deploy-plugin/examples/deploying-with-classifiers.html
http://stackoverflow.com/questions/3092085/building-same-project-in-maven-with-different-artifactid-based-on-jdk-used

So we would have like giraph-1.0.0-hadoop-0.20.203.0.jar, giraph-1.0.0-hadoop-2.0.0.jar, etc

Seems to me like we don't know how long it will take to get rid of munge.
Perhaps we should do this, since pulling Giraph into Maven projects would make it a lot easier for others to work with and lower the barrier to entry?

- Nitay

On Apr 16, 2013, at 1:36 AM, Avery Ching <ac...@apache.org> wrote:

> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.  Roman, this is a source code release, we can't really deploy jars to maven since we use munge to support different versions of Hadoop.  Once we get rid of munge, we'll be able to do this in the future.
> 
> Release notes:
> http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html
> 
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0.0-RC3/
> 
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3 
> 
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
> 
> The vote runs for 72 hours, until Thursday 4pm PST.
> 
> Thanks for everyone's input!
> 
> Avery
> 
> On 4/14/13 7:12 PM, Avery Ching wrote:
>> Hopefully last RC release.  This patch includes GIRAPH-630.  I also changed the version number to be 1.0.0 so our next release can be 1.0.1.
>> 
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>> 
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>> 
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2 
>> 
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>> 
>> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>> 
>> Thanks,
>> 
>> Avery
>> 
>> On 4/12/13 10:39 PM, Avery Ching wrote:
>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1 that addresses the following issues.
>>> 
>>> * Got rid of .git repo in tarball
>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid of warnings
>>> * Pushed GIRAPH-627 to support the yarn profile better
>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>> 
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>> 
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>> 
>>> Corresponding git tag:
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1 
>>> 
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>> 
>>> The vote runs for 72 hours, until Monday 11pm PST.
>>> 
>>> Thanks,
>>> 
>>> Avery
>>> 
>>> Original message below regarding rc0:
>>> 
>>> -------------------------------
>>> 
>>> Fellow Giraphers,
>>> 
>>> We have our first release candidate since graduating from incubation.  This is a source release, primarily due to the different versions of Hadoop we support with munge (similar to the 0.1 release).  Since 0.1, we've made A TON of progress on overall performance, optimizing memory use, split vertex/edge inputs, easy interoperability with Apache Hive, and a bunch of other areas.  In many ways, this is an almost totally different codebase.  Thanks everyone for your hard work!
>>> 
>>> Apache Giraph has been running in production at Facebook (against Facebook's Corona implementation of Hadoop - https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) since around last December.  It has proven to be very scalable, performant, and enables a bunch of new applications.  Based on the drastic improvements and the use of Giraph in production, it seems appropriate to bump up our version to 1.0.
>>> 
>>> While anyone can vote, the ASF requires majority approval from the PMC -- i.e., at least three PMC members must vote affirmatively for release, and there must be more positive than negative votes. Releases may not be vetoed. Before voting +1 PMC members are required to download the signed source code package, compile it as provided, and test the resulting executable on their own platform, along with also verifying that the package meets the requirements of the ASF policy on releases.
>>> 
>>> Please test this against many other Hadoop versions and let us know how this goes!
>>> 
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>> 
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>> 
>>> Corresponding git tag:
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0 
>>> 
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>> 
>>> The vote runs for 72 hours, until Monday 4pm PST.
>>> 
>>> Thanks everyone for your patience with this release!
>>> 
>>> Avery
>> 
> 


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Eli Reisman <ap...@gmail.com>.
+1


On Tue, Apr 16, 2013 at 1:36 AM, Avery Ching <ac...@apache.org> wrote:

> Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
> about fixing the maven version number from 0.2-SNAPSHOT => 1.0.0.  Roman,
> this is a source code release, we can't really deploy jars to maven since
> we use munge to support different versions of Hadoop.  Once we get rid of
> munge, we'll be able to do this in the future.
>
> Release notes:
> http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html
>
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0.0-RC3/
>
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3
>
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
>
> The vote runs for 72 hours, until Thursday 4pm PST.
>
> Thanks for everyone's input!
>
> Avery
>
> On 4/14/13 7:12 PM, Avery Ching wrote:
>
>> Hopefully last RC release.  This patch includes GIRAPH-630.  I also
>> changed the version number to be 1.0.0 so our next release can be 1.0.1.
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> On 4/12/13 10:39 PM, Avery Ching wrote:
>>
>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>> that addresses the following issues.
>>>
>>> * Got rid of .git repo in tarball
>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>> of warnings
>>> * Pushed GIRAPH-627 to support the yarn profile better
>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>
>>> Corresponding git tag:
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>
>>> Thanks,
>>>
>>> Avery
>>>
>>> Original message below regarding rc0:
>>>
>>> -------------------------------
>>>
>>> Fellow Giraphers,
>>>
>>> We have our first release candidate since graduating from incubation.
>>>  This is a source release, primarily due to the different versions of
>>> Hadoop we support with munge (similar to the 0.1 release).  Since 0.1,
>>> we've made A TON of progress on overall performance, optimizing memory use,
>>> split vertex/edge inputs, easy interoperability with Apache Hive, and a
>>> bunch of other areas.  In many ways, this is an almost totally different
>>> codebase.  Thanks everyone for your hard work!
>>>
>>> Apache Giraph has been running in production at Facebook (against
>>> Facebook's Corona implementation of Hadoop -
>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>>> since around last December.  It has proven to be very scalable, performant,
>>> and enables a bunch of new applications.  Based on the drastic improvements
>>> and the use of Giraph in production, it seems appropriate to bump up our
>>> version to 1.0.
>>>
>>> While anyone can vote, the ASF requires majority approval from the PMC
>>> -- i.e., at least three PMC members must vote affirmatively for release,
>>> and there must be more positive than negative votes. Releases may not be
>>> vetoed. Before voting +1 PMC members are required to download the signed
>>> source code package, compile it as provided, and test the resulting
>>> executable on their own platform, along with also verifying that the
>>> package meets the requirements of the ASF policy on releases.
>>>
>>> Please test this against many other Hadoop versions and let us know how
>>> this goes!
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>
>>> Corresponding git tag:
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>
>>> Thanks everyone for your patience with this release!
>>>
>>> Avery
>>>
>>
>>
>

[VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Avery Ching <ac...@apache.org>.
Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
about fixing the Maven version number from 0.2-SNAPSHOT => 1.0.0.
Roman, this is a source code release; we can't really deploy jars to
Maven since we use munge to support different versions of Hadoop.  Once
we get rid of munge, we'll be able to do this in the future.

Release notes:
http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html

Release artifacts:
http://people.apache.org/~aching/giraph-1.0.0-RC3/

Corresponding git tag:
https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3 


Signing keys:
http://people.apache.org/keys/group/giraph.asc

The vote runs for 72 hours, until Thursday 4pm PST.

Thanks for everyone's input!

Avery

On 4/14/13 7:12 PM, Avery Ching wrote:
> Hopefully last RC release.  This patch includes GIRAPH-630.  I also 
> changed the version number to be 1.0.0 so our next release can be 1.0.1.
>
> Release notes:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2 
>
>
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
>
> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>
> Thanks,
>
> Avery
>
> On 4/12/13 10:39 PM, Avery Ching wrote:
>> Thanks to the quick feedback from Roman and Lewis, we have cut a new 
>> RC1 that addresses the following issues.
>>
>> * Got rid of .git repo in tarball
>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>> * Used gnutar in OSX rather than tar to generate the tarball and get 
>> rid of warnings
>> * Pushed GIRAPH-627 to support the yarn profile better
>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1 
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 11pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> Original message below regarding rc0:
>>
>> -------------------------------
>>
>> Fellow Giraphers,
>>
>> We have our first release candidate since graduating from
>> incubation.  This is a source release, primarily due to the different 
>> versions of Hadoop we support with munge (similar to the 0.1 
>> release).  Since 0.1, we've made A TON of progress on overall 
>> performance, optimizing memory use, split vertex/edge inputs, easy 
>> interoperability with Apache Hive, and a bunch of other areas.  In 
>> many ways, this is an almost totally different codebase.  Thanks 
>> everyone for your hard work!
>>
>> Apache Giraph has been running in production at Facebook (against 
>> Facebook's Corona implementation of Hadoop - 
>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) 
>> since around last December.  It has proven to be very scalable, 
>> performant, and enables a bunch of new applications.  Based on the 
>> drastic improvements and the use of Giraph in production, it seems 
>> appropriate to bump up our version to 1.0.
>>
>> While anyone can vote, the ASF requires majority approval from the 
>> PMC -- i.e., at least three PMC members must vote affirmatively for 
>> release, and there must be more positive than negative votes. 
>> Releases may not be vetoed. Before voting +1 PMC members are required 
>> to download the signed source code package, compile it as provided, 
>> and test the resulting executable on their own platform, along with 
>> also verifying that the package meets the requirements of the ASF 
>> policy on releases.
>>
>> Please test this against many other Hadoop versions and let us know 
>> how this goes!
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0 
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 4pm PST.
>>
>> Thanks everyone for your patience with this release!
>>
>> Avery
>


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc2)

Posted by Sebastian Schelter <ss...@apache.org>.
Downloaded the RC, built it for profile hadoop_1.0, and used it to compute
PageRank on a 1B-edge web graph on a cluster of 26 machines running
Hadoop 1.0.4.

Everything went well, convergence detection worked, so +1 on releasing
this RC from my side.

Best,
Sebastian


On 15.04.2013 04:12, Avery Ching wrote:
> Hopefully last RC release.  This patch includes GIRAPH-630.  I also
> changed the version number to be 1.0.0 so our next release can be 1.0.1.
> 
> Release notes:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
> 
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/
> 
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2
> 
> 
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
> 
> The vote runs for 72 hours, until Wednesday 7:15pm PST.
> 
> Thanks,
> 
> Avery
> 
> On 4/12/13 10:39 PM, Avery Ching wrote:
>> Thanks to the quick feedback from Roman and Lewis, we have cut a new
>> RC1 that addresses the following issues.
>>
>> * Got rid of .git repo in tarball
>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>> * Used gnutar in OSX rather than tar to generate the tarball and get
>> rid of warnings
>> * Pushed GIRAPH-627 to support the yarn profile better
>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 11pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> Original message below regarding rc0:
>>
>> -------------------------------
>>
>> Fellow Giraphers,
>>
>> We have our first release candidate since graduating from
>> incubation.  This is a source release, primarily due to the different
>> versions of Hadoop we support with munge (similar to the 0.1
>> release).  Since 0.1, we've made A TON of progress on overall
>> performance, optimizing memory use, split vertex/edge inputs, easy
>> interoperability with Apache Hive, and a bunch of other areas.  In
>> many ways, this is an almost totally different codebase.  Thanks
>> everyone for your hard work!
>>
>> Apache Giraph has been running in production at Facebook (against
>> Facebook's Corona implementation of Hadoop -
>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona)
>> since around last December.  It has proven to be very scalable,
>> performant, and enables a bunch of new applications.  Based on the
>> drastic improvements and the use of Giraph in production, it seems
>> appropriate to bump up our version to 1.0.
>>
>> While anyone can vote, the ASF requires majority approval from the PMC
>> -- i.e., at least three PMC members must vote affirmatively for
>> release, and there must be more positive than negative votes. Releases
>> may not be vetoed. Before voting +1 PMC members are required to
>> download the signed source code package, compile it as provided, and
>> test the resulting executable on their own platform, along with also
>> verifying that the package meets the requirements of the ASF policy on
>> releases.
>>
>> Please test this against many other Hadoop versions and let us know
>> how this goes!
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 4pm PST.
>>
>> Thanks everyone for your patience with this release!
>>
>> Avery
> 


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc2)

Posted by Eli Reisman <ap...@gmail.com>.
You have a running +1 from me; consider it a standing vote for any more
RCs needed to make the release happen.

@Roman/Avery: With regard to the YARN versioning, if you'd rather (for the
short term) have me bump the hardcoded Hadoop version to 2.0.4-alpha in
anticipation of a more enlightened Maven fix to select versions via -D
options, I can do that tonight/tomorrow to get it into RC3. If you don't
hear back tonight, you can CC me at ereisman@etsy.com and I'll get it.

I will also update the wiki soon to explain how to run the YARN version,
how it works at a high level, and even (to be removed later) the 2 or 3
spots in 2 POM files where you need to change the hardcoded Hadoop version
if you want to build against the current YARN impl that will end up in
our release. It's not appropriate as a replacement for the correct
command-line selection, but it is certainly quick and easy to do on the
fly, especially if users will be building from source anyway. There is no
Hadoop versioning set in code in GIRAPH-13, so the Maven scripts are the
only changes required.




On Mon, Apr 15, 2013 at 4:55 PM, Avery Ching <ac...@apache.org> wrote:

> Thanks for pointing this out, Roman. I'll push out RC3 sometime today with
> those changes.
>
> Avery
>
>
> On 4/15/13 1:46 PM, Roman Shaposhnik wrote:
>
>> Avery,
>>
>> when I download the latest RC as:
>>      http://people.apache.org/~aching/giraph-1.0.0-RC2/giraph-1.0.0.tar.gz
>>
>> it looks like the internal Maven versions in POM files
>> are still 0.2-SNAPSHOT -- that needs to be fixed.
>>
>> Also, once that is fixed, could you please make the
>> staged Maven artifacts available as well so that
>> I can test those:
>>      http://www.apache.org/dev/publishing-maven-artifacts.html#staging-maven
>>
>> As it stands, unfortunately, I think we might need RC3.
>>
>> Thanks,
>> Roman.
>>
>> On Sun, Apr 14, 2013 at 7:12 PM, Avery Ching <ac...@apache.org> wrote:
>>
>>> Hopefully last RC release.  This patch includes GIRAPH-630.  I also changed
>>> the version number to be 1.0.0 so our next release can be 1.0.1.
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>>>
>>> Corresponding git tag:
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>>>
>>> Thanks,
>>>
>>> Avery
>>>
>>> On 4/12/13 10:39 PM, Avery Ching wrote:
>>>
>>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>>> that addresses the following issues.
>>>>
>>>> * Got rid of .git repo in tarball
>>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>>> of warnings
>>>> * Pushed GIRAPH-627 to support the yarn profile better
>>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>>
>>>> Release notes:
>>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>>
>>>> Release artifacts:
>>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>>
>>>> Corresponding git tag:
>>>>
>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>>
>>>> Signing keys:
>>>> http://people.apache.org/keys/group/giraph.asc
>>>>
>>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>>
>>>> Thanks,
>>>>
>>>> Avery
>>>>
>>>> Original message below regarding rc0:
>>>>
>>>> -------------------------------
>>>>
>>>> Fellow Giraphers,
>>>>
>>>> We have our first release candidate since graduating from incubation.
>>>> This is a source release, primarily due to the different versions of Hadoop
>>>> we support with munge (similar to the 0.1 release).  Since 0.1, we've made A
>>>> TON of progress on overall performance, optimizing memory use, split
>>>> vertex/edge inputs, easy interoperability with Apache Hive, and a bunch of
>>>> other areas.  In many ways, this is an almost totally different codebase.
>>>> Thanks everyone for your hard work!
>>>>
>>>> Apache Giraph has been running in production at Facebook (against
>>>> Facebook's Corona implementation of Hadoop -
>>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) since
>>>> around last December.  It has proven to be very scalable, performant, and
>>>> enables a bunch of new applications.  Based on the drastic improvements and
>>>> the use of Giraph in production, it seems appropriate to bump up our version
>>>> to 1.0.
>>>>
>>>> While anyone can vote, the ASF requires majority approval from the PMC --
>>>> i.e., at least three PMC members must vote affirmatively for release, and
>>>> there must be more positive than negative votes. Releases may not be vetoed.
>>>> Before voting +1 PMC members are required to download the signed source code
>>>> package, compile it as provided, and test the resulting executable on their
>>>> own platform, along with also verifying that the package meets the
>>>> requirements of the ASF policy on releases.
>>>>
>>>> Please test this against many other Hadoop versions and let us know how
>>>> this goes!
>>>>
>>>> Release notes:
>>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>>
>>>> Release artifacts:
>>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>>
>>>> Corresponding git tag:
>>>>
>>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>>
>>>> Signing keys:
>>>> http://people.apache.org/keys/group/giraph.asc
>>>>
>>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>>
>>>> Thanks everyone for your patience with this release!
>>>>
>>>> Avery
>>>>
>>>
>>>
>

Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc2)

Posted by Avery Ching <ac...@apache.org>.
Thanks for pointing this out, Roman. I'll push out RC3 sometime today 
with those changes.

Avery

On 4/15/13 1:46 PM, Roman Shaposhnik wrote:
> Avery,
>
> when I download the latest RC as:
>      http://people.apache.org/~aching/giraph-1.0.0-RC2/giraph-1.0.0.tar.gz
>
> it looks like the internal Maven versions in POM files
> are still 0.2-SNAPSHOT -- that needs to be fixed.
>
> Also, once that is fixed, could you please make the
> staged Maven artifacts available as well so that
> I can test those:
>      http://www.apache.org/dev/publishing-maven-artifacts.html#staging-maven
>
> As it stands, unfortunately, I think we might need RC3.
>
> Thanks,
> Roman.
>
> On Sun, Apr 14, 2013 at 7:12 PM, Avery Ching <ac...@apache.org> wrote:
>> Hopefully last RC release.  This patch includes GIRAPH-630.  I also changed
>> the version number to be 1.0.0 so our next release can be 1.0.1.
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> On 4/12/13 10:39 PM, Avery Ching wrote:
>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>> that addresses the following issues.
>>>
>>> * Got rid of .git repo in tarball
>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>> of warnings
>>> * Pushed GIRAPH-627 to support the yarn profile better
>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>
>>> Corresponding git tag:
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>
>>> Thanks,
>>>
>>> Avery
>>>
>>> Original message below regarding rc0:
>>>
>>> -------------------------------
>>>
>>> Fellow Giraphers,
>>>
>>> We have our first release candidate since graduating from incubation.
>>> This is a source release, primarily due to the different versions of Hadoop
>>> we support with munge (similar to the 0.1 release).  Since 0.1, we've made A
>>> TON of progress on overall performance, optimizing memory use, split
>>> vertex/edge inputs, easy interoperability with Apache Hive, and a bunch of
>>> other areas.  In many ways, this is an almost totally different codebase.
>>> Thanks everyone for your hard work!
>>>
>>> Apache Giraph has been running in production at Facebook (against
>>> Facebook's Corona implementation of Hadoop -
>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) since
>>> around last December.  It has proven to be very scalable, performant, and
>>> enables a bunch of new applications.  Based on the drastic improvements and
>>> the use of Giraph in production, it seems appropriate to bump up our version
>>> to 1.0.
>>>
>>> While anyone can vote, the ASF requires majority approval from the PMC --
>>> i.e., at least three PMC members must vote affirmatively for release, and
>>> there must be more positive than negative votes. Releases may not be vetoed.
>>> Before voting +1 PMC members are required to download the signed source code
>>> package, compile it as provided, and test the resulting executable on their
>>> own platform, along with also verifying that the package meets the
>>> requirements of the ASF policy on releases.
>>>
>>> Please test this against many other Hadoop versions and let us know how
>>> this goes!
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>
>>> Corresponding git tag:
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>
>>> Thanks everyone for your patience with this release!
>>>
>>> Avery
>>


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc2)

Posted by Avery Ching <ac...@apache.org>.
Thanks for pointing this out, Roman. I'll push out RC3 sometime today 
with those changes.

Avery

On 4/15/13 1:46 PM, Roman Shaposhnik wrote:
> Avery,
>
> when I download the latest RC as:
>      http://people.apache.org/~aching/giraph-1.0.0-RC2/giraph-1.0.0.tar.gz
>
> it looks like the internal Maven versions in POM files
> are still 0.2-SNAPSHOT -- that needs to be fixed.
>
> Also, once that is fixed, could you please make the
> staged Maven artifacts available as well so that
> I can test those:
>      http://www.apache.org/dev/publishing-maven-artifacts.html#staging-maven
>
> As it stands, unfortunately, I think we might need RC3.
>
> Thanks,
> Roman.
>
> On Sun, Apr 14, 2013 at 7:12 PM, Avery Ching <ac...@apache.org> wrote:
>> Hopefully last RC release.  This patch includes GIRAPH-630.  I also changed
>> the version number to be 1.0.0 so our next release can be 1.0.1.
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> On 4/12/13 10:39 PM, Avery Ching wrote:
>>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>>> that addresses the following issues.
>>>
>>> * Got rid of .git repo in tarball
>>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>>> of warnings
>>> * Pushed GIRAPH-627 to support the yarn profile better
>>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>>
>>> Corresponding git tag:
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 11pm PST.
>>>
>>> Thanks,
>>>
>>> Avery
>>>
>>> Original message below regarding rc0:
>>>
>>> -------------------------------
>>>
>>> Fellow Giraphers,
>>>
>>> We have our first release candidate since graduating from incubation.
>>> This is a source release, primarily due to the different versions of Hadoop
>>> we support with munge (similar to the 0.1 release).  Since 0.1, we've made A
>>> TON of progress on overall performance, optimizing memory use, split
>>> vertex/edge inputs, easy interoperability with Apache Hive, and a bunch of
>>> other areas.  In many ways, this is an almost totally different codebase.
>>> Thanks everyone for your hard work!
>>>
>>> Apache Giraph has been running in production at Facebook (against
>>> Facebook's Corona implementation of Hadoop -
>>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) since
>>> around last December.  It has proven to be very scalable, performant, and
>>> enables a bunch of new applications.  Based on the drastic improvements and
>>> the use of Giraph in production, it seems appropriate to bump up our version
>>> to 1.0.
>>>
>>> While anyone can vote, the ASF requires majority approval from the PMC --
>>> i.e., at least three PMC members must vote affirmatively for release, and
>>> there must be more positive than negative votes. Releases may not be vetoed.
>>> Before voting +1 PMC members are required to download the signed source code
>>> package, compile it as provided, and test the resulting executable on their
>>> own platform, along with also verifying that the package meets the
>>> requirements of the ASF policy on releases.
>>>
>>> Please test this against many other Hadoop versions and let us know how
>>> this goes!
>>>
>>> Release notes:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>>
>>> Release artifacts:
>>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>>
>>> Corresponding git tag:
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>>
>>> Signing keys:
>>> http://people.apache.org/keys/group/giraph.asc
>>>
>>> The vote runs for 72 hours, until Monday 4pm PST.
>>>
>>> Thanks everyone for your patience with this release!
>>>
>>> Avery
>>


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc2)

Posted by Roman Shaposhnik <rv...@apache.org>.
Avery,

when I download the latest RC as:
    http://people.apache.org/~aching/giraph-1.0.0-RC2/giraph-1.0.0.tar.gz

it looks like the internal Maven versions in POM files
are still 0.2-SNAPSHOT -- that needs to be fixed.
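
For anyone who wants to repeat this check on an unpacked source tarball,
here is a minimal sketch, not part of the release tooling. It assumes the
tarball has already been extracted to a local directory; the function names
(pom_version, find_snapshot_poms) are made up for illustration. It walks the
tree, reads each pom.xml, and reports any whose version (or inherited
parent version) still ends in -SNAPSHOT.

```python
import os
import xml.etree.ElementTree as ET


def _local(tag):
    """Drop an XML namespace prefix such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def pom_version(pom_path):
    """Return the project's own <version>, falling back to <parent><version>."""
    root = ET.parse(pom_path).getroot()
    own = parent = None
    for child in root:
        name = _local(child.tag)
        if name == "version":
            own = (child.text or "").strip()
        elif name == "parent":
            for sub in child:
                if _local(sub.tag) == "version":
                    parent = (sub.text or "").strip()
    return own or parent


def find_snapshot_poms(source_dir):
    """Map each pom.xml under source_dir with a -SNAPSHOT version to that version."""
    hits = {}
    for dirpath, _dirs, files in os.walk(source_dir):
        if "pom.xml" in files:
            path = os.path.join(dirpath, "pom.xml")
            version = pom_version(path)
            if version and version.endswith("-SNAPSHOT"):
                hits[path] = version
    return hits
```

Calling find_snapshot_poms on the extracted directory should return an empty
dict for a correctly versioned release candidate. Property placeholders such
as ${project.version} are not resolved by this sketch.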

Also, once that is fixed, could you please make the
staged Maven artifacts available as well so that
I can test those:
    http://www.apache.org/dev/publishing-maven-artifacts.html#staging-maven

As it stands, unfortunately, I think we might need RC3.

Thanks,
Roman.

On Sun, Apr 14, 2013 at 7:12 PM, Avery Ching <ac...@apache.org> wrote:
> Hopefully last RC release.  This patch includes GIRAPH-630.  I also changed
> the version number to be 1.0.0 so our next release can be 1.0.1.
>
> Release notes:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2
>
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
>
> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>
> Thanks,
>
> Avery
>
> On 4/12/13 10:39 PM, Avery Ching wrote:
>>
>> Thanks to the quick feedback from Roman and Lewis, we have cut a new RC1
>> that addresses the following issues.
>>
>> * Got rid of .git repo in tarball
>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>> * Used gnutar in OSX rather than tar to generate the tarball and get rid
>> of warnings
>> * Pushed GIRAPH-627 to support the yarn profile better
>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>
>> Corresponding git tag:
>>
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 11pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> Original message below regarding rc0:
>>
>> -------------------------------
>>
>> Fellow Giraphers,
>>
>> We have our first release candidate since graduating from incubation.
>> This is a source release, primarily due to the different versions of Hadoop
>> we support with munge (similar to the 0.1 release).  Since 0.1, we've made A
>> TON of progress on overall performance, optimizing memory use, split
>> vertex/edge inputs, easy interoperability with Apache Hive, and a bunch of
>> other areas.  In many ways, this is an almost totally different codebase.
>> Thanks everyone for your hard work!
>>
>> Apache Giraph has been running in production at Facebook (against
>> Facebook's Corona implementation of Hadoop -
>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) since
>> around last December.  It has proven to be very scalable, performant, and
>> enables a bunch of new applications.  Based on the drastic improvements and
>> the use of Giraph in production, it seems appropriate to bump up our version
>> to 1.0.
>>
>> While anyone can vote, the ASF requires majority approval from the PMC --
>> i.e., at least three PMC members must vote affirmatively for release, and
>> there must be more positive than negative votes. Releases may not be vetoed.
>> Before voting +1 PMC members are required to download the signed source code
>> package, compile it as provided, and test the resulting executable on their
>> own platform, along with also verifying that the package meets the
>> requirements of the ASF policy on releases.
>>
>> Please test this against many other Hadoop versions and let us know how
>> this goes!
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>
>> Corresponding git tag:
>>
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 4pm PST.
>>
>> Thanks everyone for your patience with this release!
>>
>> Avery
>
>

[VOTE][CHANGED] Release Giraph 1.0.0 (rc3)

Posted by Avery Ching <ac...@apache.org>.
Okay, we're on our 3rd RC now, mainly to incorporate Roman's suggestion
about fixing the Maven version number (0.2-SNAPSHOT => 1.0.0).
Roman, this is a source code release; we can't really deploy jars to
Maven since we use munge to support different versions of Hadoop.  Once
we get rid of munge, we'll be able to do this.

Release notes:
http://people.apache.org/~aching/giraph-1.0.0-RC3/RELEASE_NOTES.html

Release artifacts:
http://people.apache.org/~aching/giraph-1.0.0-RC3/

Corresponding git tag:
https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC3 


Signing keys:
http://people.apache.org/keys/group/giraph.asc

The vote runs for 72 hours, until Thursday 4pm PST.

Thanks for everyone's input!

Avery

On 4/14/13 7:12 PM, Avery Ching wrote:
> Hopefully last RC release.  This patch includes GIRAPH-630.  I also 
> changed the version number to be 1.0.0 so our next release can be 1.0.1.
>
> Release notes:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html
>
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0.0-RC2/
>
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2 
>
>
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
>
> The vote runs for 72 hours, until Wednesday 7:15pm PST.
>
> Thanks,
>
> Avery
>
> On 4/12/13 10:39 PM, Avery Ching wrote:
>> Thanks to the quick feedback from Roman and Lewis, we have cut a new 
>> RC1 that addresses the following issues.
>>
>> * Got rid of .git repo in tarball
>> * Fixed issue with not compiling without git repo (GIRAPH-628)
>> * Used gnutar in OSX rather than tar to generate the tarball and get 
>> rid of warnings
>> * Pushed GIRAPH-627 to support the yarn profile better
>> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC1/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1 
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 11pm PST.
>>
>> Thanks,
>>
>> Avery
>>
>> Original message below regarding rc0:
>>
>> -------------------------------
>>
>> Fellow Giraphers,
>>
>> We have our first release candidate since graduating from 
>> incubation.  This is a source release, primarily due to the different 
>> versions of Hadoop we support with munge (similar to the 0.1 
>> release).  Since 0.1, we've made A TON of progress on overall 
>> performance, optimizing memory use, split vertex/edge inputs, easy 
>> interoperability with Apache Hive, and a bunch of other areas.  In 
>> many ways, this is an almost totally different codebase.  Thanks 
>> everyone for your hard work!
>>
>> Apache Giraph has been running in production at Facebook (against 
>> Facebook's Corona implementation of Hadoop - 
>> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) 
>> since around last December.  It has proven to be very scalable, 
>> performant, and enables a bunch of new applications.  Based on the 
>> drastic improvements and the use of Giraph in production, it seems 
>> appropriate to bump up our version to 1.0.
>>
>> While anyone can vote, the ASF requires majority approval from the 
>> PMC -- i.e., at least three PMC members must vote affirmatively for 
>> release, and there must be more positive than negative votes. 
>> Releases may not be vetoed. Before voting +1 PMC members are required 
>> to download the signed source code package, compile it as provided, 
>> and test the resulting executable on their own platform, along with 
>> also verifying that the package meets the requirements of the ASF 
>> policy on releases.
>>
>> Please test this against many other Hadoop versions and let us know 
>> how this goes!
>>
>> Release notes:
>> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>>
>> Release artifacts:
>> http://people.apache.org/~aching/giraph-1.0-RC0/
>>
>> Corresponding git tag:
>> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0 
>>
>>
>> Signing keys:
>> http://people.apache.org/keys/group/giraph.asc
>>
>> The vote runs for 72 hours, until Monday 4pm PST.
>>
>> Thanks everyone for your patience with this release!
>>
>> Avery
>


Re: [VOTE][CHANGED] Release Giraph 1.0.0 (rc2)

Posted by Avery Ching <ac...@apache.org>.
Hopefully last RC release.  This patch includes GIRAPH-630.  I also 
changed the version number to be 1.0.0 so our next release can be 1.0.1.

Release notes:
http://people.apache.org/~aching/giraph-1.0.0-RC2/RELEASE_NOTES.html

Release artifacts:
http://people.apache.org/~aching/giraph-1.0.0-RC2/

Corresponding git tag:
https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0.0-RC2 


Signing keys:
http://people.apache.org/keys/group/giraph.asc

The vote runs for 72 hours, until Wednesday 7:15pm PST.

Thanks,

Avery

On 4/12/13 10:39 PM, Avery Ching wrote:
> Thanks to the quick feedback from Roman and Lewis, we have cut a new 
> RC1 that addresses the following issues.
>
> * Got rid of .git repo in tarball
> * Fixed issue with not compiling without git repo (GIRAPH-628)
> * Used gnutar in OSX rather than tar to generate the tarball and get 
> rid of warnings
> * Pushed GIRAPH-627 to support the yarn profile better
> * Tarball name changed to the final artifact name (giraph-1.0.tar.gz)
>
> Release notes:
> http://people.apache.org/~aching/giraph-1.0-RC1/RELEASE_NOTES.html
>
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0-RC1/
>
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC1 
>
>
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
>
> The vote runs for 72 hours, until Monday 11pm PST.
>
> Thanks,
>
> Avery
>
> Original message below regarding rc0:
>
> -------------------------------
>
> Fellow Giraphers,
>
> We have our first release candidate since graduating from 
> incubation.  This is a source release, primarily due to the different 
> versions of Hadoop we support with munge (similar to the 0.1 
> release).  Since 0.1, we've made A TON of progress on overall 
> performance, optimizing memory use, split vertex/edge inputs, easy 
> interoperability with Apache Hive, and a bunch of other areas.  In 
> many ways, this is an almost totally different codebase.  Thanks 
> everyone for your hard work!
>
> Apache Giraph has been running in production at Facebook (against 
> Facebook's Corona implementation of Hadoop - 
> https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona) 
> since around last December.  It has proven to be very scalable, 
> performant, and enables a bunch of new applications.  Based on the 
> drastic improvements and the use of Giraph in production, it seems 
> appropriate to bump up our version to 1.0.
>
> While anyone can vote, the ASF requires majority approval from the PMC 
> -- i.e., at least three PMC members must vote affirmatively for 
> release, and there must be more positive than negative votes. Releases 
> may not be vetoed. Before voting +1 PMC members are required to 
> download the signed source code package, compile it as provided, and 
> test the resulting executable on their own platform, along with also 
> verifying that the package meets the requirements of the ASF policy on 
> releases.
>
> Please test this against many other Hadoop versions and let us know 
> how this goes!
>
> Release notes:
> http://people.apache.org/~aching/giraph-1.0-RC0/RELEASE_NOTES.html
>
> Release artifacts:
> http://people.apache.org/~aching/giraph-1.0-RC0/
>
> Corresponding git tag:
> https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=shortlog;h=refs/tags/release-1.0-RC0 
>
>
> Signing keys:
> http://people.apache.org/keys/group/giraph.asc
>
> The vote runs for 72 hours, until Monday 4pm PST.
>
> Thanks everyone for your patience with this release!
>
> Avery

