Posted to general@incubator.apache.org by Jochen Wiedmann <jo...@gmail.com> on 2015/07/14 01:24:56 UTC

Podling request: Gerrit

Hi,

I am writing as one of the Mentors of the AsterixDB podling.

It recently came to my attention that the project in fact uses multiple
Git repositories, one of them located outside the ASF. I understand the
structure to be like this:

  +--------+  Commits   +----------------+  Mirrors   +-----------+
  | Gerrit | ---------> | Git (External) | ---------> | Git (ASF) |
  +--------+            +----------------+            +-----------+

The structure is set up like this because the project members want
no commit to enter without a review, which is done in Gerrit. [2]
(In the past, this was ensured by a commit hook in the external
repository. That commit hook possibly still exists, but it doesn't
prevent code from entering the ASF repository directly without a
review. This lack of security is currently being discussed by the
podling's project members.)

I understand the desire, and, to me, it makes sense. OTOH, I suspect
that this issue might affect a successful incubation. Hence this mail.

As Git is slowly gaining ground within the ASF, I'd suggest that a
possible resolution might be to have a Gerrit instance within the ASF.
Given how GitHub pull requests are already discussed by many projects,
I can imagine that many projects would like to adopt a similar policy.

How about that?

Thanks,

Jochen






[1] http://mail-archives.apache.org/mod_mbox/incubator-asterixdb-dev/201507.mbox/%3CCAN_YF5zRWZijKOQyYx59%2B7wUyXkPg0P2d-c2hBrx64mNFd4hBg%40mail.gmail.com%3E
[2] https://en.wikipedia.org/wiki/Gerrit_(software)

-- 
Any world that can produce the Taj Mahal, William Shakespeare,
and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: Podling request: Gerrit

Posted by Chris Hillery <ch...@lambda.nu>.
On Wed, Jul 15, 2015 at 2:04 AM, Jochen Wiedmann <jo...@gmail.com>
wrote:

> On Wed, Jul 15, 2015 at 10:40 AM, Chris Hillery <ch...@hillery.land>
> wrote:
>
> > I feel sure that
> > with an ASF-hosted Gerrit, we wouldn't be able to install any hooks or
> > plugins, or manage permissions, or anything like that in the way that we
> > find useful.
>
> Not initially, to be sure. But in the medium term, and with a bit of
> flexibility, I'd bet against you on that.
>

Well, that's encouraging. To be honest, my initial comment was more on the
limitations of Gerrit rather than a commentary on anything ASF-specific.
Gerrit plugins are kind of a nightmare to manage. Gerrit hooks are easy
enough, but are (like Git hooks) shell scripts that run on the Gerrit host,
which I'm guessing would be extremely challenging to manage safely for a
large multi-client installation.

Again, though, I don't think having ASF host Gerrit is really necessary. I
do think having a mutually-agreeable working method available for teams
that would like to use their own hosted Gerrit for code-review purposes is
necessary, though. I can't currently see any reasons why that should run
afoul of ASF policies...

Ceej
aka Chris Hillery

Re: Podling request: Gerrit

Posted by Jochen Wiedmann <jo...@gmail.com>.
On Wed, Jul 15, 2015 at 10:40 AM, Chris Hillery <ch...@hillery.land> wrote:

> I feel sure that
> with an ASF-hosted Gerrit, we wouldn't be able to install any hooks or
> plugins, or manage permissions, or anything like that in the way that we
> find useful.

Not initially, to be sure. But in the medium term, and with a bit of
flexibility, I'd bet against you on that.

Jochen




-- 
Any world that can produce the Taj Mahal, William Shakespeare,
and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)

Re: Podling request: Gerrit

Posted by Chris Hillery <ch...@hillery.land>.
On Mon, Jul 13, 2015 at 5:01 PM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> IIRC, the problem with the Gerrit workflow is that the actual
> pushes into the repo are done by the bot.


That's how we (the Asterix project) had things set up, but it isn't a
requirement of Gerrit. In fact we had to kind of duct-tape that facility
onto Gerrit.


> This
> runs against ASF's desire to keep push logs that actually
> make sense.
>

When we switched to using the ASF repositories as canonical, we disabled
that auto-push, and I wrote a script that tried to simplify the process of
manually transferring the changes from Gerrit to the ASF repositories. That
should satisfy your requirements as I understand them; the actual "git
push" will be performed manually by a specific individual on the committers
list, and can be logged as such.
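
For illustration only, here is a minimal sketch of what such a manual
transfer could look like. It is not the script mentioned above. The remote
names `gerrit` and `asf`, the branch `master`, and the change and patch-set
numbers are all assumptions. The sketch only builds the git command lines
(it does not run them), relying on Gerrit's convention of publishing each
patch set under refs/changes/<last two digits>/<change>/<patch set>.

```python
from typing import List


def gerrit_change_ref(change: int, patchset: int) -> str:
    """Build the ref under which Gerrit publishes one patch set.

    Gerrit shards these refs by the last two digits of the change number:
    refs/changes/<NN>/<change number>/<patch set number>.
    """
    return f"refs/changes/{change % 100:02d}/{change}/{patchset}"


def transfer_commands(change: int, patchset: int,
                      gerrit_remote: str = "gerrit",  # hypothetical remote name
                      asf_remote: str = "asf",        # hypothetical remote name
                      branch: str = "master") -> List[List[str]]:
    """Return the git invocations a committer would run by hand."""
    ref = gerrit_change_ref(change, patchset)
    return [
        ["git", "checkout", branch],
        # Start from the current state of the canonical ASF repository.
        ["git", "pull", "--ff-only", asf_remote, branch],
        # Fetch the reviewed patch set from Gerrit and apply it locally.
        ["git", "fetch", gerrit_remote, ref],
        ["git", "cherry-pick", "FETCH_HEAD"],
        # The push is done by the individual committer, so the ASF push
        # log records a person rather than a bot.
        ["git", "push", asf_remote, f"HEAD:{branch}"],
    ]


if __name__ == "__main__":
    for argv in transfer_commands(change=1234, patchset=5):
        print(" ".join(argv))
```

Keeping the command list separate from its execution means a committer can
inspect every step before anything touches the ASF repository.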


Responding to the rest of the conversation about having a hosted Gerrit
instance at the ASF: I don't see that that will be necessary, and to be
honest I don't think it would even be desirable for us. I feel sure that
with an ASF-hosted Gerrit, we wouldn't be able to install any hooks or
plugins, or manage permissions, or anything like that in the way that we
find useful.

Ceej
aka Chris Hillery

Re: Podling request: Gerrit

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
IIRC, the problem with the Gerrit workflow is that the actual
pushes into the repo are done by the bot. This
runs against ASF's desire to keep push logs that actually
make sense.

The setup that you're describing (although ASCII art
came broken via Gmail) seems to be addressing that
very problem: having a Gerrit-specific repo that a human
being then synchronizes with the ASF canonical Git
repo. This seems like a pretty reasonable way to
accommodate both constraints.

Thanks,
Roman.



Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
To add onto Jochen's comments, even something lesser than a hosted
Gerrit instance might suffice. The core issue for integrating our
previous Git workflow is that, as I understand it, there's no way to
have "robot" committers to ASF git. We previously had Gerrit acting on
behalf of whoever submitted the commit, and whoever submitted the
commit in this case is necessarily an Apache committer. Gerrit is just
a safe intermediary for performing what otherwise is a more error
prone process of cherry-picking and pushing commits. The person who
submitted and pushed the review is still recorded, via the committer
and author fields of the commit, as well as the reviewed-by fields in
the comment of the commit.
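
As a rough sketch of how that record could be read back out of a commit
message, the snippet below collects "Key: value" trailer lines such as
Reviewed-by from the last paragraph of the message. The sample message,
names, and URL are made up for illustration, and the parsing is
deliberately simpler than what git itself does with trailers.

```python
from typing import Dict, List


def parse_trailers(message: str) -> Dict[str, List[str]]:
    """Collect "Key: value" lines from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    trailers: Dict[str, List[str]] = {}
    if len(paragraphs) < 2:
        return trailers  # a subject line alone carries no trailers
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        # Trailer keys are single tokens such as "Reviewed-by".
        if sep and key and " " not in key:
            trailers.setdefault(key, []).append(value.strip())
    return trailers


# Hypothetical commit message in the style Gerrit produces on submit.
example = """Fix null handling in the sort operator

Change-Id: I0123456789abcdef0123456789abcdef01234567
Reviewed-on: https://gerrit.example.org/123
Reviewed-by: Jane Reviewer <jane@example.org>
Tested-by: Jenkins <jenkins@example.org>
"""

if __name__ == "__main__":
    print(parse_trailers(example)["Reviewed-by"])
```

The author and committer fields are stored in the commit object itself, so
only the review information needs to be recovered from the message text.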

- Ian



Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Wed, Jul 15, 2015 at 2:31 AM, Jochen Wiedmann
<jo...@gmail.com> wrote:
> On Wed, Jul 15, 2015 at 5:56 AM, David Nalley <da...@gnsa.us> wrote:
>
>> We've explored Gerrit 2-3 times in the past 24 months. We have seen
>> several projects request it over the years. As I've mentioned
>> elsewhere in this thread, our most recent exploration was in December,
>> and there are a number of issues that would make an ASF-wide instance
>> of Gerrit impractically costly to deploy.
>
> Is there a record of that exploration, or anything like it, so that
> I could understand what those issues are?
>
> Thanks,
>
> Jochen
>


Here's a reference I found quickly:
https://mail-search.apache.org/members/private-arch/infrastructure/201412.mbox/%3cCAPbPdOa=pJyE6cvPSmVx2hyL863Yv2ng_GMfP-LsCEKrObi_cQ@mail.gmail.com%3e

The people you want to talk to are Jake Farrell and Andrew Bayer;
they've both looked into it during separate instances IIRC.

--David



Re: Podling request: Gerrit

Posted by Jochen Wiedmann <jo...@gmail.com>.
On Wed, Jul 15, 2015 at 5:56 AM, David Nalley <da...@gnsa.us> wrote:

> We've explored Gerrit 2-3 times in the past 24 months. We have seen
> several projects request it over the years. As I've mentioned
> elsewhere in this thread, our most recent exploration was in December,
> and there are a number of issues that would make an ASF-wide instance
> of Gerrit impractically costly to deploy.

Is there a record of that exploration, or anything like it, so that
I could understand what those issues are?

Thanks,

Jochen



Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
Chris:

We've discussed Crucible in the past. (There's a ticket or two for it in Jira).
We've talked with some folks who run it, who tell us it's pretty
sluggish for environments a fraction of the size of the ASF.

That said, the other issue is that we have 200 projects. There is
very little uniformity in what projects want. (Which is why we have a
plethora of CI options, several issue tracking options, etc.)
To date, only two projects have requested Crucible, which isn't enough to
justify the investment.

Gerrit has had more interest. I think the original count I had was 10
projects, though many of those abandoned pursuing it when it became
obvious that it would take multiple months to get all of the
supporting infrastructure in place and tested.

--David



Re: Podling request: Gerrit

Posted by Till Westmann <ti...@apache.org>.
+1

> On Jul 15, 2015, at 9:05 PM, Benson Margulies <bi...@gmail.com> wrote:
> 
> I've used crucible. It's horrible. And it comes from Atlassian, which
> means that infra@ is predisposed against it, as their general feeling
> is that the Atlassian products are very heavy.


Re: Podling request: Gerrit

Posted by Benson Margulies <bi...@gmail.com>.
I've used crucible. It's horrible. And it comes from Atlassian, which
means that infra@ is predisposed against it, as their general feeling
is that the Atlassian products are very heavy.



Re: Podling request: Gerrit

Posted by Chris Douglas <cd...@apache.org>.
David-

In another conversation, you mentioned Crucible [1] as another tool
for code reviews. Is that a viable option?

Has anyone on the list had any experience using it? -C

[1] https://www.atlassian.com/software/crucible/overview/

On Tue, Jul 14, 2015 at 8:56 PM, David Nalley <da...@gnsa.us> wrote:
> On Tue, Jul 14, 2015 at 8:08 PM, Till Westmann <ti...@apache.org> wrote:
>>
>> On 14 Jul 2015, at 15:31, David Nalley wrote:
>>
>>> On Tue, Jul 14, 2015 at 1:14 AM, Ian Maxon <im...@uci.edu> wrote:
>>>
>>>> We use Gerrit as
>>>> a tool to do code reviews and to organize the commits, as well as to
>>>> facilitate easy testing. However that's all it's used for- we still
>>>> clone from repositories that come downstream from ASF, not the other
>>>> way around. I'd be interested to understand how this would be
>>>> considered any different than what is done with Github Pull Requests.
>>>>
>>>
>>> So GH PR have a subtle distinction (at least in the way that they are
>>> handled at the ASF). Projects can't merge pull requests into the repo
>>> at github. Non-committers see a workflow that is the Github workflow,
>>> because that's very familiar, and lowers the barrier to contribution.
>>> Committers, however, have a very different workflow than the folks who
>>> typically review and close pull requests on github. They have to take
>>> the patch [1], and merge it into the canonical repository at the ASF,
>>> which then appears in the github repository because of the mirror
>>> process.  This stops the problem of diverging codebases that you are
>>> currently experiencing, calls to rewrite history to align the ASF repo
>>> with the external repo, etc.
>>
>>
>> As Ian indicated AsterixDB's process also requires manual interaction of
>> a committer. The current steps are now documented on the website [2].
>>
>
> So, that's marginally better than some previous examples of similar behavior.
> But I think there are still multiple problems, and I'll try and be
> more explicit about them:
>
> 1. People are not clearly contributing to Apache AsterixDB when
> submitting a patch via Gerrit at UCI.edu. Think about Section 5 of
> ASLv2.
> 2. The ASF has no record of any contributions that are happening on
> the Gerrit instance at UCI, until a committer decides to push code to
> the ASF repo. And from a provenance perspective, we have no records of
> submission of contributions at all.
> 3. Discussion and code review are happening at UCI, within their Gerrit
> instance; there is no record of those discussions at the ASF. (With
> reviews.a.o, Jira, GH Pull Requests, all of that information gets
> copied to one of the project's mailing lists for posterity.)
> 4. And this is the real issue for me. Gerrit is possessive of git
> repos it manages by nature; it needs and wants control. The very
> nature of Gerrit demands that it be the canonical repo. We can play
> word games and say that it isn't, or that the repo of record that
> releases are produced from is the ASF repo, but there are a number of
> realities that reflect that it isn't. First, when the mirroring goes
> wrong, the initial call is to rewrite history on the ASF repo [3].
> This suggests to me that the gerrit repo is the de facto repo for the
> project. Second, Gerrit is where everything is really happening:
> contributions, code review, testing (from a Jenkins instance at UCI).
>
>
>>> There are some other problems, that aren't necessarily as worrisome,
>>> but should be something to consider. First, you're relying on a third
>>> party to provide that resource. That's not inherently a problem, but
>>> we have a number of examples of projects using external tools and
>>> those being shut down or phased out which causes tremendous disruption
>>> to projects. It's also at the old project's home, which might cause
>>> some folks to question whether the project is truly independent, or
>>> not.
>>
>>
>> In my view Gerrit is "just" a tool that the AsterixDB community chose
>> to keep when starting the incubation process. It is non-essential and
>> has been used by developers from different organizations before the
>> incubation started. But I think that its use was and is very beneficial
>> to the project.
>>
>> When we started incubation it seemed to us, that keeping the existing
>> tool would be a good idea as it
>> a) allows for a smoother transition and
>> b) would not put additional requirements on the ASF infrastructure.
>>
>
> I personally like Gerrit. I think it's probably one of the more robust
> review tools in existence, and it's certainly the most extensible
> based on what I've seen. That said, its use in this case is not
> without problems.
>
>> However, I do agree that a shut down of the service (which seems very
>> unlikely at the current point in time) could be a disruption to the
>> project.
>
> We would have said the same thing about Codehaus not too many years ago.
>
>> So it might be better to run this tool on the ASF
>> infrastructure.
>> Should we pursue this?
>
> We've explored Gerrit 2-3 times in the past 24 months. We have seen
> several projects request it over the years. As I've mentioned
> elsewhere in this thread, our most recent exploration was in December,
> and there are a number of issues that would make an ASF-wide instance
> of Gerrit impractically costly to deploy. I also think that, due
> to the provenance requirements that come with version control as I
> understand them, as well as some of the other issues that would come
> into play, infrastructure would not permit a project-specific
> instance of Gerrit to be run on ASF infrastructure.
>
>> Or is it acceptable to keep the tool on external hardware for now?
>> Or do you see fundamental issues with AsterixDB's use of Gerrit?
>>
>
> I do not think it's acceptable to use the tool on external hardware. I
> don't see inherent issues with the tool itself, but I also don't think
> it's pragmatic to have it running internally. I know that's a bad
> position that seems to be inflexible for the project itself, but with
> around 200 active projects a bit of flexibility is assumed to be lost.
>
>
> --David
>
>
>>
>>> [1]
>>> https://patch-diff.githubusercontent.com/raw/apache/airavata/pull/18.patch
>>
>>
>> [2] https://asterixdb.incubator.apache.org/pushing.html
>
> [3] http://mail-archives.apache.org/mod_mbox/incubator-asterixdb-dev/201507.mbox/%3cCAN_YF5ztLpaKLnnRSdTeSqB+mJ8Sk6aJ58p_NG9Scx=kBQJ00Q@mail.gmail.com%3e
>


Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
> The legal issue at hand is that one must reasonably assume that the
> contributor offers his patch with an implicit license to the ASF for
> distribution under the terms of the ASL 2.
>
> For example, if you add a patch to a Jira issue, then there is a
> select box that allows you to express consent to that license grant. By
> selecting "yes", the user even provides an explicit license grant, and
> we are supposed to ignore the patch if that isn't provided.

I see. Does that explicit consent happen with pull requests somehow
too? In any case, I think we could devise a way to make users
aware of this whenever they register an account on the
server.


- Ian

On Wed, Jul 15, 2015 at 5:53 AM, Jochen Wiedmann
<jo...@gmail.com> wrote:
> On Wed, Jul 15, 2015 at 12:51 PM, Ian Maxon <im...@uci.edu> wrote:
>
>> I guess there's some legal issue I'm ignorant of here then. How would
>> one submit a patch, without cloning from the official mirror, and
>> hence becoming just as aware of ASF involvement as they would
>> otherwise?
>
> The legal issue at hand is that one must reasonably assume that the
> contributor offers his patch with an implicit license to the ASF for
> distribution under the terms of the ASL 2.
>
> For example, if you add a patch to a Jira issue, then there is a
> select box that allows you to express consent to that license grant. By
> selecting "yes", the user even provides an explicit license grant, and
> we are supposed to ignore the patch if that isn't provided.
>
>
>>> Well, for starters: There's a certain degree of integration between
>>> Github and the ASF infrastructure. For example, I am reading about
>>> pull requests on ASF mailing lists. Likewise, I follow the discussion
>>> on the same mailing lists.
>>>
>>> Or, in other words: By reading those mailing lists, I am fully informed.
>>
>> If that's an issue it's an easy one to resolve (as stated before).
>> Both Gerrit and Jenkins can output that type of mail without trouble.
>
> It is not up to me to declare that acceptable or sufficient. What
> do others think?
>
>
> Jochen
>
>
>
> --
> Any world that can produce the Taj Mahal, William Shakespeare,
> and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)
>


Re: Podling request: Gerrit

Posted by Jochen Wiedmann <jo...@gmail.com>.
On Wed, Jul 15, 2015 at 12:51 PM, Ian Maxon <im...@uci.edu> wrote:

> I guess there's some legal issue I'm ignorant of here then. How would
> one submit a patch, without cloning from the official mirror, and
> hence becoming just as aware of ASF involvement as they would
> otherwise?

The legal issue at hand is that one must reasonably assume that the
contributor offers his patch with an implicit license to the ASF for
distribution under the terms of the ASL 2.

For example, if you add a patch to a Jira issue, then there is a
select box that allows you to express consent to that license grant. By
selecting "yes", the user even provides an explicit license grant, and
we are supposed to ignore the patch if that isn't provided.


>> Well, for starters: There's a certain degree of integration between
>> Github and the ASF infrastructure. For example, I am reading about
>> pull requests on ASF mailing lists. Likewise, I follow the discussion
>> on the same mailing lists.
>>
>> Or, in other words: By reading those mailing lists, I am fully informed.
>
> If that's an issue it's an easy one to resolve (as stated before).
> Both Gerrit and Jenkins can output that type of mail without trouble.

It is not up to me to declare that acceptable or sufficient. What
do others think?


Jochen



-- 
Any world that can produce the Taj Mahal, William Shakespeare,
and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)


Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
> That is the question, indeed. And, please, keep in mind that the
> answer must satisfy not a humble developer with "no red tape" in mind,
> but a lawyer.

I guess there's some legal issue I'm ignorant of here then. How would
one submit a patch, without cloning from the official mirror, and
hence becoming just as aware of ASF involvement as they would
otherwise?

> Well, for starters: There's a certain degree of integration between
> Github and the ASF infrastructure. For example, I am reading about
> pull requests on ASF mailing lists. Likewise, I follow the discussion
> on the same mailing lists.
>
> Or, in other words: By reading those mailing lists, I am fully informed.

If that's an issue it's an easy one to resolve (as stated before).
Both Gerrit and Jenkins can output that type of mail without trouble.


- Ian

On Wed, Jul 15, 2015 at 3:28 AM, Jochen Wiedmann
<jo...@gmail.com> wrote:
> On Wed, Jul 15, 2015 at 12:13 PM, Ian Maxon <im...@uci.edu> wrote:
>
>> Then what are they submitting a patch for review to, exactly?
>
> That is the question, indeed. And, please, keep in mind that the
> answer must satisfy not a humble developer with "no red tape" in mind,
> but a lawyer.
>
>
>>> Second, Gerrit is where everything is really happening:
>>> contributions, code review, testing (from a Jenkins instance at UCI).
>>
>> What, per se, is unique about that? I could point at any number of
>> Apache projects where the activity is happening mostly in Github pull
>> requests, and the testing in Travis CI. These are all external
>> services that the community decided worked best for them. We have
>> external services that we like too, just different ones.
>
> Well, for starters: There's a certain degree of integration between
> Github and the ASF infrastructure. For example, I am reading about
> pull requests on ASF mailing lists. Likewise, I follow the discussion
> on the same mailing lists.
>
> Or, in other words: By reading those mailing lists, I am fully informed.
>
> Jochen
>
>
> --
> Any world that can produce the Taj Mahal, William Shakespeare,
> and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)
>


Re: Podling request: Gerrit

Posted by Jochen Wiedmann <jo...@gmail.com>.
On Wed, Jul 15, 2015 at 12:13 PM, Ian Maxon <im...@uci.edu> wrote:

> Then what are they submitting a patch for review to, exactly?

That is the question, indeed. And, please, keep in mind that the
answer must satisfy not a humble developer with "no red tape" in mind,
but a lawyer.


>> Second, Gerrit is where everything is really happening:
>> contributions, code review, testing (from a Jenkins instance at UCI).
>
> What, per se, is unique about that? I could point at any number of
> Apache projects where the activity is happening mostly in Github pull
> requests, and the testing in Travis CI. These are all external
> services that the community decided worked best for them. We have
> external services that we like too, just different ones.

Well, for starters: There's a certain degree of integration between
Github and the ASF infrastructure. For example, I am reading about
pull requests on ASF mailing lists. Likewise, I follow the discussion
on the same mailing lists.

Or, in other words: By reading those mailing lists, I am fully informed.

Jochen


-- 
Any world that can produce the Taj Mahal, William Shakespeare,
and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)


Re: Podling request: Gerrit

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Got it, thanks.

I’m CC’ing dev@asterixdb.incubator.apache.org.

Till, others: is David’s summarization below accurate?
Anything to respond here?

Thanks,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: David Nalley <da...@gnsa.us>
Reply-To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Date: Wednesday, July 15, 2015 at 3:25 PM
To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Subject: Re: Podling request: Gerrit

>On Wed, Jul 15, 2015 at 5:48 PM, Mattmann, Chris A (3980)
><ch...@jpl.nasa.gov> wrote:
>> Hi Folks,
>>
>> Can someone clarify in simple terms what the issue is here?
>>
>
>There are a few issues, Chris:
>
>1. Contributions are being submitted, discussed, and accepted
>externally. No record of the submission, discussion, or acceptance is
>currently maintained at the ASF.
>2. As in 1), contributions are being accepted externally and then
>synced to the ASF repo, essentially making it the mirror rather than
>the required canonical copy.
>
>There are other things, but these are the key issues.
>
>
>--David
>
>> I’m sorry I’m just catching up on this thread, but I want to make
>> sure the podling community for AsterixDB is being supported.
>>
>


Re: Podling request: Gerrit

Posted by Mike Carey <dt...@gmail.com>.
Sounds like a plan.  :-)  (Moved this reply over.)


On 7/15/15 5:37 PM, Mattmann, Chris A (3980) wrote:
> Hi Till,
>
> We should probably move this discussion on to the
> dev@asterixdb.incubator.apache.org list.
>
> In short, we shouldn’t have situations in which there are contributors
> whose contributions are “shepherded in” by Apache AsterixDB Incubating
> PPMC members whose contributions have an indirect middle man at
> UCI. All development on ASF projects must happen at the ASF.
>
> We went to great lengths to get the Github workflow integrated into
> our mailing lists for provenance and for foundational tracking
> perspective and ultimately so that we can tell people who use Apache
> software that it’s from a plan and provenance they can trust. Infra
> did a lot of work to make sure contributions have at least an email
> address that flows through to the mailing list.
>
> Here in the Gerrit situation, it could be similar to Github I suppose
> if we make sure all communication from that Gerrit instance is
> mirrored to the list (dev@, or some similar list, probably issues@,
> or something that folks can choose to subscribe to).
>
> Ideally we need ICLAs on file for anything bigger than smallish
> contributions that have clear mailing list provenance. So, one thing
> you guys are doing is potentially circumventing that review from
> an ASF perspective without this mailing list mirroring at the very
> least.
>
> If ASF infra is willing to throw up a Gerrit instance, that’s the
> most ideal situation. If they are not, there is precedent for what
> you guys are doing (e.g., with Github, and also with build farm
> machines such as those contributed by Y! initially when
> Hadoop started). But this is our core product, code, and it’s
> not something to be taken lightly, especially in light of things
> “not happening at the ASF” and for provenance purposes. It’s nice
> that this is going on at UCI, and we appreciate their use of
> resources; however, ASF projects develop and “occur” at the ASF.
> Period.
>
> Here are the immediate actions:
>
> 1. Mirror all UCI Gerrit communication to the Apache AsterixDB
>    mailing list (discussed in dev@asterixdb.i.a.o and agreed upon
>    by the PPMC within 24-48 hours).
> 2. Work with ASF infra (David Nalley is the VP of infra, so
>    you have his attention here) to see if they are willing to run a
>    Gerrit instance. It’s my understanding, David, that the AsterixDB
>    folks have a few lingering issues here where they have not heard
>    back, so if you could reply on those I’d appreciate it.
> 3. Contributions from people who are not AsterixDB PPMC members need
>    to be recognized as such, as we should be looking to have a
>    discussion about who should be added to the PPMC based on the
>    work that’s been going on.
>
> OK, does that sound like a plan? This discussion should move to
> dev@asterixdb.i.a.o if there is nothing more here. This is a community
> and teaching issue that doesn’t need to be on general@.
>
> Cheers,
> Chris
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
> -----Original Message-----
> From: Till Westmann <ti...@apache.org>
> Reply-To: "general@incubator.apache.org" <ge...@incubator.apache.org>
> Date: Wednesday, July 15, 2015 at 4:18 PM
> To: "general@incubator.apache.org" <ge...@incubator.apache.org>
> Subject: Re: Podling request: Gerrit
>
>>> On Jul 16, 2015, at 12:25 AM, David Nalley <da...@gnsa.us> wrote:
>>>
>>> On Wed, Jul 15, 2015 at 5:48 PM, Mattmann, Chris A (3980)
>>> <ch...@jpl.nasa.gov> wrote:
>>>> Hi Folks,
>>>>
>>>> Can someone clarify in simple terms what the issue is here?
>>>>
>>> There are a few issues, Chris:
>> Let me try to describe this in terms of how most members of the AsterixDB
>> community probably see it.
>>
>>> 1. Contributions are being submitted, discussed, and accepted
>>> externally. No record of the submission, discussion, or acceptance is
>>> currently maintained at the ASF.
>> AsterixDB uses a Gerrit instance hosted at UCI as a code review tool
>> before submitting to the master branch.
>> Discussions on modifications are indeed happening in Gerrit and are
>> currently not forwarded to the ASF mailing lists, but forwarding those
>> discussions should be possible.
>> After discussion, review, and acceptance in the review tool, an AsterixDB
>> committer manually commits the reviewed modification to the master branch
>> in the ASF repository.
>> If the original author of the modification of the code was an AsterixDB
>> committer, the commit should be done by the original author.
>> If the original author was another contributor, the commit should be done
>> by the AsterixDB committer who reviewed and validated the modification.
>>
>>> 2. As in 1), contributions are being accepted externally and then
>>> synced to the ASF repo, essentially making it the mirror rather than
>>> the required canonical copy.
>> Contributions are accepted by an AsterixDB committer on a tool that is
>> not hosted by the ASF.
>> It is not clear why that makes the acceptance external to the project or
>> the ASF.
>> After acceptance, an AsterixDB committer commits the modifications to the
>> ASF repo.
>> The master branch of the ASF repository is considered to be the source of
>> truth and the basis for releases.
>> It is not obvious why the fact that the modifications were reviewed in
>> Gerrit before being committed to the ASF repo would make the ASF repo a
>> “non-canonical” copy.
>>
>> Till


Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Wed, Jul 15, 2015 at 8:37 PM, Mattmann, Chris A (3980)
<ch...@jpl.nasa.gov> wrote:
> Hi Till,
>
> We should probably move this discussion on to the
> dev@asterixdb.incubator.apache.org list.
>
> In short, we shouldn’t have situations in which there are contributors
> whose contributions are “shepherded in” by Apache AsterixDB Incubating
> PPMC members whose contributions have an indirect middle man at
> UCI. All development on ASF projects must happen at the ASF.
>
> We went to great lengths to get the Github workflow integrated into
> our mailing lists for provenance and for foundational tracking
> perspective and ultimately so that we can tell people who use Apache
> software that it’s from a plan and provenance they can trust. Infra
> did a lot of work to make sure contributions have at least an email
> address that flows through to the mailing list.
>
> Here in the Gerrit situation, it could be similar to Github I suppose
> if we make sure all communication from that Gerrit instance is
> mirrored to the list (dev@, or some similar list, probably issues@,
> or something that folks can choose to subscribe to).
>
> Ideally we need ICLAs on file for anything bigger than smallish
> contributions that have clear mailing list provenance. So, one thing
> you guys are doing is potentially circumventing that review from
> an ASF perspective without this mailing list mirroring at the very
> least.
>
> If ASF infra is willing to throw up a Gerrit instance, that’s the
> most ideal situation. If they are not, there is precedent for what
> you guys are doing (e.g., with Github, and also with build farm
> machines such as those contributed by Y! initially when
> Hadoop started). But this is our core product, code, and it’s
> not something to be taken lightly, especially in light of things
> “not happening at the ASF” and for provenance purposes. It’s nice
> that this is going on at UCI, and we appreciate their use of
> resources; however, ASF projects develop and “occur” at the ASF.
> Period.
>
> Here are the immediate actions:
>
> 1. Mirror all UCI Gerrit communication to the Apache AsterixDB
>    mailing list (discussed in dev@asterixdb.i.a.o and agreed upon
>    by the PPMC within 24-48 hours).
> 2. Work with ASF infra (David Nalley is the VP of infra, so
>    you have his attention here) to see if they are willing to run a
>    Gerrit instance. It’s my understanding, David, that the AsterixDB
>    folks have a few lingering issues here where they have not heard
>    back, so if you could reply on those I’d appreciate it.

I just replied to a flurry of emails, but I have no doubt I've missed
things. Please don't hesitate to call out if I still haven't answered
a question.
dev@asterixdb will work well.

--David


Re: Podling request: Gerrit

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hi Till,

We should probably move this discussion on to the
dev@asterixdb.incubator.apache.org list.

In short, we shouldn’t have situations in which there are contributors
whose contributions are “shepherded in” by Apache AsterixDB Incubating
PPMC members whose contributions have an indirect middle man at
UCI. All development on ASF projects must happen at the ASF.

We went to great lengths to get the Github workflow integrated into
our mailing lists for provenance and for foundational tracking
perspective and ultimately so that we can tell people who use Apache
software that it’s from a plan and provenance they can trust. Infra
did a lot of work to make sure contributions have at least an email
address that flows through to the mailing list.

Here in the Gerrit situation, it could be similar to Github I suppose
if we make sure all communication from that Gerrit instance is
mirrored to the list (dev@, or some similar list, probably issues@,
or something that folks can choose to subscribe to).

Ideally we need ICLAs on file for anything bigger than smallish
contributions that have clear mailing list provenance. So, one thing
you guys are doing is potentially circumventing that review from
an ASF perspective without this mailing list mirroring at the very
least.

If ASF infra is willing to throw up a Gerrit instance, that’s the
most ideal situation. If they are not, there is precedent for what
you guys are doing (e.g., with Github, and also with build farm
machines such as those contributed by Y! initially when
Hadoop started). But this is our core product, code, and it’s
not something to be taken lightly, especially in light of things
“not happening at the ASF” and for provenance purposes. It’s nice
that this is going on at UCI, and we appreciate their use of
resources; however, ASF projects develop and “occur” at the ASF.
Period.

Here are the immediate actions:

1. Mirror all UCI Gerrit communication to the Apache AsterixDB
   mailing list (discussed in dev@asterixdb.i.a.o and agreed upon
   by the PPMC within 24-48 hours).
2. Work with ASF infra (David Nalley is the VP of infra, so
   you have his attention here) to see if they are willing to run a
   Gerrit instance. It’s my understanding, David, that the AsterixDB
   folks have a few lingering issues here where they have not heard
   back, so if you could reply on those I’d appreciate it.
3. Contributions from people who are not AsterixDB PPMC members need
   to be recognized as such, as we should be looking to have a
   discussion about who should be added to the PPMC based on the
   work that’s been going on.

OK, does that sound like a plan? This discussion should move to
dev@asterixdb.i.a.o if there is nothing more here. This is a community
and teaching issue that doesn’t need to be on general@.

Cheers, 
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: Till Westmann <ti...@apache.org>
Reply-To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Date: Wednesday, July 15, 2015 at 4:18 PM
To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Subject: Re: Podling request: Gerrit

>
>> On Jul 16, 2015, at 12:25 AM, David Nalley <da...@gnsa.us> wrote:
>> 
>> On Wed, Jul 15, 2015 at 5:48 PM, Mattmann, Chris A (3980)
>> <ch...@jpl.nasa.gov> wrote:
>>> Hi Folks,
>>> 
>>> Can someone clarify in simple terms what the issue is here?
>>> 
>> 
>> There are a few issues, Chris:
>
>Let me try to describe this in terms of how most members of the AsterixDB
>community probably see it.
>
>> 1. Contributions are being submitted, discussed, and accepted
>> externally. No record of the submission, discussion, or acceptance is
>> currently maintained at the ASF.
>
>AsterixDB uses a Gerrit instance hosted at UCI as a code review tool
>before submitting to the master branch.
>Discussions on modifications are indeed happening in Gerrit and are
>currently not forwarded to the ASF mailing lists, but forwarding those
>discussions should be possible.
>After discussion, review, and acceptance in the review tool, an AsterixDB
>committer manually commits the reviewed modification to the master branch
>in the ASF repository.
>If the original author of the modification of the code was an AsterixDB
>committer, the commit should be done by the original author.
>If the original author was another contributor, the commit should be done
>by the AsterixDB committer who reviewed and validated the modification.
>
>> 2. As in 1), contributions are being accepted externally and then
>> synced to the ASF repo, essentially making it the mirror rather than
>> the required canonical copy.
>
>Contributions are accepted by an AsterixDB committer on a tool that is
>not hosted by the ASF.
>It is not clear why that makes the acceptance external to the project or
>the ASF.
>After acceptance, an AsterixDB committer commits the modifications to the
>ASF repo.
>The master branch of the ASF repository is considered to be the source of
>truth and the basis for releases.
>It is not obvious why the fact that the modifications were reviewed in
>Gerrit before being committed to the ASF repo would make the ASF repo a
>“non-canonical” copy.
>
>Till
>---------------------------------------------------------------------
>To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>For additional commands, e-mail: general-help@incubator.apache.org
>


Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
+1. This is basically how I see it as well.

On Wed, Jul 15, 2015 at 4:18 PM, Till Westmann <ti...@apache.org> wrote:
>
>> On Jul 16, 2015, at 12:25 AM, David Nalley <da...@gnsa.us> wrote:
>>
>> On Wed, Jul 15, 2015 at 5:48 PM, Mattmann, Chris A (3980)
>> <ch...@jpl.nasa.gov> wrote:
>>> Hi Folks,
>>>
>>> Can someone clarify in simple terms what the issue is here?
>>>
>>
>> There's a few issues Chris:
>
> Let me try to describe this in terms how most members of the AsterixDB community probably see it.
>
>> 1. Contributions are being submitted, discussed, and accepted
>> externally. No record of the submission, discussion, or acceptance is
>> currently maintained at the ASF.
>
> AsterixDB uses a Gerrit instance hosted at UCI as a code review tool before submitting to the master branch.
> Discussions on modifications are indeed happening in Gerrit and are currently not forwarded to the ASF mailing lists, but forwarding those discussions should be possible.
> After discussion, review, and acceptance in the review tool, an AsterixDB committer manually commits the reviewed modification to the master branch in the ASF repository.
> If the original author of the modification of the code was an AsterixDB committer, the commit should be done by the original author.
> If the original author was another contributor, the commit should be done by the AsterixDB committer who reviewed and validated the modification.
>
>> 2. As in 1) contributions are being accepted externally, and then
>> synced, to the ASF repo, essentially making it the mirror, rather than
>> the required canonical copy.
>
> Contributions are accepted by an AsterixDB committer on a tool that is not hosted by the ASF.
> It is not clear why that makes the acceptance external to the project or the ASF.
> After acceptance, an AsterixDB committer commits the modifications to the ASF repo.
> The master branch of the ASF repository is considered to be the source of truth and the basis for releases.
> It is not obvious, why the fact that the modifications were reviewed in Gerrit before being committed to the ASF repo would make the ASF repo a “non-canonical” copy.
>
> Till
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>



Re: Podling request: Gerrit

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hi Till,

We should probably move this discussion on to the
dev@asterixdb.incubator.apache.org list.

In short, we shouldn’t have situations in which contributions are
“shepherded in” by Apache AsterixDB Incubating PPMC members on behalf
of contributors, with an indirect middleman at UCI. All development
on ASF projects must happen at the ASF.

We went to great lengths to get the Github workflow integrated into
our mailing lists for provenance and for foundation-level tracking,
and ultimately so that we can tell people who use Apache
software that it’s from a plan and provenance they can trust. Infra
did a lot of work to make sure contributions have at least an email
address that flows through to the mailing list.

Here in the Gerrit situation, it could be similar to Github I suppose
if we make sure all communication from that Gerrit instance is
mirrored to the list (dev@, or some similar list, probably issues@,
or something that folks can choose to subscribe to).

Ideally we need ICLAs on file for anything bigger than smallish
contributions that have clear mailing list provenance. So, without
this mailing list mirroring at the very least, one thing you guys are
potentially doing is circumventing that review from an ASF
perspective.

If ASF infra is willing to stand up a Gerrit instance, that’s the
ideal situation. If they are not, there is precedent for what
you guys are doing (e.g., with Github, and also with build farm
machines such as those contributed by Y! initially when
Hadoop started). But this is our core product, code, and it’s
not something to be taken lightly, especially in light of things
“not happening at the ASF” and for provenance purposes. It’s nice
that this is going on at UCI, and we appreciate their use of
resources; however, ASF projects develop and “occur” at the ASF.
Period.

Here are the immediate actions:

1. Mirror all UCI Gerrit activity to the Apache AsterixDB mailing list
(discussed on dev@asterixdb.i.a.o and agreed upon by the PPMC within
24-48 hours).
2. Work with ASF infra (David Nalley is the VP of infra, so you have
his attention here) to see if they are willing to run a Gerrit
instance. It’s my understanding, David, that the AsterixDB folks have
a few lingering issues here where they have not heard back, so if you
could reply on those I’d appreciate it.
3. Contributions from non-PPMC members of AsterixDB need to be
recognized as such, as we should be looking to have a discussion about
who should be added to the PPMC based on the work that’s been going on.

OK, does that sound like a plan? This discussion should move to
dev@asterixdb.i.a.o if there is nothing more here. This is a community
and teaching issue that doesn’t need to be on general@.

Cheers, 
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: Till Westmann <ti...@apache.org>
Reply-To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Date: Wednesday, July 15, 2015 at 4:18 PM
To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Subject: Re: Podling request: Gerrit

>
>> On Jul 16, 2015, at 12:25 AM, David Nalley <da...@gnsa.us> wrote:
>> 
>> On Wed, Jul 15, 2015 at 5:48 PM, Mattmann, Chris A (3980)
>> <ch...@jpl.nasa.gov> wrote:
>>> Hi Folks,
>>> 
>>> Can someone clarify in simple terms what the issue is here?
>>> 
>> 
>> There's a few issues Chris:
>
>Let me try to describe this in terms how most members of the AsterixDB
>community probably see it.
>
>> 1. Contributions are being submitted, discussed, and accepted
>> externally. No record of the submission, discussion, or acceptance is
>> currently maintained at the ASF.
>
>AsterixDB uses a Gerrit instance hosted at UCI as a code review tool
>before submitting to the master branch.
>Discussions on modifications are indeed happening in Gerrit and are
>currently not forwarded to the ASF mailing lists, but forwarding those
>discussions should be possible.
>After discussion, review, and acceptance in the review tool, an AsterixDB
>committer manually commits the reviewed modification to the master branch
>in the ASF repository.
>If the original author of the modification of the code was an AsterixDB
>committer, the commit should be done by the original author.
>If the original author was another contributor, the commit should be done
>by the AsterixDB committer who reviewed and validated the modification.
>
>> 2. As in 1) contributions are being accepted externally, and then
>> synced, to the ASF repo, essentially making it the mirror, rather than
>> the required canonical copy.
>
>Contributions are accepted by an AsterixDB committer on a tool that is
>not hosted by the ASF.
>It is not clear why that makes the acceptance external to the project or
>the ASF.
>After acceptance, an AsterixDB committer commits the modifications to the
>ASF repo.
>The master branch of the ASF repository is considered to be the source of
>truth and the basis for releases.
>It is not obvious, why the fact that the modifications were reviewed in
>Gerrit before being committed to the ASF repo would make the ASF repo a
>“non-canonical” copy.
>
>Till
>---------------------------------------------------------------------
>To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>For additional commands, e-mail: general-help@incubator.apache.org
>



Re: Podling request: Gerrit

Posted by Till Westmann <ti...@apache.org>.
> On Jul 16, 2015, at 12:25 AM, David Nalley <da...@gnsa.us> wrote:
> 
> On Wed, Jul 15, 2015 at 5:48 PM, Mattmann, Chris A (3980)
> <ch...@jpl.nasa.gov> wrote:
>> Hi Folks,
>> 
>> Can someone clarify in simple terms what the issue is here?
>> 
> 
> There's a few issues Chris:

Let me try to describe this in terms of how most members of the AsterixDB community probably see it.

> 1. Contributions are being submitted, discussed, and accepted
> externally. No record of the submission, discussion, or acceptance is
> currently maintained at the ASF.

AsterixDB uses a Gerrit instance hosted at UCI as a code review tool before submitting to the master branch. 
Discussions on modifications are indeed happening in Gerrit and are currently not forwarded to the ASF mailing lists, but forwarding those discussions should be possible.
After discussion, review, and acceptance in the review tool, an AsterixDB committer manually commits the reviewed modification to the master branch in the ASF repository. 
If the original author of the modification of the code was an AsterixDB committer, the commit should be done by the original author. 
If the original author was another contributor, the commit should be done by the AsterixDB committer who reviewed and validated the modification.

> 2. As in 1) contributions are being accepted externally, and then
> synced, to the ASF repo, essentially making it the mirror, rather than
> the required canonical copy.

Contributions are accepted by an AsterixDB committer on a tool that is not hosted by the ASF.
It is not clear why that makes the acceptance external to the project or the ASF.
After acceptance, an AsterixDB committer commits the modifications to the ASF repo.
The master branch of the ASF repository is considered to be the source of truth and the basis for releases.
It is not obvious why the fact that the modifications were reviewed in Gerrit before being committed to the ASF repo would make the ASF repo a “non-canonical” copy.

Till


Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Wed, Jul 15, 2015 at 5:48 PM, Mattmann, Chris A (3980)
<ch...@jpl.nasa.gov> wrote:
> Hi Folks,
>
> Can someone clarify in simple terms what the issue is here?
>

There are a few issues, Chris:

1. Contributions are being submitted, discussed, and accepted
externally. No record of the submission, discussion, or acceptance is
currently maintained at the ASF.
2. As in 1), contributions are being accepted externally and then
synced to the ASF repo, essentially making it the mirror rather than
the required canonical copy.

There are other things, but these are the key issues.


--David

> I’m sorry I’m just catching up on this thread, but I want to make
> sure the podling community for AsterixDB is being supported.
>



Re: Podling request: Gerrit

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hi Folks,

Can someone clarify in simple terms what the issue is here?

I’m sorry I’m just catching up on this thread, but I want to make
sure the podling community for AsterixDB is being supported.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: Ian Maxon <im...@uci.edu>
Reply-To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Date: Wednesday, July 15, 2015 at 1:22 PM
To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Subject: Re: Podling request: Gerrit

>> In Git (and I'd presume any Git-like DVCS) anything but the push logs
>> can be spoofed. Having a record of who actually pushed to the repo
>> is one of the requirements from ASF's standpoint to track chain of
>> custody for the code that lands in our projects.
>
>Understood. That's the very reason why we modified our process to its
>present state when we began incubation. As stated before in this
>thread, the push logs aren't played with- it is always a committer
>that actually pushes a contribution to the ASF, with their account,
>and not a robot or proxy, in our current workflow. The push logs still
>record a valid chain of custody.
>
>The analogous situation in the case David was describing, if I am
>understanding it correctly, is that ASF doesn't know of an
>uncommitted/unverified contribution that may lie in Gerrit's review
>queue, possibly pending commit. Unless there's something I am missing,
>I don't understand how that's any more or less recorded or visible
>than a contribution that lies in a personal fork in Github, before it
>has a pull request submitted and merged.
>
>-Ian
>
>On Wed, Jul 15, 2015 at 1:02 PM, Roman Shaposhnik <ro...@shaposhnik.org>
>wrote:
>> On Wed, Jul 15, 2015 at 3:13 AM, Ian Maxon <im...@uci.edu> wrote:
>>>> 2. The ASF has no record of any contributions that are happening on
>>>> the Gerrit instance at UCI, until a committer decides to push code to
>>>> the ASF repo.
>>>
>>> I'm afraid I don't understand this point. How is this different than
>>> any other distributed version control system? In github, nobody is
>>> aware of a contribution in a fork until a pull request is made. How's
>>> that any different than what's going on here?
>>
>> In Git (and I'd presume any Git-like DVCS) anything but the push logs
>> can be spoofed. Having a record of who actually pushed to the repo
>> is one of the requirements from ASF's standpoint to track chain of
>> custody for the code that lands in our projects.
>>
>> Do realize that this unique requirement comes from the fact that
>> we're a foundation, not just a code hosting site.
>>
>> Thanks,
>> Roman.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> For additional commands, e-mail: general-help@incubator.apache.org
>>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>For additional commands, e-mail: general-help@incubator.apache.org
>



Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
> In Git (and I'd presume any Git-like DVCS) anything but the push logs
> can be spoofed. Having a record of who actually pushed to the repo
> is one of the requirements from ASF's standpoint to track chain of custody
> for the code that lands in our projects.

Understood. That's the very reason why we modified our process to its
present state when we began incubation. As stated before in this
thread, the push logs aren't played with; it is always a committer
that actually pushes a contribution to the ASF, with their account,
and not a robot or proxy, in our current workflow. The push logs still
record a valid chain of custody.

The analogous situation in the case David was describing, if I am
understanding it correctly, is that ASF doesn't know of an
uncommitted/unverified contribution that may lie in Gerrit's review
queue, possibly pending commit. Unless there's something I am missing,
I don't understand how that's any more or less recorded or visible
than a contribution that lies in a personal fork in Github, before it
has a pull request submitted and merged.

-Ian

On Wed, Jul 15, 2015 at 1:02 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> On Wed, Jul 15, 2015 at 3:13 AM, Ian Maxon <im...@uci.edu> wrote:
>>> 2. The ASF has no record of any contributions that are happening on
>>> the Gerrit instance at UCI, until a committer decides to push code to
>>> the ASF repo.
>>
>> I'm afraid I don't understand this point. How is this different than
>> any other distributed version control system? In github, nobody is
>> aware of a contribution in a fork until a pull request is made. How's
>> that any different than what's going on here?
>
> In Git (and I'd presume any Git-like DVCS) anything but the push logs
> can be spoofed. Having a record of who actually pushed to the repo
> is one of the requirements from ASF's standpoint to track chain of custody
> for the code that lands in our projects.
>
> Do realize that this unique requirement comes from the fact that
> we're a foundation, not just a code hosting site.
>
> Thanks,
> Roman.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>



Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
To meet this, do we simply need the change proposals in Gerrit (i.e.
pull requests) to have their patch contents mirrored to a proper ASF
mailing list?

- Ian

On Wed, Jul 15, 2015 at 4:34 PM, Chris Douglas <cd...@apache.org> wrote:
> On Wed, Jul 15, 2015 at 4:21 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>> As long as there's a human being in the loop reviewing what's going into the
>> repo I don't think I've got any issues with the process.
>
> The ASF needs to establish provenance. It can't do that if a committer
> pushes code that was posted to infrastructure owned by UCI. We need a
> record of the patch from the contributor, not just the committer. -C
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>



Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
> I understand that git is a DVCS, but by mirroring the commit content
> from one repo to the ASF (albeit via a committer middleman), we
> largely make the push records[1] pointless.

The push logs are intended to identify the committer who either
authored the contribution or is taking responsibility for it if it
was not authored by a committer, right? If that's the case, then the
push logs do show what's been going on lately in terms of provenance,
since we adopted the new procedure of having committers individually
submit changes which they own. For example, this line from the push log:

[Fri May 22 23:41:08 2015] refs/heads/master 7de6f4eb80 -> b9611bb9ab mhubail@http.98.164.231.216

Murtadha did indeed author this commit.
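
As a rough illustration of what such an entry records, here is a small Python sketch that splits the push log entry quoted above into its fields. The field names (when, ref, old_rev, new_rev, user, origin) and the regular expression are my own assumptions, inferred from this single entry; they are not a documented ASF log schema, and the real logs may contain variants this does not handle.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern inferred from the one example entry in this thread (assumption):
# [timestamp] ref old_rev -> new_rev user@origin
PUSH_LINE = re.compile(
    r"^\[(?P<when>[^\]]+)\]\s+"
    r"(?P<ref>\S+)\s+"
    r"(?P<old>[0-9a-f]+)\s+->\s+(?P<new>[0-9a-f]+)\s+"
    r"(?P<user>[^@\s]+)@(?P<origin>\S+)$"
)


class PushRecord(NamedTuple):
    when: datetime   # time the server recorded the push
    ref: str         # branch or tag that was updated
    old_rev: str     # abbreviated revision before the push
    new_rev: str     # abbreviated revision after the push
    user: str        # account that authenticated for the push
    origin: str      # transport/address part after the "@"


def parse_push_line(line: str) -> Optional[PushRecord]:
    """Parse one push log entry; return None if it does not match."""
    # Collapse whitespace so an entry wrapped across lines by a mail
    # client is treated as a single line.
    normalized = " ".join(line.split())
    m = PUSH_LINE.match(normalized)
    if m is None:
        return None
    # English day/month names are assumed (default C locale).
    when = datetime.strptime(m["when"], "%a %b %d %H:%M:%S %Y")
    return PushRecord(when, m["ref"], m["old"], m["new"], m["user"], m["origin"])


if __name__ == "__main__":
    entry = (
        "[Fri May 22 23:41:08 2015] refs/heads/master 7de6f4eb80 -> b9611bb9ab\n"
        "mhubail@http.98.164.231.216"
    )
    print(parse_push_line(entry))
```

The point of the sketch is only that the entry ties a ref update to an authenticated account and a time, independently of whatever author name is written inside the commits themselves.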

-Ian

On Wed, Jul 15, 2015 at 6:07 PM, David Nalley <da...@gnsa.us> wrote:
> On Wed, Jul 15, 2015 at 7:45 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>> On Wed, Jul 15, 2015 at 4:34 PM, Chris Douglas <cd...@apache.org> wrote:
>>> On Wed, Jul 15, 2015 at 4:21 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>>>> As long as there's a human being in the loop reviewing what's going into the
>>>> repo I don't think I've got any issues with the process.
>>>
>>> The ASF needs to establish provenance. It can't do that if a committer
>>> pushes code that was posted to infrastructure owned by UCI. We need a
>>> record of the patch from the contributor, not just the committer. -C
>>
>> My assessment of the situation is based on the workflow where whoever
>> is pushing the commits into the repo is reviewing both committers and
>> authors of every commit that goes in.
>>
>> If that's the case I see no material difference between the above and
>> a committers pushing a patch contributed on JIRA by Jon Doe OR
>> a Github Pull Request. Both acceptable practices within existing
>> ASF projects.
>>
>
>
> I'll respectfully disagree, and I'll illustrate it with a real world example.
>
> My employer, Citrix, wanted to operate a Gerrit instance for
> CloudStack. This was rejected out of hand (and the main reason was
> project independence - sending people to gerrit.citrix.com to submit
> their patches hardly makes the project seem independent. But ignoring
> that issue. This effectively means people would be doing all of the
> development in a Citrix hosted git repo. A git repo that the
> foundation does not control authz to, nor has access to push logs for.
> We then lose all of the data in the push logs when commits are synced
> from the Citrix repo to the ASF repo. And to boot, we generate what
> are essentially dummy records in the process of shipping data over.
>
> I understand that git is a DVCS, but by mirroring the commit content
> from one repo to the ASF (albeit via a committer middleman), we
> largely make the push records[1] pointless. That turns our copy of the
> repository into the equivalent of a github fork. It has all of the
> commit history, but none of the real provenance information. I
> understand that as a DVCS, the idea of a canonical repository is
> strange and foreign, and git usage patterns typically involve
> situations where they ignore push records themselves, but I am at a
> loss as to how we preserve useful records when we are just syncing
> stuff from a Gerrit-managed repo.
>
> --David
>
> [1] Just to be perfectly clear, because frankly, I didn't understand
> it when people talked to me about it initially - the push logs have
> nothing to do with commit log. Push logs are tied to the authorization
> tool, and are not stored in the repository.  Here's the push logs for
> AsterixDB.
> https://git-wip-us.apache.org/logs/asf/incubator-asterixdb.git
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>



Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Wed, Jul 15, 2015 at 7:45 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> On Wed, Jul 15, 2015 at 4:34 PM, Chris Douglas <cd...@apache.org> wrote:
>> On Wed, Jul 15, 2015 at 4:21 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>>> As long as there's a human being in the loop reviewing what's going into the
>>> repo I don't think I've got any issues with the process.
>>
>> The ASF needs to establish provenance. It can't do that if a committer
>> pushes code that was posted to infrastructure owned by UCI. We need a
>> record of the patch from the contributor, not just the committer. -C
>
> My assessment of the situation is based on the workflow where whoever
> is pushing the commits into the repo is reviewing both committers and
> authors of every commit that goes in.
>
> If that's the case I see no material difference between the above and
> a committers pushing a patch contributed on JIRA by Jon Doe OR
> a Github Pull Request. Both acceptable practices within existing
> ASF projects.
>


I'll respectfully disagree, and I'll illustrate it with a real world example.

My employer, Citrix, wanted to operate a Gerrit instance for
CloudStack. This was rejected out of hand (the main reason was
project independence: sending people to gerrit.citrix.com to submit
their patches hardly makes the project seem independent). But ignoring
that issue, this effectively means people would be doing all of the
development in a Citrix-hosted git repo, a git repo that the
foundation does not control authz to, nor has access to push logs for.
We then lose all of the data in the push logs when commits are synced
from the Citrix repo to the ASF repo. And to boot, we generate what
are essentially dummy records in the process of shipping data over.

I understand that git is a DVCS, but by mirroring the commit content
from one repo to the ASF (albeit via a committer middleman), we
largely make the push records[1] pointless. That turns our copy of the
repository into the equivalent of a github fork. It has all of the
commit history, but none of the real provenance information. I
understand that as a DVCS, the idea of a canonical repository is
strange and foreign, and git usage patterns typically involve
situations where they ignore push records themselves, but I am at a
loss as to how we preserve useful records when we are just syncing
stuff from a Gerrit-managed repo.

--David

[1] Just to be perfectly clear, because frankly, I didn't understand
it when people talked to me about it initially: the push logs have
nothing to do with the commit log. Push logs are tied to the
authorization tool, and are not stored in the repository. Here are
the push logs for AsterixDB:
https://git-wip-us.apache.org/logs/asf/incubator-asterixdb.git
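
To make the distinction in [1] a little more concrete, here is a minimal Python sketch of the two kinds of record. Everything in it is hypothetical and for illustration only: the Commit and PushRecord classes, the vouching_report function, and the user names are made up, and it does not describe how the ASF tooling actually stores or checks anything. It only models the idea that the author and committer fields travel inside the commit and are self-declared, while the pusher identity is recorded by the server at push time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Commit:
    sha: str
    author: str      # self-declared inside the commit object; can be set to anything
    committer: str   # likewise self-declared


@dataclass(frozen=True)
class PushRecord:
    user: str                    # identity authenticated by the server at push time
    commits: Tuple[Commit, ...]  # commits that arrived in this push


def vouching_report(push: PushRecord) -> List[Tuple[str, str]]:
    """Pair each pushed commit with a note on who vouches for it."""
    report = []
    for commit in push.commits:
        if commit.author == push.user:
            note = f"authored and pushed by {push.user}"
        else:
            note = f"claims author {commit.author}; vouched for by pusher {push.user}"
        report.append((commit.sha, note))
    return report


if __name__ == "__main__":
    # Hypothetical data: "alice" pushes one commit of her own and one by "bob".
    push = PushRecord(
        user="alice",
        commits=(
            Commit("1111111", author="alice", committer="alice"),
            Commit("2222222", author="bob", committer="alice"),
        ),
    )
    for sha, note in vouching_report(push):
        print(sha, note)
```

In this model, a sync from an external repository collapses many contributors' work into pushes by whoever runs the sync, which is the loss of information described above.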



Re: Podling request: Gerrit

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Wed, Jul 15, 2015 at 4:34 PM, Chris Douglas <cd...@apache.org> wrote:
> On Wed, Jul 15, 2015 at 4:21 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>> As long as there's a human being in the loop reviewing what's going into the
>> repo I don't think I've got any issues with the process.
>
> The ASF needs to establish provenance. It can't do that if a committer
> pushes code that was posted to infrastructure owned by UCI. We need a
> record of the patch from the contributor, not just the committer. -C

My assessment of the situation is based on the workflow where whoever
is pushing the commits into the repo is reviewing both the committers
and the authors of every commit that goes in.

If that's the case, I see no material difference between the above and
a committer pushing a patch contributed on JIRA by Jon Doe or
via a Github pull request. Both are acceptable practices within
existing ASF projects.

Thanks,
Roman.



Re: Podling request: Gerrit

Posted by Chris Douglas <cd...@apache.org>.
On Wed, Jul 15, 2015 at 4:21 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> As long as there's a human being in the loop reviewing what's going into the
> repo I don't think I've got any issues with the process.

The ASF needs to establish provenance. It can't do that if a committer
pushes code that was posted to infrastructure owned by UCI. We need a
record of the patch from the contributor, not just the committer. -C



Re: Podling request: Gerrit

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Wed, Jul 15, 2015 at 4:42 PM, Till Westmann <ti...@apache.org> wrote:
> 1) I think that we didn’t ask for an ASF hosted instance. But I also think that
> David’s concern that the absence of the service would disrupt the
> development of AsterixDB is valid. And thus it might make sense not to rely
> on an instance that is not hosted by the ASF. However, I think that the current
> instance has virtually no risk of disappearing soon and so this is not an urgent
> topic.
>
> 2) I don’t see how the organization that hosts the Gerrit instance impacts the
> process. Independent of the organization that hosts the Gerrit instance, our
> process has an AsterixDB committer in the loop between Gerrit and the
> canonical project repository.

Like I said, as long as there's a human in the loop who applies the same level
of diligence to reviewing the authorship of commits as I alluded to in my reply
to Chris, I don't see a problem with whichever system these commits
originated from.

The reason I brought an ASF-hosted instance of Gerrit into the discussion was
that I thought you wanted to get rid of the human in the loop.

Thanks,
Roman.



Re: Podling request: Gerrit

Posted by Till Westmann <ti...@apache.org>.
> On Jul 16, 2015, at 1:21 AM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> 
> On Wed, Jul 15, 2015 at 4:17 PM, Till Westmann <tillw@apache.org> wrote:
>> 
>>> On Jul 15, 2015, at 10:02 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>>> 
>>> On Wed, Jul 15, 2015 at 3:13 AM, Ian Maxon <im...@uci.edu> wrote:
>>>>> 2. The ASF has no record of any contributions that are happening on
>>>>> the Gerrit instance at UCI, until a committer decides to push code to
>>>>> the ASF repo.
>>>> 
>>>> I'm afraid I don't understand this point. How is this different than
>>>> any other distributed version control system? In github, nobody is
>>>> aware of a contribution in a fork until a pull request is made. How's
>>>> that any different than what's going on here?
>>> 
>>> In Git (and I'd presume any Git-like DVCS) anything but the push logs
>>> can be spoofed. Having a record of who actually pushed to the repo
>>> is one of the requirements from ASF's standpoint to track chain of custody
>>> for the code that lands in our projects.
>> 
>> But that seems to be the case here. The actual commit is pushed manually
>> by an AsterixDB committer.
> 
> Exactly! Which rewinds us all the way back to my original reply on
> this thread.
> 
> As long as there's a human being in the loop reviewing what's going into the
> repo I don't think I've got any issues with the process.
> 
> But! Asking for an ASF-managed Gerrit instance will remove that human from
> the loop. This is negotiable but would require INFRA having the same trust
> in Gerrit logs as they have in their current Git push logs.

1) I think that we didn’t ask for an ASF hosted instance. But I also think that
David’s concern that the absence of the service would disrupt the
development of AsterixDB is valid. And thus it might make sense not to rely
on an instance that is not hosted by the ASF. However, I think that the current
instance has virtually no risk of disappearing soon and so this is not an urgent
topic.

2) I don’t see how the organization that hosts the Gerrit instance impacts the
process. Independent of the organization that hosts the Gerrit instance, our
process has an AsterixDB committer in the loop between Gerrit and the
canonical project repository.

Cheers,
Till

Re: Podling request: Gerrit

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Wed, Jul 15, 2015 at 4:17 PM, Till Westmann <ti...@apache.org> wrote:
>
>> On Jul 15, 2015, at 10:02 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>>
>> On Wed, Jul 15, 2015 at 3:13 AM, Ian Maxon <im...@uci.edu> wrote:
>>>> 2. The ASF has no record of any contributions that are happening on
>>>> the Gerrit instance at UCI, until a committer decides to push code to
>>>> the ASF repo.
>>>
>>> I'm afraid I don't understand this point. How is this different than
>>> any other distributed version control system? In github, nobody is
>>> aware of a contribution in a fork until a pull request is made. How's
>>> that any different than what's going on here?
>>
>> In Git (and I'd presume any Git-like DVCS) anything but the push logs
>> can be spoofed. Having a record of who actually pushed to the repo
>> is one of the requirements from ASF's standpoint to track chain of custody
>> for the code that lands in our projects.
>
> But that seems to be the case here. The actual commit is pushed manually
> by an AsterixDB committer.

Exactly! Which rewinds us all the way back to my original reply on
this thread.

As long as there's a human being in the loop reviewing what's going into the
repo I don't think I've got any issues with the process.

But! Asking for an ASF-managed Gerrit instance will remove that human from
the loop. This is negotiable but would require INFRA having the same trust
in Gerrit logs as they have in their current Git push logs.

Thanks,
Roman.



Re: Podling request: Gerrit

Posted by Till Westmann <ti...@apache.org>.
> On Jul 15, 2015, at 10:02 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> 
> On Wed, Jul 15, 2015 at 3:13 AM, Ian Maxon <im...@uci.edu> wrote:
>>> 2. The ASF has no record of any contributions that are happening on
>>> the Gerrit instance at UCI, until a committer decides to push code to
>>> the ASF repo.
>> 
>> I'm afraid I don't understand this point. How is this different than
>> any other distributed version control system? In github, nobody is
>> aware of a contribution in a fork until a pull request is made. How's
>> that any different than what's going on here?
> 
> In Git (and I'd presume any Git-like DVCS) anything but the push logs
> can be spoofed. Having a record of who actually pushed to the repo
> is one of the requirements from ASF's standpoint to track chain of custody
> for the code that lands in our projects.

But that seems to be the case here. The actual commit is pushed manually
by an AsterixDB committer.

Till

> Do realize that this unique requirement comes from the fact that
> we're a foundation, not just a code hosting site.
> 
> Thanks,
> Roman.


Re: Podling request: Gerrit

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Wed, Jul 15, 2015 at 3:13 AM, Ian Maxon <im...@uci.edu> wrote:
>> 2. The ASF has no record of any contributions that are happening on
>> the Gerrit instance at UCI, until a committer decides to push code to
>> the ASF repo.
>
> I'm afraid I don't understand this point. How is this different than
> any other distributed version control system? In github, nobody is
> aware of a contribution in a fork until a pull request is made. How's
> that any different than what's going on here?

In Git (and I'd presume any Git-like DVCS) anything but the push logs
can be spoofed. Having a record of who actually pushed to the repo
is one of the requirements from ASF's standpoint to track chain of custody
for the code that lands in our projects.

Do realize that this unique requirement comes from the fact that
we're a foundation, not just a code hosting site.

Thanks,
Roman.



Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
> 1. People are not clearly contributing to Apache AsterixDB when
> submitting a patch via Gerrit at UCI.edu. Think about Section 5 of
> ASLv2.

Then what are they submitting a patch for review to, exactly?

> 2. The ASF has no record of any contributions that are happening on
> the Gerrit instance at UCI, until a committer decides to push code to
> the ASF repo.

I'm afraid I don't understand this point. How is this different than
any other distributed version control system? In github, nobody is
aware of a contribution in a fork until a pull request is made. How's
that any different than what's going on here?

> And from a provenance perspective, we have no records of
> submission of contributions at all.

How is provenance lost? What is the way in which records are kept? Is
there some other information side-channel besides the push records
that have already been discussed?

> 3. Discussion and code review are happening at UCI, within their Gerrit
> instance; there is no record of those discussions at the ASF. (With
> reviews.a.o, Jira, GH Pull Requests, all of that information gets
> copied to one of the project's mailing lists for posterity.)

We can make Gerrit CC the dev@ list for all changes.

> 4. And this is the real issue for me. Gerrit is possessive of git
> repos it manages by nature; it needs and wants control. The very
> nature of Gerrit demands that it be the canonical repo.

I would tend to disagree. For instance, with the way things are right
now, there's nothing stopping us from accepting Github pull requests.
We'd just push them to Gerrit's head.

> We can play
> word games and say that it isn't, or that the repo of record that
> releases are produced from is the ASF repo, but there are a number of
> realities that reflect that it isn't. First, when the mirroring goes
> wrong, the initial call is to rewrite history on the ASF repo [3].

I think there's a misunderstanding of what happened there, then. What
happened is as follows: A committer failed to follow what we, as a
community, had agreed upon as a procedure for committing changes. It
actually looked at first like they had committed a patch which not
only hadn't been code reviewed, but was superseded by another version
of the same high-level change that fixed issues. Hence, it really
looked like a change that we didn't want to keep in the git history if
it could be helped. That's the main reason we wanted to rewrite the
git history. The fact Gerrit contained the correct version is
orthogonal.

> This suggests to me that the gerrit repo is the de facto repo for the
> project.

Nobody clones from Gerrit or sets it as their upstream, so again I disagree.

> Second, Gerrit is where everything is really happening:
> contributions, code review, testing (from a Jenkins instance at UCI).

What, per se, is unique about that? I could point at any number of
Apache projects where the activity is happening mostly in Github pull
requests, and the testing in Travis CI. These are all external
services that the community decided worked best for them. We have
external services that we like too, just different ones.


- Ian


On Tue, Jul 14, 2015 at 8:56 PM, David Nalley <da...@gnsa.us> wrote:
> On Tue, Jul 14, 2015 at 8:08 PM, Till Westmann <ti...@apache.org> wrote:
>>
>> On 14 Jul 2015, at 15:31, David Nalley wrote:
>>
>>> On Tue, Jul 14, 2015 at 1:14 AM, Ian Maxon <im...@uci.edu> wrote:
>>>
>>>> We use Gerrit as
>>>> a tool to do code reviews and to organize the commits, as well as to
>>>> facilitate easy testing. However that's all it's used for- we still
>>>> clone from repositories that come downstream from ASF, not the other
>>>> way around. I'd be interested to understand how this would be
>>>> considered any different than what is done with Github Pull Requests.
>>>>
>>>
>>> So GH PR have a subtle distinction (at least in the way that they are
>>> handled at the ASF). Projects can't merge pull requests into the repo
>>> at github. Non-committers see a workflow that is the Github workflow,
>>> because that's very familiar, and lowers the barrier to contribution.
>>> Committers, however, have a very different workflow than the folks who
>>> typically review and close pull requests on github. They have to take
>>> the patch [1], and merge it into the canonical repository at the ASF,
>>> which then appears in the github repository because of the mirror
>>> process.  This stops the problem of diverging codebases that you are
>>> currently experiencing, calls to rewrite history to align the ASF repo
>>> with the external repo, etc.
>>
>>
>> As Ian indicated AsterixDB's process also requires manual interaction of
>> a committer. The current steps are now documented on the website [2].
>>
>
> So, that's marginally better than some previous examples of similar behavior.
> But I think there are still multiple problems, and I'll try and be
> more explicit about them:
>
> 1. People are not clearly contributing to Apache AsterixDB when
> submitting a patch via Gerrit at UCI.edu. Think about Section 5 of
> ASLv2.
> 2. The ASF has no record of any contributions that are happening on
> the Gerrit instance at UCI, until a committer decides to push code to
> the ASF repo. And from a provenance perspective, we have no records of
> submission of contributions at all.
> 3. Discussion and code review are happening at UCI, within their Gerrit
> instance; there is no record of those discussions at the ASF. (With
> reviews.a.o, Jira, GH Pull Requests, all of that information gets
> copied to one of the project's mailing lists for posterity.)
> 4. And this is the real issue for me. Gerrit is possessive of git
> repos it manages by nature; it needs and wants control. The very
> nature of Gerrit demands that it be the canonical repo. We can play
> word games and say that it isn't, or that the repo of record that
> releases are produced from is the ASF repo, but there are a number of
> realities that reflect that it isn't. First, when the mirroring goes
> wrong, the initial call is to rewrite history on the ASF repo [3].
> This suggests to me that the gerrit repo is the de facto repo for the
> project. Second, Gerrit is where everything is really happening:
> contributions, code review, testing (from a Jenkins instance at UCI).
>
>
>>> There are some other problems, that aren't necessarily as worrisome,
>>> but should be something to consider. First, you're relying on a third
>>> party to provide that resource. That's not inherently a problem, but
>>> we have a number of examples of projects using external tools and
>>> those being shut down or phased out which causes tremendous disruption
>>> to projects. It's also at the old project's home, which might cause
>>> some folks to question whether the project is truly independent, or
>>> not.
>>
>>
>> In my view Gerrit is "just" a tool that the AsterixDB community chose
>> to keep when starting the incubation process. It is non-essential and
>> has been used by developers from different organizations before the
>> incubation started. But I think that its use was and is very beneficial
>> to the project.
>>
>> When we started incubation it seemed to us, that keeping the existing
>> tool would be a good idea as it
>> a) allows for a smoother transition and
>> b) would not put additional requirements on the ASF infrastructure.
>>
>
> I personally like Gerrit. I think it's probably one of the more robust
> review tools in existence, and it's certainly the most extensible
> based on what I've seen. That said, its use in this case is not
> without problems.
>
>> However, I do agree that a shut down of the service (which seems very
>> unlikely at the current point in time) could be a disruption to the
>> project.
>
> We would have said the same thing about Codehaus not too many years ago.
>
>> So it might be better to run this tool on the ASF
>> infrastructure.
>> Should we pursue this?
>
> We've explored Gerrit 2-3 times in the past 24 months. We have seen
> several projects request it over the years. As I've mentioned
> elsewhere in this thread, our most recent exploration was in December,
> and there are a number of issues that would make an ASF-wide instance
> of Gerrit impractically costly to deploy. I also think that, due
> to the provenance requirements that come with version control as I
> understand them, as well as some of the other issues that would come
> into play, infrastructure would not permit a project-specific
> instance of Gerrit to be run on ASF infrastructure.
>
>> Or is it acceptable to keep the tool on external hardware for now?
>> Or do you see fundamental issues with AsterixDB's use of Gerrit?
>>
>
> I do not think it's acceptable to use the tool on external hardware. I
> don't see inherent issues with the tool itself, but also don't think
> it's pragmatic to have it running internally. I know that's a bad
> position that seems to be inflexible for the project itself, but with
> around 200 active projects a bit of flexibility is assumed to be lost.
>
>
> --David
>
>
>>
>>> [1]
>>> https://patch-diff.githubusercontent.com/raw/apache/airavata/pull/18.patch
>>
>>
>> [2] https://asterixdb.incubator.apache.org/pushing.html
>
> [3] http://mail-archives.apache.org/mod_mbox/incubator-asterixdb-dev/201507.mbox/%3cCAN_YF5ztLpaKLnnRSdTeSqB+mJ8Sk6aJ58p_NG9Scx=kBQJ00Q@mail.gmail.com%3e


Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Tue, Jul 14, 2015 at 8:08 PM, Till Westmann <ti...@apache.org> wrote:
>
> On 14 Jul 2015, at 15:31, David Nalley wrote:
>
>> On Tue, Jul 14, 2015 at 1:14 AM, Ian Maxon <im...@uci.edu> wrote:
>>
>>> We use Gerrit as
>>> a tool to do code reviews and to organize the commits, as well as to
>>> facilitate easy testing. However that's all it's used for- we still
>>> clone from repositories that come downstream from ASF, not the other
>>> way around. I'd be interested to understand how this would be
>>> considered any different than what is done with Github Pull Requests.
>>>
>>
>> So GH PR have a subtle distinction (at least in the way that they are
>> handled at the ASF). Projects can't merge pull requests into the repo
>> at github. Non-committers see a workflow that is the Github workflow,
>> because that's very familiar, and lowers the barrier to contribution.
>> Committers, however, have a very different workflow than the folks who
>> typically review and close pull requests on github. They have to take
>> the patch [1], and merge it into the canonical repository at the ASF,
>> which then appears in the github repository because of the mirror
>> process.  This stops the problem of diverging codebases that you are
>> currently experiencing, calls to rewrite history to align the ASF repo
>> with the external repo, etc.
>
>
> As Ian indicated AsterixDB's process also requires manual interaction of
> a committer. The current steps are now documented on the website [2].
>

So, that's marginally better than some previous examples of similar behavior.
But I think there are still multiple problems, and I'll try and be
more explicit about them:

1. People are not clearly contributing to Apache AsterixDB when
submitting a patch via Gerrit at UCI.edu. Think about Section 5 of
ASLv2.
2. The ASF has no record of any contributions that are happening on
the Gerrit instance at UCI, until a committer decides to push code to
the ASF repo. And from a provenance perspective, we have no records of
submission of contributions at all.
3. Discussion and code review are happening at UCI, within their Gerrit
instance; there is no record of those discussions at the ASF. (With
reviews.a.o, Jira, GH Pull Requests, all of that information gets
copied to one of the project's mailing lists for posterity.)
4. And this is the real issue for me. Gerrit is possessive of git
repos it manages by nature; it needs and wants control. The very
nature of Gerrit demands that it be the canonical repo. We can play
word games and say that it isn't, or that the repo of record that
releases are produced from is the ASF repo, but there are a number of
realities that reflect that it isn't. First, when the mirroring goes
wrong, the initial call is to rewrite history on the ASF repo [3].
This suggests to me that the gerrit repo is the de facto repo for the
project. Second, Gerrit is where everything is really happening:
contributions, code review, testing (from a Jenkins instance at UCI).


>> There are some other problems, that aren't necessarily as worrisome,
>> but should be something to consider. First, you're relying on a third
>> party to provide that resource. That's not inherently a problem, but
>> we have a number of examples of projects using external tools and
>> those being shut down or phased out which causes tremendous disruption
>> to projects. It's also at the old project's home, which might cause
>> some folks to question whether the project is truly independent, or
>> not.
>
>
> In my view Gerrit is "just" a tool that the AsterixDB community chose
> to keep when starting the incubation process. It is non-essential and
> has been used by developers from different organizations before the
> incubation started. But I think that its use was and is very beneficial
> to the project.
>
> When we started incubation it seemed to us, that keeping the existing
> tool would be a good idea as it
> a) allows for a smoother transition and
> b) would not put additional requirements on the ASF infrastructure.
>

I personally like Gerrit. I think it's probably one of the more robust
review tools in existence, and it's certainly the most extensible
based on what I've seen. That said, its use in this case is not
without problems.

> However, I do agree that a shut down of the service (which seems very
> unlikely at the current point in time) could be a disruption to the
> project.

We would have said the same thing about Codehaus not too many years ago.

> So it might be better to run this tool on the ASF
> infrastructure.
> Should we pursue this?

We've explored Gerrit 2-3 times in the past 24 months. We have seen
several projects request it over the years. As I've mentioned
elsewhere in this thread, our most recent exploration was in December,
and there are a number of issues that would make an ASF-wide instance
of Gerrit impractically costly to deploy. I also think that, due
to the provenance requirements that come with version control as I
understand them, as well as some of the other issues that would come
into play, infrastructure would not permit a project-specific
instance of Gerrit to be run on ASF infrastructure.

> Or is it acceptable to keep the tool on external hardware for now?
> Or do you see fundamental issues with AsterixDB's use of Gerrit?
>

I do not think it's acceptable to use the tool on external hardware. I
don't see inherent issues with the tool itself, but also don't think
it's pragmatic to have it running internally. I know that's a bad
position that seems to be inflexible for the project itself, but with
around 200 active projects a bit of flexibility is assumed to be lost.


--David


>
>> [1]
>> https://patch-diff.githubusercontent.com/raw/apache/airavata/pull/18.patch
>
>
> [2] https://asterixdb.incubator.apache.org/pushing.html

[3] http://mail-archives.apache.org/mod_mbox/incubator-asterixdb-dev/201507.mbox/%3cCAN_YF5ztLpaKLnnRSdTeSqB+mJ8Sk6aJ58p_NG9Scx=kBQJ00Q@mail.gmail.com%3e



Re: Podling request: Gerrit

Posted by Jake Farrell <jf...@apache.org>.
Hey Till
All commits must occur against the ASF codebase and can not be mirrored
over from a different location (this includes Github, Gerrit, etc.). This
is a mandate that has come from the board level and is not a negotiable
item.

Gerrit has been discussed many times before and each time the answer has
been that without people to help maintain the service, and until Gerrit can
be used as only a review tool without requiring control of the primary
codebase and the resulting merge, it will not be a viable option for the
ASF's use.

We are working to see what we can do to help ease some of the pain with
pull requests, but at this time the tooling available is as stated
previously in this thread. Infra is always happy to take suggestions and
any ideas and we are always looking for volunteers that are interested in
lending a hand and helping improve our ecosystem for all projects. If you
have any questions, please let me know.

-Jake

On Tue, Jul 14, 2015 at 8:08 PM, Till Westmann <ti...@apache.org> wrote:

>
> On 14 Jul 2015, at 15:31, David Nalley wrote:
>
>  On Tue, Jul 14, 2015 at 1:14 AM, Ian Maxon <im...@uci.edu> wrote:
>>
>>  We use Gerrit as
>>> a tool to do code reviews and to organize the commits, as well as to
>>> facilitate easy testing. However that's all it's used for- we still
>>> clone from repositories that come downstream from ASF, not the other
>>> way around. I'd be interested to understand how this would be
>>> considered any different than what is done with Github Pull Requests.
>>>
>>>
>> So GH PR have a subtle distinction (at least in the way that they are
>> handled at the ASF). Projects can't merge pull requests into the repo
>> at github. Non-committers see a workflow that is the Github workflow,
>> because that's very familiar, and lowers the barrier to contribution.
>> Committers, however, have a very different workflow than the folks who
>> typically review and close pull requests on github. They have to take
>> the patch [1], and merge it into the canonical repository at the ASF,
>> which then appears in the github repository because of the mirror
>> process.  This stops the problem of diverging codebases that you are
>> currently experiencing, calls to rewrite history to align the ASF repo
>> with the external repo, etc.
>>
>
> As Ian indicated AsterixDB's process also requires manual interaction of
> a committer. The current steps are now documented on the website [2].
>
>  There are some other problems, that aren't necessarily as worrisome,
>> but should be something to consider. First, you're relying on a third
>> party to provide that resource. That's not inherently a problem, but
>> we have a number of examples of projects using external tools and
>> those being shut down or phased out which causes tremendous disruption
>> to projects. It's also at the old project's home, which might cause
>> some folks to question whether the project is truly independent, or
>> not.
>>
>
> In my view Gerrit is "just" a tool that the AsterixDB community chose
> to keep when starting the incubation process. It is non-essential and
> has been used by developers from different organizations before the
> incubation started. But I think that its use was and is very beneficial
> to the project.
> When we started incubation it seemed to us, that keeping the existing
> tool would be a good idea as it
> a) allows for a smoother transition and
> b) would not put additional requirements on the ASF infrastructure.
>
> However, I do agree that a shut down of the service (which seems very
> unlikely at the current point in time) could be a disruption to the
> project. So it might be better to run this tool on the ASF
> infrastructure.
> Should we pursue this?
> Or is it acceptable to keep the tool on external hardware for now?
> Or do you see fundamental issues with AsterixDB's use of Gerrit?
>
> Thanks,
> Till
>
>  [1]
>> https://patch-diff.githubusercontent.com/raw/apache/airavata/pull/18.patch
>>
>
> [2] https://asterixdb.incubator.apache.org/pushing.html

Re: Podling request: Gerrit

Posted by Till Westmann <ti...@apache.org>.
On 14 Jul 2015, at 15:31, David Nalley wrote:

> On Tue, Jul 14, 2015 at 1:14 AM, Ian Maxon <im...@uci.edu> wrote:
>
>> We use Gerrit as
>> a tool to do code reviews and to organize the commits, as well as to
>> facilitate easy testing. However that's all it's used for- we still
>> clone from repositories that come downstream from ASF, not the other
>> way around. I'd be interested to understand how this would be
>> considered any different than what is done with Github Pull Requests.
>>
>
> So GH PR have a subtle distinction (at least in the way that they are
> handled at the ASF). Projects can't merge pull requests into the repo
> at github. Non-committers see a workflow that is the Github workflow,
> because that's very familiar, and lowers the barrier to contribution.
> Committers, however, have a very different workflow than the folks who
> typically review and close pull requests on github. They have to take
> the patch [1], and merge it into the canonical repository at the ASF,
> which then appears in the github repository because of the mirror
> process.  This stops the problem of diverging codebases that you are
> currently experiencing, calls to rewrite history to align the ASF repo
> with the external repo, etc.

As Ian indicated AsterixDB's process also requires manual interaction of
a committer. The current steps are now documented on the website [2].

> There are some other problems, that aren't necessarily as worrisome,
> but should be something to consider. First, you're relying on a third
> party to provide that resource. That's not inherently a problem, but
> we have a number of examples of projects using external tools and
> those being shut down or phased out which causes tremendous disruption
> to projects. It's also at the old project's home, which might cause
> some folks to question whether the project is truly independent, or
> not.

In my view Gerrit is "just" a tool that the AsterixDB community chose
to keep when starting the incubation process. It is non-essential and
has been used by developers from different organizations before the
incubation started. But I think that its use was and is very beneficial
to the project.
When we started incubation it seemed to us, that keeping the existing
tool would be a good idea as it
a) allows for a smoother transition and
b) would not put additional requirements on the ASF infrastructure.

However, I do agree that a shutdown of the service (which seems very
unlikely at the current point in time) could be a disruption to the
project. So it might be better to run this tool on the ASF
infrastructure.
Should we pursue this?
Or is it acceptable to keep the tool on external hardware for now?
Or do you see fundamental issues with AsterixDB's use of Gerrit?

Thanks,
Till

> [1] 
> https://patch-diff.githubusercontent.com/raw/apache/airavata/pull/18.patch

[2] https://asterixdb.incubator.apache.org/pushing.html

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Tue, Jul 14, 2015 at 1:14 AM, Ian Maxon <im...@uci.edu> wrote:
>> This is pretty far from what the norm is, and is being proposed by a
>> podling, so I'd expect some skepticism. Other podlings, have proposed
>> similar workflows (albeit with extra problems.) and were not allowed
>> to retain that procedure.
>
> I think there's some confusion here about what the workflow exactly is
> at the moment. ASF git is the source of truth here, and nothing goes
> into it without a committer putting it there by hand.

That's good.

> We use Gerrit as
> a tool to do code reviews and to organize the commits, as well as to
> facilitate easy testing. However that's all it's used for- we still
> clone from repositories that come downstream from ASF, not the other
> way around. I'd be interested to understand how this would be
> considered any different than what is done with Github Pull Requests.
>

So GH PRs have a subtle distinction (at least in the way that they are
handled at the ASF). Projects can't merge pull requests into the repo
at github. Non-committers see a workflow that is the Github workflow,
because that's very familiar, and lowers the barrier to contribution.
Committers, however, have a very different workflow than the folks who
typically review and close pull requests on github. They have to take
the patch [1], and merge it into the canonical repository at the ASF,
which then appears in the github repository because of the mirror
process.  This stops the problem of diverging codebases that you are
currently experiencing, calls to rewrite history to align the ASF repo
with the external repo, etc.
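
As a rough illustration of the committer-side step described above, here
is a minimal Python sketch. It builds the raw patch URL following the
pattern of the example link [1] and lists the commands a committer might
then run by hand. The helper names, the remote name "origin" and the
branch name "master" are assumptions made for this example, not ASF
tooling or policy.

```python
from typing import List

# Sketch only: helper names, remote and branch are illustrative assumptions.


def patch_url(owner: str, repo: str, pr_number: int) -> str:
    """Build the raw patch URL for a GitHub pull request (pattern from [1])."""
    return (
        "https://patch-diff.githubusercontent.com/raw/"
        f"{owner}/{repo}/pull/{pr_number}.patch"
    )


def merge_steps(owner: str, repo: str, pr_number: int,
                remote: str = "origin", branch: str = "master") -> List[str]:
    """List the shell commands a committer might run by hand."""
    patch_file = f"pr-{pr_number}.patch"
    return [
        # Download the patch that GitHub generates for the pull request.
        f"curl -L -o {patch_file} {patch_url(owner, repo, pr_number)}",
        f"git checkout {branch}",
        # git am keeps the patch author as the commit author.
        f"git am {patch_file}",
        # Push to the canonical repository; the GitHub mirror follows.
        f"git push {remote} {branch}",
    ]


if __name__ == "__main__":
    for step in merge_steps("apache", "airavata", 18):
        print(step)
```

The point of the sketch is the direction of flow: the change enters the
canonical repository through an explicit push by a committer, and the
mirror only ever follows it.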

There are some other problems that aren't necessarily as worrisome
but are worth considering. First, you're relying on a third
party to provide that resource. That's not inherently a problem, but
we have a number of examples of projects using external tools and
those being shut down or phased out, which causes tremendous disruption
to projects. It's also at the old project's home, which might cause
some folks to question whether the project is truly independent or
not.

--David

[1] https://patch-diff.githubusercontent.com/raw/apache/airavata/pull/18.patch



Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
> This is pretty far from what the norm is, and is being proposed by a
> podling, so I'd expect some skepticism. Other podlings, have proposed
> similar workflows (albeit with extra problems.) and were not allowed
> to retain that procedure.

I think there's some confusion here about what the workflow exactly is
at the moment. ASF git is the source of truth here, and nothing goes
into it without a committer putting it there by hand. We use Gerrit as
a tool to do code reviews and to organize the commits, as well as to
facilitate easy testing. However, that's all it's used for: we still
clone from repositories that come downstream from ASF, not the other
way around. I'd be interested to understand how this would be
considered any different than what is done with Github Pull Requests.

- Ian

On Mon, Jul 13, 2015 at 7:45 PM, David Nalley <da...@gnsa.us> wrote:
> On Mon, Jul 13, 2015 at 9:38 PM, Ian Maxon <im...@uci.edu> wrote:
>>> This is a severe problem and needs to be rectified promptly.
>>> How are commits migrating from the external repository to the ASF repository?
>>> We typically end up missing provenance information that is important
>>> in blind mirroring of content.
>>>
>>
>> There's no blind mirroring going on here. What we have done is to ask
>> every committer to first, review and submit their patch to Gerrit, and
>> then take the submitted commit from Gerrit's master, and mirror that
>> to ASF's git by hand. This way, push logs remain valid.
>>
>
> That's a little better and keeps some problems at bay, but introduces
> others (as you're experiencing now, diverging codebases).
> At the end of the day, VP Legal is the one that has to be satisfied.
> This is pretty far from what the norm is, and is being proposed by a
> podling, so I'd expect some skepticism. Other podlings, have proposed
> similar workflows (albeit with extra problems.) and were not allowed
> to retain that procedure.
>
> --David
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>



Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Mon, Jul 13, 2015 at 9:38 PM, Ian Maxon <im...@uci.edu> wrote:
>> This is a severe problem and needs to be rectified promptly.
>> How are commits migrating from the external repository to the ASF repository?
>> We typically end up missing provenance information that is important
>> in blind mirroring of content.
>>
>
> There's no blind mirroring going on here. What we have done is to ask
> every committer to first, review and submit their patch to Gerrit, and
> then take the submitted commit from Gerrit's master, and mirror that
> to ASF's git by hand. This way, push logs remain valid.
>

That's a little better and keeps some problems at bay, but introduces
others (as you're experiencing now, diverging codebases).
At the end of the day, VP Legal is the one that has to be satisfied.
This is pretty far from what the norm is, and is being proposed by a
podling, so I'd expect some skepticism. Other podlings have proposed
similar workflows (albeit with extra problems) and were not allowed
to retain that procedure.

--David



Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
> This is a severe problem and needs to be rectified promptly.
> How are commits migrating from the external repository to the ASF repository?
> We typically end up missing provenance information that is important
> in blind mirroring of content.
>

There's no blind mirroring going on here. What we have done is to ask
every committer to first review and submit their patch to Gerrit, and
then take the submitted commit from Gerrit's master and mirror that
to ASF's git by hand. This way, push logs remain valid.
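
One way to picture the by-hand mirroring step: it is a plain
fast-forward only while the head of ASF master is an ancestor of the
head of Gerrit's master. The Python sketch below checks that condition
on a toy commit graph. The graph, the commit ids and the function names
are made up for illustration and are not part of the AsterixDB setup.

```python
from typing import Dict, List

# Toy commit graph: commit id -> list of parent ids (all ids are made up).
Graph = Dict[str, List[str]]


def is_ancestor(graph: Graph, ancestor: str, descendant: str) -> bool:
    """Return True if `ancestor` is reachable from `descendant` via parents."""
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(graph.get(commit, []))
    return False


def can_fast_forward(graph: Graph, asf_head: str, gerrit_head: str) -> bool:
    """ASF master can take Gerrit's master unchanged only if it is an ancestor."""
    return is_ancestor(graph, asf_head, gerrit_head)


if __name__ == "__main__":
    history = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
    # ASF master at "b", Gerrit master at "c": a clean fast-forward.
    print(can_fast_forward(history, "b", "c"))
    # ASF master received a direct commit "x": the histories have diverged.
    print(can_fast_forward(history, "x", "c"))
```

In real repositories the same question can be asked with
`git merge-base --is-ancestor <asf-head> <gerrit-head>`.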

- Ian



Re: Podling request: Gerrit

Posted by David Nalley <da...@gnsa.us>.
On Mon, Jul 13, 2015 at 7:24 PM, Jochen Wiedmann
<jo...@gmail.com> wrote:
> Hi,
>
> I am writing as one of the Mentors of the AsterixDB podling.
>
> It recently came to my attention, that there are, in fact, multiple
> Git repositories, which are used by the project, one of them being
> located externally of the ASF. I understand the structure to be like
> this:
>

This is a severe problem and needs to be rectified promptly.
How are commits migrating from the external repository to the ASF repository?
With blind mirroring of content, we typically end up missing important
provenance information.



> +--------+  Commits  +----------------+  Mirrors  +-----------+
> | Gerrit | --------> | Git (External) | --------> | Git (ASF) |
> +--------+           +----------------+           +-----------+
>
> The structure is made like this, because the project members desire
> that no commits can enter without a review, which is done in Gerrit. [2]
> (In the past, this was ensured by a commit hook in the external
> repository. That commit hook possibly still exists, but it doesn't
> prevent code from entering the ASF repository directly without a
> review. This lack of security is currently being discussed by the
> podling's project members.)
>
> I understand the desire, and, to me, it makes sense. OTOH,  I suspect
> that this issue might affect a successful incubation. Hence this mail.
>

Agreed. This needs to be rectified rapidly.


> As Git is slowly gaining ground within the ASF, I'd suggest that a
> possible resolution might be to have a Gerrit instance within the ASF.
> Given how Github pull requests are already discussed by many projects,
> I can imagine that many projects would like to adopt a similar policy.
>

Git is very widely used - slightly over 1/2 of the active projects at
the ASF are using it now as their primary VCS.

That said, we've explored gerrit a number of times, most recently in
December. Just for frame of reference, I was very much in favor of
Gerrit. I thought that there were a number of projects who also wanted
it - but many of those changed their mind over time. In the end we
discovered that there are a number of challenges:

First, Gerrit wants what is best described as exclusive access to git
repositories. It tends to want them on a local filesystem, and
essentially acting as gatekeeper for commits. This isn't inherently a
problem if you have all repos treated this way. But since we don't
have all projects wanting

Second: Gerrit wants every patch author authenticated against a common
authn backend. This would mean folks would need accounts in LDAP. When
we last explored this, our LDAP infrastructure was incapable of handling
what would have been explosive growth in the number of accounts and
authentication requests. We've since made the infrastructure much more
robust and resilient, but deploying gerrit would essentially require
us to have a self-service account creation service, and that's a lot
of work.

At the moment, Infrastructure doesn't see enough demand from projects
requesting Gerrit to make the tremendous investment required. We've
also noticed a number of trends in projects who were interested in
this:

The oldest strategy is from Hadoop, and they have every patch
submitted to Jira. Every patch is automatically detected, and has a
pre-commit test job run, with Jenkins reporting to Jira the results of
the tests.

We have Reviewboard[0], which is gerrit-like, without the problems
listed above.

More recently, we have folks making heavy use of github pull requests,
and there are two primary technologies being used there.
1. Github pull request builder: Jenkins watches for pull requests
against the GH mirror of the repo, and automatically picks up the job,
and then reports the success or failure of that job in the pull
request. [1]

2. TravisCI - The ASF has a paid account with TravisCI and has 30
concurrent builders. Like the Github Pull Request Builder, it watches
for pull requests against a repository and then runs tests, and then
reports against the pull request. [2]

Obviously, there's no automatic merge, or even technical enforcement.
However, most projects are able to use social enforcement (and reverts
if necessary) to ensure that folks aren't committing directly; and
automatic merges would be disallowed anyway since a committer needs to
make an explicit decision to commit.


[0] http://reviews.apache.org
[1] https://blogs.apache.org/infra/entry/github_pull_request_builds_now
[2] https://blogs.apache.org/infra/entry/apache_gains_additional_travis_ci


--David



Re: Podling request: Gerrit

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
IIRC, the problem with the Gerrit workflow is that pushes into the
repo are actually done by a bot. This runs against ASF's desire to
keep push logs that actually make sense.

The setup that you're describing (although the ASCII art came through
broken via Gmail) seems to address that very problem: having a
Gerrit-specific repo that a human being then synchronizes with the
ASF canonical Git repo. This seems like a pretty reasonable way to
accommodate both constraints.

Thanks,
Roman.

On Mon, Jul 13, 2015 at 4:24 PM, Jochen Wiedmann
<jo...@gmail.com> wrote:
> Hi,
>
> I am writing as one of the Mentors of the AsterixDB podling.
>
> It recently came to my attention, that there are, in fact, multiple
> Git repositories, which are used by the project, one of them being
> located externally of the ASF. I understand the structure to be like
> this:
>
> +--------+  Commits  +----------------+  Mirrors  +-----------+
> | Gerrit | --------> | Git (External) | --------> | Git (ASF) |
> +--------+           +----------------+           +-----------+
>
> The structure is made like this, because the project members desire
> that no commits can enter without a review, which is done in Gerrit. [2]
> (In the past, this was ensured by a commit hook in the external
> repository. That commit hook possibly still exists, but it doesn't
> prevent code from entering the ASF repository directly without a
> review. This lack of security is currently being discussed by the
> podling's project members.)
>
> I understand the desire, and, to me, it makes sense. OTOH,  I suspect
> that this issue might affect a successful incubation. Hence this mail.
>
> As Git is slowly gaining ground within the ASF, I'd suggest that a
> possible resolution might be to have a Gerrit instance within the ASF.
> Given how Github pull requests are already discussed by many projects,
> I can imagine that many projects would like to adopt a similar policy.
>
> How about that?
>
> Thanks,
>
> Jochen
>
>
>
>
>
>
> [1] http://mail-archives.apache.org/mod_mbox/incubator-asterixdb-dev/201507.mbox/%3CCAN_YF5zRWZijKOQyYx59%2B7wUyXkPg0P2d-c2hBrx64mNFd4hBg%40mail.gmail.com%3E
> [2] https://en.wikipedia.org/wiki/Gerrit_(software)
>
> --
> Any world that can produce the Taj Mahal, William Shakespeare,
> and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>

Re: Podling request: Gerrit

Posted by Ian Maxon <im...@uci.edu>.
To add onto Jochen's comments, even something lesser than a hosted
Gerrit instance might suffice. The core issue for integrating our
previous Git workflow is that, as I understand it, there's no way to
have "robot" committers to ASF git. We previously had Gerrit acting on
behalf of whoever submitted the commit, and whoever submitted the
commit in this case is necessarily an Apache committer. Gerrit is just
a safe intermediary for performing what otherwise is a more error
prone process of cherry-picking and pushing commits. The person who
submitted and pushed the review is still recorded, via the committer
and author fields of the commit, as well as the reviewed-by fields in
the comment of the commit.
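
To make the last point concrete, here is a minimal Python sketch that
pulls reviewed-by lines out of a commit message. It assumes the footer
form "Reviewed-by: Name <email>". The sample message, the names and the
Change-Id in it are invented for illustration.

```python
import re
from typing import List

# Matches footer lines such as "Reviewed-by: Name <email>" (assumed format).
_REVIEWED_BY = re.compile(
    r"^Reviewed-by:[ \t]*(.+?)[ \t]*$", re.MULTILINE | re.IGNORECASE
)


def reviewers(commit_message: str) -> List[str]:
    """Return the reviewers recorded in the footer of a commit message."""
    return _REVIEWED_BY.findall(commit_message)


if __name__ == "__main__":
    # Hypothetical commit message; every value here is a placeholder.
    message = (
        "Fix example bug\n"
        "\n"
        "Change-Id: I0000000000000000000000000000000000000000\n"
        "Reviewed-by: Example Reviewer <reviewer@example.org>\n"
    )
    print(reviewers(message))
```

The author and committer fields are separate from these footers and can
be shown with `git log --format='%an | %cn'`.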

- Ian

On Mon, Jul 13, 2015 at 4:24 PM, Jochen Wiedmann
<jo...@gmail.com> wrote:
> Hi,
>
> I am writing as one of the Mentors of the AsterixDB podling.
>
> It recently came to my attention, that there are, in fact, multiple
> Git repositories, which are used by the project, one of them being
> located externally of the ASF. I understand the structure to be like
> this:
>
> +--------+  Commits  +----------------+  Mirrors  +-----------+
> | Gerrit | --------> | Git (External) | --------> | Git (ASF) |
> +--------+           +----------------+           +-----------+
>
> The structure is made like this, because the project members desire
> that no commits can enter without a review, which is done in Gerrit. [2]
> (In the past, this was ensured by a commit hook in the external
> repository. That commit hook possibly still exists, but it doesn't
> prevent code from entering the ASF repository directly without a
> review. This lack of security is currently being discussed by the
> podling's project members.)
>
> I understand the desire, and, to me, it makes sense. OTOH,  I suspect
> that this issue might affect a successful incubation. Hence this mail.
>
> As Git is slowly gaining ground within the ASF, I'd suggest that a
> possible resolution might be to have a Gerrit instance within the ASF.
> Given how Github pull requests are already discussed by many projects,
> I can imagine that many projects would like to adopt a similar policy.
>
> How about that?
>
> Thanks,
>
> Jochen
>
>
>
>
>
>
> [1] http://mail-archives.apache.org/mod_mbox/incubator-asterixdb-dev/201507.mbox/%3CCAN_YF5zRWZijKOQyYx59%2B7wUyXkPg0P2d-c2hBrx64mNFd4hBg%40mail.gmail.com%3E
> [2] https://en.wikipedia.org/wiki/Gerrit_(software)
>
> --
> Any world that can produce the Taj Mahal, William Shakespeare,
> and Stripe toothpaste can't be all bad. (C.R. MacNamara, One Two Three)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>
