Posted to dev@daffodil.apache.org by Steve Lawrence <sl...@apache.org> on 2017/10/11 12:39:25 UTC

Infrastructure Changes for ASF

While we are waiting on the remaining SGAs, I think now is a good time to
start thinking about how the move to ASF infrastructure will affect the
Daffodil project. ASF supports a different infrastructure than we used
in the past, so some changes to our workflow will be required, and some
changes should be made to reduce the barrier to entry for new contributors.

== Documentation ==

Daffodil uses Confluence for user and developer documentation. ASF
provides a Confluence instance, so we just need to transfer the
information. This may be a good time to reorganize our Confluence pages
and remove/update old information, but it should otherwise work exactly
the same.

ASF also provides web hosting for static content (e.g. downloads,
Daffodil high level overview, mailing list info, etc.) as a sort of
landing page for the project. This will need to be developed. I'm not
too familiar with website building tools, but there are many out
there--this will take more research. We should look at what other Apache
projects use as inspiration.

== Issue Tracking ==

ASF provides JIRA for tracking issues, and we even already have an empty
JIRA project set up for us at:

  https://issues.apache.org/jira/projects/DAFFODIL

Daffodil used JIRA before Apache, so the workflow should not change
much. We should probably maintain a very similar workflow in this
regard (e.g. all changes require a bug, assign to self when
starting progress, resolve issues when fixed, etc.). We can flesh out a
formal description and process for issue tracking for new contributors
to follow, but I think this is all fairly standard and will remain
mostly unchanged from what we had before. I'm sure there will be some
changes to the overall workflow (e.g. removal of scala-new, how will
bugs be officially closed, etc.) but they will all be relatively minor
and not really infrastructure related, so I don't want to spend too much
time on that in this email.

Note that one piece of effort related to JIRA is transferring our
existing bugs to the new JIRA. Based on reading through the INFRA JIRA
and seeing other projects do this, we mainly just need to export our
existing bugs as JSON and create a user mapping between the JIRA accounts.
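
A minimal sketch of the user-mapping step, assuming the export is a JSON
list of issues with "reporter" and "assignee" fields. The field names,
the issue key, and the account names are all made up for illustration;
the real export format would need to be confirmed with INFRA.

```python
import json

# Hypothetical mapping from old JIRA usernames to ASF JIRA usernames.
USER_MAP = {"old_alice": "alice", "old_bob": "bob"}

def remap_users(issues, user_map, fields=("reporter", "assignee")):
    """Return a copy of the issues with usernames translated via user_map.

    Usernames missing from the map are left unchanged so they can be
    found and fixed by hand.
    """
    remapped = []
    for issue in issues:
        updated = dict(issue)
        for field in fields:
            name = updated.get(field)
            if name in user_map:
                updated[field] = user_map[name]
        remapped.append(updated)
    return remapped

# "EXAMPLE-1" is a placeholder issue key, not a real Daffodil bug.
exported = json.loads(
    '[{"key": "EXAMPLE-1", "reporter": "old_alice", "assignee": "old_bob"}]'
)
print(json.dumps(remap_users(exported, USER_MAP), indent=2))
```

Leaving unmapped names untouched, rather than failing, makes it easy to
spot accounts that still need an entry in the mapping.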

== Patch Submission & Review ==

This is where we will likely have the most change relative to
infrastructure, and where I am looking to have some more in-depth
discussions. Previously, the Daffodil workflow had all committers push
changes to "review" branches in the main repo, where the changes were
reviewed and finally rebased to the development branch. This could
continue to work for us, but it has some downsides. As we gain more
committers, more review branches could make the main repo pretty messy.
And in general we probably don't want lots of unreviewed code in the
main repo, even if it is on separate branches. Furthermore, and probably
the biggest reason not to continue this practice, contributors who are
not committers would not have the privileges to add review branches to
the main repo, so they would need to follow a different process than
committers. I propose that all committers follow the exact same
contribution process as non-committers, so we need a different patch
submission and review process that works for both. A few options are
described below:

The first, and I think the traditional method for Apache projects, is
for contributors to add a patch to a JIRA ticket as an attachment. This
is convenient in that JIRA tickets and patches are closely tied
together, but creating a patch file and uploading it might not be as
easy as it could be. Once a patch is attached, a process is
automatically kicked off to run tests on the patch and start a review at
reviews.apache.org via ReviewBoard. This seems like a good workflow, but
I personally find ReviewBoard difficult to use, and it lacks some
features that I've become accustomed to after using Crucible for
Daffodil in the past.

A similar method would be to use GitHub. Apache mirrors the Daffodil git
repository to GitHub and, with the use of Apache gitbox, can even
support accepting GitHub pull requests. This has some very obvious
benefits. Many people are already very familiar with GitHub, so it could
be a good way to attract more contributors. It also has an intuitive
interface for creating and accepting pull requests, again reducing the
barrier to entry. GitHub also integrates very cleanly with TravisCI to
test pull requests. Note that JIRA must still be the bug tracker, and
gitbox copies all review comments to the original JIRA bug as comments.
This is good for tracking the review comments, but makes JIRA bugs
pretty messy and hard to follow. Also, there are some criticisms of the
GitHub code review interface, and some people simply do not want or
have a GitHub account. Like the above, it also requires network
connectivity to draft reviews, though this may be a non-issue nowadays.

Another alternative, which is maybe less modern but is pretty tried and
true, is to use something similar to the Linux kernel review process. In this
process, all patches are emailed directly to the mailing list via
git-send-email. Review comments happen as replies to those emails,
allowing for complex and easily branching discussions. Committing a
patch requires that a committer save the email and apply it using
git-am. One big benefit of this process over the others is that patches
and review comments are much more likely to be seen since they go
directly to the dev list. This encourages activity and allows new devs
to learn as they see the patches. It also has a low barrier to
entry--one just needs to configure git-send-email to use SMTP servers of
preference and run a git command. It has also been shown to scale very
well, is well understood, and is well documented. It also follows the
ASF motto of "If it didn't happen on a mailing list, it didn't happen."
Note that this would not remove JIRA for bug tracking, so a downside is
that it may require some manual updates to JIRA such as specifying that
a patch has been submitted to the mailing list. This also does not
tightly integrate with continuous integration systems, so might require
committers to manually test patches (not necessarily a bad thing, and
tools like patchwork/snowpatch exist to send mailing list patches to a
Jenkins server, though not currently supported by Apache infra). Maybe
the biggest downside is that while people are familiar with email, it
doesn't have some nice features of other review tools, like marking
comments as resolved, syntax highlighting, etc. It's simple, but
minimal. The article and comments below have some good discussions about
the pros and cons of email for patches and how it works well for the
Linux kernel:

  https://lwn.net/Articles/702177/
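
As a rough sketch of the email workflow described above, the helpers
below just build the git command lines a contributor and a committer
might run; they do not execute anything. The "outgoing" directory, the
"origin/master" base, and the "patch.mbox" file name are assumptions for
illustration, and git-send-email is assumed to already be configured
with SMTP settings.

```python
import shlex

def contributor_commands(list_address, base="origin/master"):
    """Commands a contributor might run to mail commits made on top of base."""
    return [
        # Write one patch file per commit into ./outgoing
        ["git", "format-patch", "-o", "outgoing", base],
        # Mail the patches; SMTP settings come from the sendemail.* git config
        ["git", "send-email", "--to=" + list_address, "outgoing"],
    ]

def committer_commands(mbox_path):
    """Commands a committer might run after saving the patch email to mbox_path."""
    return [["git", "am", mbox_path]]

commands = contributor_commands("dev@daffodil.apache.org")
commands += committer_commands("patch.mbox")
for argv in commands:
    # Quote each argument so the printed line is safe to paste into a shell
    print(" ".join(shlex.quote(arg) for arg in argv))
```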

I'm sure there are many other options that I have not considered. I'm
definitely open to alternatives.

== Continuous Integration ==

Previously, Daffodil used Bamboo for continuous integration. ASF does
not support this, but does support a few alternatives:

  https://ci.apache.org/

We have had experience setting up Daffodil to run on Jenkins in the
past, so this seems preferable. Though, it looks like both Jenkins and
Buildbot meet the necessary requirements, so either would likely work.
We could also provide a TravisCI configuration so that people that
maintain a github fork (regardless of the Patch Submission process)
could take advantage of that service.

== Maven Repository ==

Daffodil used a Nexus repository on the NCSA servers. Apache infra
provides a Nexus server, so this should be virtually unchanged. We just
need to publish to a different server and tweak our release process to
follow Apache release guidelines.

- Steve

Re: Hipchat account?

Posted by Steve Lawrence <sl...@apache.org>.
You need an invite to create a HipChat account. I think I've just sent
you an invite to your @apache.org email address. That might just be a
guest invite, so if that doesn't let you create a HipChat account you'll
need to contact Infra and have them do it. Let me know if this works and
I can invite the rest of the team.

Once you create an account, follow the links on this page to add it to
your XMPP client of choice:

https://confluence.atlassian.com/hipchatkb/setting-up-xmpp-jabber-clients-for-hipchat-751436251.html

- Steve

On 10/30/2017 03:36 PM, John D. Ament wrote:
> Can you reach out to infra?  Now that hipchat has started to fade away with
> Atlassian's new product, I've seen weird issues like this.  You could jump
> in on the infra chat channel http://infra.chat
> 
> 
> John
> 
> On Mon, Oct 30, 2017 at 3:32 PM Mike Beckerle <mb...@tresys.com> wrote:
> 
>>
>>
>> So our HipChat is now available: https://www.hipchat.com/gJt9EQs5l
>>
>> But when I try to actually login to this (as opposed to using it as
>> guest), it asks for email, and it won't accept mbeckerle@apache.org.
>>
>> Shouldn't that be my credentials on hipchat?
>>
> 


Re: Hipchat account?

Posted by "John D. Ament" <jo...@apache.org>.
Can you reach out to infra?  Now that hipchat has started to fade away with
Atlassian's new product, I've seen weird issues like this.  You could jump
in on the infra chat channel http://infra.chat


John

On Mon, Oct 30, 2017 at 3:32 PM Mike Beckerle <mb...@tresys.com> wrote:

>
>
> So our HipChat is now available: https://www.hipchat.com/gJt9EQs5l
>
> But when I try to actually login to this (as opposed to using it as
> guest), it asks for email, and it won't accept mbeckerle@apache.org.
>
> Shouldn't that be my credentials on hipchat?
>

Hipchat account?

Posted by Mike Beckerle <mb...@tresys.com>.

So our HipChat is now available: https://www.hipchat.com/gJt9EQs5l

But when I try to actually login to this (as opposed to using it as guest), it asks for email, and it won't accept mbeckerle@apache.org.

Shouldn't that be my credentials on hipchat?

Re: Infrastructure Changes for ASF

Posted by Steve Lawrence <sl...@apache.org>.
On 10/13/2017 12:53 PM, Steve Lawrence wrote:
> 
> Left out one thing: online chatting. In the past, we've used HipChat,
> which ASF supports. It looks like another commonly used alternative
> among Apache projects is freenode IRC. I'm not sure of any real benefit
> to one over the other, so I suggest we just continue with HipChat since
> that's what we've used and ASF supports it. Unless there are any
> objections, I'll open a bug with INFRA to create a HipChat room for
> Daffodil.
> 

HipChat is now available: https://www.hipchat.com/gJt9EQs5l

- Steve

Re: Infrastructure Changes for ASF

Posted by Steve Lawrence <sl...@apache.org>.
Left out one thing: online chatting. In the past, we've used HipChat,
which ASF supports. It looks like another commonly used alternative
among Apache projects is freenode IRC. I'm not sure of any real benefit
to one over the other, so I suggest we just continue with HipChat since
that's what we've used and ASF supports it. Unless there are any
objections, I'll open a bug with INFRA to create a HipChat room for
Daffodil.

Re: Infrastructure Changes for ASF

Posted by Mike Beckerle <mb...@tresys.com>.
Very good. I forgot that one can set up a git remote to several other repositories. That solves the visibility issue nicely.


Re: Infrastructure Changes for ASF

Posted by Steve Lawrence <sl...@apache.org>.
On 10/11/2017 11:34 AM, Mike Beckerle wrote:
> Review branches have a lot of value.
> 
> 
> 1) Work Visibility
> 
> 
> I like being able to see OTHER developers' review branches as a means of avoiding duplicate work, seeing whether they are progressing or the branches have gone quiet, and seeing whether their changes are going to create an integration nightmare with mine or with my planned changes.
> 
> 
> So we lose a lot by not having review branches.
> 
> 
> I would suggest maybe one needs two repos. Let's call one "controlled", and the other "freeforall". Only selected people can integrate into the controlled one. On the freeforall one, any project developer can create their own review branches, and it is frequently updated with changes from the controlled one.
> 
> 
> This is an approximation of a git feature that doesn't exist: per-branch access controls.
> 
> 
> 2) Backups
> 
> 
> One issue with the pull-request system where no review branches are created is that a developer does not automatically get a place to checkpoint/backup their work to a repo.
> 
> 
> My development machine does automatic incremental backups, but I believe this is not standard practice among developers who know and love git.
> 
> 
> Many developers rely on being able to push their own branches at will as a means of checkpointing their work. In a patch-driven no-review-branches workflow, one needs a separate remote git repo for this.
> 
> 
> There are ways to address this - e.g., an organization (such as Tresys) can have its own clone of the apache repo, and developers can have all the branches they want on that.
> 
> 
> But let's say there's an individual contributor. They'd have to set up, say, their own fork repo on github to have a place to push to where they are unconstrained. This is perhaps just fine for backups, but visibility... every developer having their own backup repo makes cross-team visibility harder.
> 
> 
> 3) Code-Review with no Integration Intent
> 
> 
> I also want to be sure I understand how I engage code-review by others with no intention whatsoever of integrating into the master. I.e., I just want to get some other people's eyes on some code in its formative stages (early reviews have more leverage than "just before integration" reviews), and I want to do this using the exact same code-review tool chain that is used for other kinds of pre-integration reviews.
> 
> 

I think GitHub forks actually solve all three of these issues. Each
person should have a personal GitHub fork of the main Apache repo.
Anytime they want to contribute a change, regardless of whether they are
a committer, they do a pull request from their fork so the change can
be reviewed. Using your words, the "controlled" repo is the main repo
hosted by Apache that only committers have write access to. The
"freeforall" is each person's individual GitHub fork. So rather than one
freeforall, each person gets their own. This is more in line with the
distributed model of Git.

Issue 1: if you care about what a particular person is working on, then
just add their fork as a git remote and fetch their branches. This is
also nice in that if you only care about the progress of a few people,
you can add only a few remotes and ignore everyone else. This doesn't
scale easily if you want to keep track of 10s or 100s of remotes, but
keeping track of that many people's progress doesn't really scale
either, so that's not really an issue. GitHub will also alert you if
there are conflicts with the upstream "controlled" repo, but that might
only be when you do a pull request.
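
To make the remote-tracking idea concrete, here is a small sketch that
builds the git commands for following one colleague's fork; it only
prints them. The user name "example-dev" and the repository name
"daffodil" are assumptions for illustration, and the remote is simply
named after the user.

```python
def fork_url(user, repo="daffodil"):
    """Build the HTTPS clone URL of a user's GitHub fork (repo name is assumed)."""
    return "https://github.com/{}/{}.git".format(user, repo)

def track_fork_commands(user, repo="daffodil"):
    """Commands to add a colleague's fork as a remote and fetch its branches."""
    return [
        # Name the remote after the user so branches show up as <user>/<branch>
        ["git", "remote", "add", user, fork_url(user, repo)],
        ["git", "fetch", user],
    ]

for argv in track_fork_commands("example-dev"):
    print(" ".join(argv))
```

After the fetch, listing remote branches with "git branch -r" would show
that person's work alongside the main repo's branches.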

Issue 2: since people have forks, they can push to GitHub and get a free
backup of their work. Having this in a personal repo instead of a
freeforall repo is nice since if you're just checkpointing work it
doesn't really affect anyone else--it only affects your personal repo.
You can checkpoint all you want.

Issue 3: if you want to review someone's branch as a sort of RFC and not
an official review for integration, just have them push it up to their
GitHub fork, open a pull request, and add a comment that the PR
is not to be merged but is just for an early review. Once that's done,
the pull request can be closed/rejected so it won't get merged, but it
will still maintain the comments that came out of that review for
history tracking.

> 
> 
> 
> ________________________________
> From: Steve Lawrence <sl...@apache.org>
> Sent: Wednesday, October 11, 2017 10:38:12 AM
> To: dev@daffodil.apache.org; Taylor Wise
> Cc: dev@daffodil.incubator.apache.org
> Subject: Re: Infrastructure Changes for ASF
> 
> On 10/11/2017 10:17 AM, Taylor Wise wrote:
>> I see your point.  In these cases would it make more sense then to link it
>> to the PR when closing the ticket? As in add a comment with a direct link
>> to the PR to more easily view the comments?
> 
> Perhaps. Looking at Apache Spark, it looks like they do not have PR
> comments in JIRA comments, for example:
> 
> PR:   https://github.com/apache/spark/pull/19466
> JIRA: https://issues.apache.org/jira/browse/SPARK-22237
> 
> Some more digging found this INFRA issue:
> 
> https://issues.apache.org/jira/browse/INFRA-11675
> 
> Which was fixed to make the ASF Github Bot add PR comments to the
> worklog rather than the comments. I'm not sure if NiFi just never
> updated some config, or if they chose that, but it seems like PR
> comments in JIRA comments is optional somehow.
> 
>> On Wed, Oct 11, 2017 at 10:04 AM, Steve Lawrence <sl...@apache.org>
>> wrote:
>>
>>> On 10/11/2017 09:36 AM, Taylor Wise wrote:
>>>> On Wed, Oct 11, 2017 at 8:39 AM, Steve Lawrence <sl...@apache.org>
>>>> wrote:
>>>>
>>>>>
>>>>> A similar method would be to use github. Apache mirrors the Daffodil git
>>>>> repository to github, and with the use of Apache gitbox, can even
>>>>> support accepting github pull requests. This has some very obvious
>>>>> benefits. Many people are already very familiar with github and so could
>>>>> be a good way to attract more contributors. It also has an intuitive
>>>>> interface for creating and accepting pull requests, again reducing
>>>>> barrier to entry. Github also very cleanly integrates with TravisCI to
>>>>> test pull requests. Note that JIRA must still be the bug tracker, and
>>>>> gitbox copies all review comments to the original JIRA bug as comments.
>>>>> This is good for tracking the review comments, but makes JIRA bugs
>>>>> pretty messy and hard to follow. Also, there are some criticism of the
>>>>> github code review interface, or people that simple do not want or have
>>>>> a github account. Like the above, it also requires network connectivity
>>>>> to draft reviews, though this may be a non-issue nowadays.
>>>>>
>>>>
>>>> I personally like the interface provided by GitHub.  At least with the
>>>> graphical diff it makes it easy to see the immediate changes as well as
>>>> historically what has changed.  How does it make the JIRA bugs messy?  I'm
>>>> curious, I haven't seen this in action.
>>>>
>>>
>>> Yeah, the github interface is definitely much prettier and somewhat
>>> easier to look at than ReviewBoard or email diffs (though, I'm used to
>>> email diffs on other projects, so it's all the same to me). And the
>>> TravisCI integration is very appealing to me. Though, I would kindof
>>> miss the ability to have nested review comments.
>>>
>>> As an example of the JIRA messiness, Apache NiFi uses the github
>>> workflow. Here is a random pull request with 11 comments on one commit:
>>>
>>>   https://github.com/apache/nifi/pull/2162
>>>
>>> And here is the associated JIRA issue with those comments copied in:
>>>
>>>   https://issues.apache.org/jira/browse/NIFI-1706
>>>
>>> Here's another one that's particularly bad with huge diffs in the JIRA
>>> comments:
>>>
>>>   PR:   https://github.com/apache/nifi/pull/2181
>>>   JIRA: https://issues.apache.org/jira/browse/NIFI-4428
>>>
>>> Neither of these pull requests are particularly complicated too. Imagine
>>> this with some of the big patches we've had in the past with lots of
>>> comments.
>>>
>>> It's not terrible, but I think it makes legitimate JIRA comments
>>> difficult to find, and might even discourage comments in JIRA issues--I
>>> haven't looked alot, but I have yet to find anything other than ASF
>>> Github Bot comments in the NiFi JIRA issues.
>>>
>>
>>
>>
> 
> 


Re: Infrastructure Changes for ASF

Posted by Mike Beckerle <mb...@tresys.com>.
Review branches have a lot of value.


1) Work Visibility


I like being able to see OTHER developers' review branches as a means of avoiding duplicate work, seeing whether they are progressing or the branches have gone quiet, and judging whether their changes are going to create an integration nightmare with mine or with my planned changes.


So we lose a lot by not having review branches.


I would suggest maybe one needs two repos. Let's call one "controlled", and the other "freeforall". The controlled one, only selected people can integrate into. The freeforall one, any project developer can create their own review branches on, and it is frequently updated with changes from the controlled one.


This is an approximation of a git feature that doesn't exist which is per-branch access controls.


2) Backups


One issue with the pull-request system where no review branches are created is that a developer does not automatically get a place to checkpoint/backup their work to a repo.


My development machine does automatic incremental backups, but I believe this is not standard practice among developers who know and love git.


Many developers rely on being able to push their own branches at will as a means of checkpointing their work. In a patch-driven no-review-branches workflow, one needs a separate remote git repo for this.


There are ways to address this - e.g., an organization (such as Tresys) can have its own clone of the apache repo, and developers can have all the branches they want on that.


But let's say there's an individual contributor. They'd have to set up, say, their own fork repo on github to have a place to push to where they are unconstrained. This is perhaps just fine for backups, but visibility... every developer having their own backup repo makes cross-team visibility harder.


3) Code-Review with no Integration Intent


I also want to be sure I understand how I engage code-review by others with no intention whatsoever of integrating into the master. I.e., I just want to get some other people's eyes on some code in its formative stages (early reviews have more leverage than "just before integration" reviews), and I want to do this using the exact same code-review tool chain that is used for other kinds of pre-integration reviews.





________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Wednesday, October 11, 2017 10:38:12 AM
To: dev@daffodil.apache.org; Taylor Wise
Cc: dev@daffodil.incubator.apache.org
Subject: Re: Infrastructure Changes for ASF

On 10/11/2017 10:17 AM, Taylor Wise wrote:
> I see your point.  In these cases would it make more sense then to link it
> to the PR when closing the ticket? As in add a comment with a direct link
> to the PR to more easily view the comments?

Perhaps. Looking at Apache Spark, it looks like they do not have PR
comments in JIRA comments, for example:

PR:   https://github.com/apache/spark/pull/19466
JIRA: https://issues.apache.org/jira/browse/SPARK-22237

Some more digging found this INFRA issue:

https://issues.apache.org/jira/browse/INFRA-11675

Which was fixed to make the ASF Github Bot add PR comments to the
worklog rather than the comments. I'm not sure if NiFi just never
updated some config, or if they chose that, but it seems like PR
comments in JIRA comments is optional somehow.

> On Wed, Oct 11, 2017 at 10:04 AM, Steve Lawrence <sl...@apache.org>
> wrote:
>
>> On 10/11/2017 09:36 AM, Taylor Wise wrote:
>>> On Wed, Oct 11, 2017 at 8:39 AM, Steve Lawrence <sl...@apache.org>
>>> wrote:
>>>
>>>>
>>>> A similar method would be to use github. Apache mirrors the Daffodil git
>>>> repository to github, and with the use of Apache gitbox, can even
>>>> support accepting github pull requests. This has some very obvious
>>>> benefits. Many people are already very familiar with github and so could
>>>> be a good way to attract more contributors. It also has an intuitive
>>>> interface for creating and accepting pull requests, again reducing
>>>> barrier to entry. Github also very cleanly integrates with TravisCI to
>>>> test pull requests. Note that JIRA must still be the bug tracker, and
>>>> gitbox copies all review comments to the original JIRA bug as comments.
>>>> This is good for tracking the review comments, but makes JIRA bugs
>>>> pretty messy and hard to follow. Also, there are some criticism of the
>>>> github code review interface, or people that simple do not want or have
>>>> a github account. Like the above, it also requires network connectivity
>>>> to draft reviews, though this may be a non-issue nowadays.
>>>>
>>>
>>> I personally like the interface provided by GitHub.  At least with the
>>> graphical diff it makes it easy to see the immediate changes as well as
>>> historically what has changed.  How does it make the JIRA bugs messy?  I'm
>>> curious, I haven't seen this in action.
>>>
>>
>> Yeah, the github interface is definitely much prettier and somewhat
>> easier to look at than ReviewBoard or email diffs (though, I'm used to
>> email diffs on other projects, so it's all the same to me). And the
>> TravisCI integration is very appealing to me. Though, I would kindof
>> miss the ability to have nested review comments.
>>
>> As an example of the JIRA messiness, Apache NiFi uses the github
>> workflow. Here is a random pull request with 11 comments on one commit:
>>
>>   https://github.com/apache/nifi/pull/2162
>>
>> And here is the associated JIRA issue with those comments copied in:
>>
>>   https://issues.apache.org/jira/browse/NIFI-1706
>>
>> Here's another one that's particularly bad with huge diffs in the JIRA
>> comments:
>>
>>   PR:   https://github.com/apache/nifi/pull/2181
>>   JIRA: https://issues.apache.org/jira/browse/NIFI-4428
>>
>> Neither of these pull requests are particularly complicated too. Imagine
>> this with some of the big patches we've had in the past with lots of
>> comments.
>>
>> It's not terrible, but I think it makes legitimate JIRA comments
>> difficult to find, and might even discourage comments in JIRA issues--I
>> haven't looked alot, but I have yet to find anything other than ASF
>> Github Bot comments in the NiFi JIRA issues.
>>
>
>
>


Re: Infrastructure Changes for ASF

Posted by Steve Lawrence <sl...@apache.org>.
On 10/11/2017 10:17 AM, Taylor Wise wrote:
> I see your point.  In these cases would it make more sense then to link it
> to the PR when closing the ticket? As in add a comment with a direct link
> to the PR to more easily view the comments?

Perhaps. Looking at Apache Spark, it looks like they do not have PR
comments in JIRA comments, for example:

PR:   https://github.com/apache/spark/pull/19466
JIRA: https://issues.apache.org/jira/browse/SPARK-22237

Some more digging found this INFRA issue:

https://issues.apache.org/jira/browse/INFRA-11675

Which was fixed to make the ASF Github Bot add PR comments to the
worklog rather than the comments. I'm not sure if NiFi just never
updated some config, or if they chose that, but it seems like PR
comments in JIRA comments is optional somehow.

> On Wed, Oct 11, 2017 at 10:04 AM, Steve Lawrence <sl...@apache.org>
> wrote:
> 
>> On 10/11/2017 09:36 AM, Taylor Wise wrote:
>>> On Wed, Oct 11, 2017 at 8:39 AM, Steve Lawrence <sl...@apache.org>
>>> wrote:
>>>
>>>>
>>>> A similar method would be to use github. Apache mirrors the Daffodil git
>>>> repository to github, and with the use of Apache gitbox, can even
>>>> support accepting github pull requests. This has some very obvious
>>>> benefits. Many people are already very familiar with github and so could
>>>> be a good way to attract more contributors. It also has an intuitive
>>>> interface for creating and accepting pull requests, again reducing
>>>> barrier to entry. Github also very cleanly integrates with TravisCI to
>>>> test pull requests. Note that JIRA must still be the bug tracker, and
>>>> gitbox copies all review comments to the original JIRA bug as comments.
>>>> This is good for tracking the review comments, but makes JIRA bugs
>>>> pretty messy and hard to follow. Also, there are some criticism of the
>>>> github code review interface, or people that simple do not want or have
>>>> a github account. Like the above, it also requires network connectivity
>>>> to draft reviews, though this may be a non-issue nowadays.
>>>>
>>>
>>> I personally like the interface provided by GitHub.  At least with the
>>> graphical diff it makes it easy to see the immediate changes as well as
>>> historically what has changed.  How does it make the JIRA bugs messy?  I'm
>>> curious, I haven't seen this in action.
>>>
>>
>> Yeah, the github interface is definitely much prettier and somewhat
>> easier to look at than ReviewBoard or email diffs (though, I'm used to
>> email diffs on other projects, so it's all the same to me). And the
>> TravisCI integration is very appealing to me. Though, I would kindof
>> miss the ability to have nested review comments.
>>
>> As an example of the JIRA messiness, Apache NiFi uses the github
>> workflow. Here is a random pull request with 11 comments on one commit:
>>
>>   https://github.com/apache/nifi/pull/2162
>>
>> And here is the associated JIRA issue with those comments copied in:
>>
>>   https://issues.apache.org/jira/browse/NIFI-1706
>>
>> Here's another one that's particularly bad with huge diffs in the JIRA
>> comments:
>>
>>   PR:   https://github.com/apache/nifi/pull/2181
>>   JIRA: https://issues.apache.org/jira/browse/NIFI-4428
>>
>> Neither of these pull requests are particularly complicated too. Imagine
>> this with some of the big patches we've had in the past with lots of
>> comments.
>>
>> It's not terrible, but I think it makes legitimate JIRA comments
>> difficult to find, and might even discourage comments in JIRA issues--I
>> haven't looked alot, but I have yet to find anything other than ASF
>> Github Bot comments in the NiFi JIRA issues.
>>
> 
> 
> 


Re: Infrastructure Changes for ASF

Posted by Mike Beckerle <mb...@tresys.com>.
I don't want review comments showing up in JIRA tickets as comments.


Keep in mind, one patch set may be the result of, and fix, a number of JIRA tickets. It's much too 1-to-1 to try to align them.


Or I suppose one could require each patch set to have a corresponding JIRA ticket created specifically to represent it, of type "PatchNotice" or similar. But it's not the same thing as a bug report or feature request JIRA ticket.



________________________________
From: Taylor Wise <tw...@gmail.com>
Sent: Wednesday, October 11, 2017 10:17:44 AM
To: dev@daffodil.apache.org
Cc: dev@daffodil.incubator.apache.org
Subject: Re: Infrastructure Changes for ASF

I see your point.  In these cases would it make more sense then to link it
to the PR when closing the ticket? As in add a comment with a direct link
to the PR to more easily view the comments?

On Wed, Oct 11, 2017 at 10:04 AM, Steve Lawrence <sl...@apache.org>
wrote:

> On 10/11/2017 09:36 AM, Taylor Wise wrote:
> > On Wed, Oct 11, 2017 at 8:39 AM, Steve Lawrence <sl...@apache.org>
> > wrote:
> >
> >>
> >> A similar method would be to use github. Apache mirrors the Daffodil git
> >> repository to github, and with the use of Apache gitbox, can even
> >> support accepting github pull requests. This has some very obvious
> >> benefits. Many people are already very familiar with github and so could
> >> be a good way to attract more contributors. It also has an intuitive
> >> interface for creating and accepting pull requests, again reducing
> >> barrier to entry. Github also very cleanly integrates with TravisCI to
> >> test pull requests. Note that JIRA must still be the bug tracker, and
> >> gitbox copies all review comments to the original JIRA bug as comments.
> >> This is good for tracking the review comments, but makes JIRA bugs
> >> pretty messy and hard to follow. Also, there are some criticism of the
> >> github code review interface, or people that simple do not want or have
> >> a github account. Like the above, it also requires network connectivity
> >> to draft reviews, though this may be a non-issue nowadays.
> >>
> >
> > I personally like the interface provided by GitHub.  At least with the
> > graphical diff it makes it easy to see the immediate changes as well as
> > historically what has changed.  How does it make the JIRA bugs messy?  I'm
> > curious, I haven't seen this in action.
> >
>
> Yeah, the github interface is definitely much prettier and somewhat
> easier to look at than ReviewBoard or email diffs (though, I'm used to
> email diffs on other projects, so it's all the same to me). And the
> TravisCI integration is very appealing to me. Though, I would kindof
> miss the ability to have nested review comments.
>
> As an example of the JIRA messiness, Apache NiFi uses the github
> workflow. Here is a random pull request with 11 comments on one commit:
>
>   https://github.com/apache/nifi/pull/2162
>
> And here is the associated JIRA issue with those comments copied in:
>
>   https://issues.apache.org/jira/browse/NIFI-1706
>
> Here's another one that's particularly bad with huge diffs in the JIRA
> comments:
>
>   PR:   https://github.com/apache/nifi/pull/2181
>   JIRA: https://issues.apache.org/jira/browse/NIFI-4428
>
> Neither of these pull requests are particularly complicated too. Imagine
> this with some of the big patches we've had in the past with lots of
> comments.
>
> It's not terrible, but I think it makes legitimate JIRA comments
> difficult to find, and might even discourage comments in JIRA issues--I
> haven't looked alot, but I have yet to find anything other than ASF
> Github Bot comments in the NiFi JIRA issues.
>



--
-Taylor Wise

Re: Infrastructure Changes for ASF

Posted by Taylor Wise <tw...@gmail.com>.
I see your point.  In these cases would it make more sense then to link it
to the PR when closing the ticket? As in add a comment with a direct link
to the PR to more easily view the comments?

On Wed, Oct 11, 2017 at 10:04 AM, Steve Lawrence <sl...@apache.org>
wrote:

> On 10/11/2017 09:36 AM, Taylor Wise wrote:
> > On Wed, Oct 11, 2017 at 8:39 AM, Steve Lawrence <sl...@apache.org>
> > wrote:
> >
> >>
> >> A similar method would be to use github. Apache mirrors the Daffodil git
> >> repository to github, and with the use of Apache gitbox, can even
> >> support accepting github pull requests. This has some very obvious
> >> benefits. Many people are already very familiar with github and so could
> >> be a good way to attract more contributors. It also has an intuitive
> >> interface for creating and accepting pull requests, again reducing
> >> barrier to entry. Github also very cleanly integrates with TravisCI to
> >> test pull requests. Note that JIRA must still be the bug tracker, and
> >> gitbox copies all review comments to the original JIRA bug as comments.
> >> This is good for tracking the review comments, but makes JIRA bugs
> >> pretty messy and hard to follow. Also, there are some criticism of the
> >> github code review interface, or people that simple do not want or have
> >> a github account. Like the above, it also requires network connectivity
> >> to draft reviews, though this may be a non-issue nowadays.
> >>
> >
> > I personally like the interface provided by GitHub.  At least with the
> > graphical diff it makes it easy to see the immediate changes as well as
> > historically what has changed.  How does it make the JIRA bugs messy?  I'm
> > curious, I haven't seen this in action.
> >
>
> Yeah, the github interface is definitely much prettier and somewhat
> easier to look at than ReviewBoard or email diffs (though, I'm used to
> email diffs on other projects, so it's all the same to me). And the
> TravisCI integration is very appealing to me. Though, I would kindof
> miss the ability to have nested review comments.
>
> As an example of the JIRA messiness, Apache NiFi uses the github
> workflow. Here is a random pull request with 11 comments on one commit:
>
>   https://github.com/apache/nifi/pull/2162
>
> And here is the associated JIRA issue with those comments copied in:
>
>   https://issues.apache.org/jira/browse/NIFI-1706
>
> Here's another one that's particularly bad with huge diffs in the JIRA
> comments:
>
>   PR:   https://github.com/apache/nifi/pull/2181
>   JIRA: https://issues.apache.org/jira/browse/NIFI-4428
>
> Neither of these pull requests are particularly complicated too. Imagine
> this with some of the big patches we've had in the past with lots of
> comments.
>
> It's not terrible, but I think it makes legitimate JIRA comments
> difficult to find, and might even discourage comments in JIRA issues--I
> haven't looked alot, but I have yet to find anything other than ASF
> Github Bot comments in the NiFi JIRA issues.
>



-- 
-Taylor Wise

Re: Infrastructure Changes for ASF

Posted by Steve Lawrence <sl...@apache.org>.
On 10/11/2017 09:36 AM, Taylor Wise wrote:
> On Wed, Oct 11, 2017 at 8:39 AM, Steve Lawrence <sl...@apache.org>
> wrote:
> 
>>
>> A similar method would be to use github. Apache mirrors the Daffodil git
>> repository to github, and with the use of Apache gitbox, can even
>> support accepting github pull requests. This has some very obvious
>> benefits. Many people are already very familiar with github and so could
>> be a good way to attract more contributors. It also has an intuitive
>> interface for creating and accepting pull requests, again reducing
>> barrier to entry. Github also very cleanly integrates with TravisCI to
>> test pull requests. Note that JIRA must still be the bug tracker, and
>> gitbox copies all review comments to the original JIRA bug as comments.
>> This is good for tracking the review comments, but makes JIRA bugs
>> pretty messy and hard to follow. Also, there are some criticism of the
>> github code review interface, or people that simple do not want or have
>> a github account. Like the above, it also requires network connectivity
>> to draft reviews, though this may be a non-issue nowadays.
>>
> 
> I personally like the interface provided by GitHub.  At least with the
> graphical diff it makes it easy to see the immediate changes as well as
> historically what has changed.  How does it make the JIRA bugs messy?  I'm
> curious, I haven't seen this in action.
> 

Yeah, the github interface is definitely much prettier and somewhat
easier to look at than ReviewBoard or email diffs (though I'm used to
email diffs on other projects, so it's all the same to me). And the
TravisCI integration is very appealing to me. Though, I would kind of
miss the ability to have nested review comments.

As an example of the JIRA messiness, Apache NiFi uses the github
workflow. Here is a random pull request with 11 comments on one commit:

  https://github.com/apache/nifi/pull/2162

And here is the associated JIRA issue with those comments copied in:

  https://issues.apache.org/jira/browse/NIFI-1706

Here's another one that's particularly bad with huge diffs in the JIRA
comments:

  PR:   https://github.com/apache/nifi/pull/2181
  JIRA: https://issues.apache.org/jira/browse/NIFI-4428

Neither of these pull requests is particularly complicated, either. Imagine
this with some of the big patches we've had in the past with lots of
comments.

It's not terrible, but I think it makes legitimate JIRA comments
difficult to find, and might even discourage comments in JIRA issues--I
haven't looked a lot, but I have yet to find anything other than ASF
Github Bot comments in the NiFi JIRA issues.

Re: Infrastructure Changes for ASF

Posted by Taylor Wise <tw...@gmail.com>.
On Wed, Oct 11, 2017 at 8:39 AM, Steve Lawrence <sl...@apache.org>
wrote:

> While we are waiting on the remaining SGAs, I think now is a good time to
> start thinking about how the move to ASF infrastructure will affect the
> Daffodil project. ASF supports a different infrastructure than we used
> in the past, so some changes will be required to workflow, and some
> changes should be made to reduce the barrier to entry for new contributors.
>
> == Documentation ==
>
> Daffodil uses Confluence for user and developer documentation. ASF
> provides a confluence instance, so we just need to transfer the
> information. This may be a good time to reorganize our confluence pages
> and remove/update old information, but should otherwise work exactly the
> same.
>
> ASF also provides web hosting for static content (e.g. downloads,
> Daffodil high level overview, mailing list info, etc.) as a sort of
> landing page for the project. This will need to be developed. I'm not
> too familiar with website building tools, but there are many out
> there--this will take more research. We should look at what other Apache
> projects use as inspiration.
>
> == Issue Tracking ==
>
> ASF provides JIRA for tracking issues, and we even already have an empty
> JIRA project set up for us at:
>
>   https://issues.apache.org/jira/projects/DAFFODIL
>
> Daffodil used JIRA before Apache so the workflow changes should not be
> too different. We should probably maintain a very similar workflow with
> this regard (e.g. all changes require a bug, assign to self when
> starting progress, resolve issues when fixed, etc.). We can flesh out a
> formal description and process for issue tracking for new contributors
> to follow, but I think this is all fairly standard and will remain
> mostly unchanged from what we had before. I'm sure there will be some
> changes to the overall workflow (e.g. removal of scala-new, how will
> bugs be officially closed, etc.) but they will all be relatively minor
> and not really infrastructure related, so I don't want to spend too much
> time on that in this email.
>
> Note that one piece of effort related to JIRA is transferring our
> existing bugs to the new JIRA. Based on reading through the INFRA JIRA
> and seeing other projects do this, we mainly just need to export our
> existing bugs as JSON and create a user mapping between the JIRA accounts.
>
> == Patch Submission & Review ==
>
> This is where we will likely have the most change relative to
> infrastructure and am looking to have some more in-depth discussions.
> Previously, the Daffodil workflow had all committers making changes to
> "review" branches in the main repo, the changes were reviewed, and
> finally rebased to the development branch. This could continue to work
> for us, but it has some downsides. As we gain more committers, more
> review branches could make the main repo pretty messy. And in general we
> probably don't want lots of unreviewed code in the main repo, even if
> they are on separate branches. Furthermore, and probably the biggest
> reason to not continue this practice, is that contributors that are not
> committers would not have the privileges to add review branches to the
> main repo and so they would need to follow a different process than
> committers. I propose that all committers should follow the exact same
> contribution process as non-committers, and so we need a different patch
> submission and review process that works for both, of which there are a
> few options below:
>
> The first, and I think the traditional method for Apache projects, is
> for contributors to add a patch to a JIRA ticket as an attachment. This
> is convenient in that JIRA tickets and patches are closely tied
> together, but creating a patch file and uploading it might not be as
> easy as it could be. Once a patch is attached, a process is
> automatically kicked off to run tests on the patch and start a review at
> reviews.apache.org via ReviewBoard. This seems like a good workflow, but
> I personally find ReviewBoard difficult to use, and it lacks some features
> that I've become accustomed to after using Crucible for Daffodil in the
> past.
>
> A similar method would be to use github. Apache mirrors the Daffodil git
> repository to github, and with the use of Apache gitbox, can even
> support accepting github pull requests. This has some very obvious
> benefits. Many people are already very familiar with github and so could
> be a good way to attract more contributors. It also has an intuitive
> interface for creating and accepting pull requests, again reducing
> barrier to entry. Github also very cleanly integrates with TravisCI to
> test pull requests. Note that JIRA must still be the bug tracker, and
> gitbox copies all review comments to the original JIRA bug as comments.
> This is good for tracking the review comments, but makes JIRA bugs
> pretty messy and hard to follow. Also, there is some criticism of the
> github code review interface, and some people simply do not want or have
> a github account. Like the above, it also requires network connectivity
> to draft reviews, though this may be a non-issue nowadays.
>

I personally like the interface provided by GitHub.  At least with the
graphical diff it makes it easy to see the immediate changes as well as
historically what has changed.  How does it make the JIRA bugs messy?  I'm
curious, I haven't seen this in action.

>
> Another alternative, which is maybe less modern but is pretty tried and
> true, is to use something similar to the Linux kernel review process. In this
> process, all patches are emailed directly to the mailing list via
> git-send-email. Review comments happen as replies to those emails,
> allowing for complex and easily branching discussions. Committing a
> patch requires that a committer save the email and apply it using
> git-am. One big benefit of this process over the others is that patches
> and review comments are much more likely to be seen since they go
> directly to the dev list. This encourages activity and allows new devs
> to learn as they see the patches. It also has a low barrier to
> entry--one just needs to configure git-send-email to use SMTP servers of
> preference and run a git command. It has also been shown to scale very
> well, is well understood, and is well documented. It also follows the
> ASF motto of "If it didn't happen on a mailing list, it didn't happen."
> Note that this would not remove JIRA for bug tracking, so a downside is
> that it may require some manual updates to JIRA such as specifying that
> a patch has been submitted to the mailing list. This also does not
> tightly integrate with continuous integration systems, so might require
> committers to manually test patches (not necessarily a bad thing, and
> tools like patchwork/snowpatch exist to send mailing list patches to a
> Jenkins server, though not currently supported by Apache infra). Maybe
> the biggest downside is that while people are familiar with email, it
> doesn't have some nice features of other review tools, like marking
> comments as resolved, syntax highlighting, etc. It's simple, but
> minimal.

Wouldn't having this just on the mailing list make it more difficult to go
back and look at a particular bug/ticket?  You'd have to look in JIRA as
well as track down that particular chain of e-mails?


> The article and comments below have some good discussions about
> the pros and cons of email for patches and how it works well for the
> Linux kernel:
>
>   https://lwn.net/Articles/702177/
>
> I'm sure there are many other options that I have not considered. I'm
> definitely open to alternatives.
>
> == Continuous Integration ==
>
> Previously, Daffodil used Bamboo for continuous integration. ASF does
> not support this, but does support a few alternatives:
>
>   https://ci.apache.org/
>
> We have had experience setting up Daffodil to run on Jenkins in the
> past, so this seems preferable. Though, it looks like both Jenkins and
> Buildbot meet the necessary requirements, so either would likely work.
> We could also provide a TravisCI configuration so that people who
> maintain a github fork (regardless of the Patch Submission process)
> could take advantage of that service.
>
> == Maven Repository ==
>
> Daffodil used a Nexus repository on the NCSA servers. Apache infra
> provides a Nexus server, so this should be virtually unchanged. Just
> need to publish to a different server, and tweak our release process to
> follow Apache release guidelines.
>
> - Steve
>



-- 
-Taylor Wise