Posted to dev@yetus.apache.org by Andrew Wang <an...@cloudera.com> on 2017/01/04 20:00:31 UTC

RDM, precommit, and Patch Available JIRAs

Hi folks,

I'm dealing with a workflow issue in Hadoop, and was wondering where to put
the fix.

Hadoop has multiple release branches, e.g. trunk and branch-2. Sometimes, a
patch is committed to trunk, but needs to be amended slightly for branch-2.
In this case, the JIRA remains open and marked Patch Available so it's
picked up automatically by the precommit bot.

However, this conflicts with RDM (releasedocmaker), which looks for "Fixed"
JIRAs with a particular fix version. This means it'll miss JIRAs that are
still open pending a backport.

Option 1: change Hadoop's Yetus invocation so that JIRA_STATUS_RE is
permissive. This would be a workflow change, in that we would resolve a
JIRA as soon as it gets committed, then manually trigger precommit for
additional backports.

Option 2: make RDM look only at fix versions and ignore the state of
the JIRA.
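
As a rough illustration of the two options, the sketch below contrasts selecting issues by resolution plus fix version (what this thread describes RDM doing today) with selecting by fix version alone. The `Issue` record, its field names, and the sample issue keys are hypothetical and are not taken from RDM's code.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Issue:
    """Hypothetical, simplified stand-in for a JIRA issue."""
    key: str
    status: str
    resolution: Optional[str]
    fix_versions: FrozenSet[str]


def select_fixed(issues: List[Issue], version: str) -> List[str]:
    """Behaviour as described in the thread: only issues resolved as Fixed."""
    return [i.key for i in issues
            if version in i.fix_versions and i.resolution == "Fixed"]


def select_by_fix_version(issues: List[Issue], version: str) -> List[str]:
    """Option 2: ignore the issue state and trust the fix version."""
    return [i.key for i in issues if version in i.fix_versions]


issues = [
    Issue("EXAMPLE-1", "Resolved", "Fixed", frozenset({"3.0.0"})),
    # Committed to trunk, still open while the branch-2 backport is pending.
    Issue("EXAMPLE-2", "Patch Available", None, frozenset({"3.0.0"})),
    # A fix version can also be set on an issue that was never fixed.
    Issue("EXAMPLE-3", "Resolved", "Duplicate", frozenset({"3.0.0"})),
]

print(select_fixed(issues, "3.0.0"))
print(select_by_fix_version(issues, "3.0.0"))
```

The first selection misses EXAMPLE-2, which is the gap described above. The second picks it up, but it also picks up any non-Fixed issue that happens to carry the fix version, so option 2 depends on fix versions being kept accurate.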

Thoughts?

Thanks,
Andrew

Re: RDM, precommit, and Patch Available JIRAs

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
> On May 1, 2017, at 8:04 PM, suraj acharya <su...@gmail.com> wrote:
> 
> If I understand the system correctly, there is currently a Jenkins
> job which triggers the Pre-Commit job. Is the discussion here to make that
> more permissive, or is there a check in precommit.sh I have overlooked?

	precommit-admin is a job that sits in an abandoned svn tree in the Hadoop universe.  It's responsible for querying JIRA and kicking off the Jenkins jobs.  I think I filed an issue in YETUS a year+ ago to pull it into our tree.  Lots of projects are using that code (whether they know it or not), so if we replace the master precommit-admin job, we need to communicate it heavily.

	(On the plus side, it'd give us a chance to unify JIRA and github access.)

> Either way, I agree that relaxing Pre-commit is a good idea.  And I can
> volunteer some time to help if a jira is filed.

	Relaxing precommit is going to result in a lot of extra runs (if not execution storms when mass changes happen) if we aren't careful here.  So, to you or anyone else who decides to take this up: this is not a small task.

Re: RDM, precommit, and Patch Available JIRAs

Posted by suraj acharya <su...@gmail.com>.
I agree, Allen. Sometimes the commit messages are more confusing than
helpful.
I agree with the approach of relaxing Pre-Commit to run on closed JIRAs.

If I understand the system correctly, there is currently a Jenkins
job which triggers the Pre-Commit job. Is the discussion here to make that
more permissive, or is there a check in precommit.sh I have overlooked?

Either way, I agree that relaxing Pre-Commit is a good idea, and I can
volunteer some time to help if a JIRA is filed.

-Suraj Acharya

On Mon, May 1, 2017 at 8:06 PM, Allen Wittenauer <aw...@effectivemachines.com>
wrote:

>
> > On May 1, 2017, at 4:47 PM, suraj acharya <su...@gmail.com> wrote:
> >
> > One thing I can think of if the project is very good about entering the
> > JIRA number in the commit is :
> > * we take the JIRA number from the git log.
>
>         FWIW: People typo these numbers or skip them all the time, even for
> projects like Hadoop where that has been a key component of committing.
> Before even writing RDM, this was going to be my first plan of attack.  I
> took a long, hard look at Hadoop's and other projects' commit logs and it
> was just incredibly awful.  Human nature just gets in the way here.
>
>         ... and that's even before you get to the issue of "which commit
> is the first commit to report?"  (Branch comparison doesn't work that well,
> especially due to cherry-picking and branch merges, even in a normal repo.
> Hadoop's is especially jacked since 2.x came from 0.23 and not trunk.)
>
>         In a perfect world... no, wait... in a world where this data can
> be corrected easily in case of mistakes: this would be the ideal solution.
> However, git's model really only supports fail forward without forcing
> everyone to resync their entire history.  Better commit tooling would
> probably help here, but that's likely a bigger uphill battle.
>
>

Re: RDM, precommit, and Patch Available JIRAs

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
> On May 1, 2017, at 4:47 PM, suraj acharya <su...@gmail.com> wrote:
> 
> One thing I can think of if the project is very good about entering the
> JIRA number in the commit is :
> * we take the JIRA number from the git log.

	FWIW: People typo these numbers or skip them all the time, even for projects like Hadoop where that has been a key component of committing.  Before even writing RDM, this was going to be my first plan of attack.  I took a long, hard look at Hadoop's and other projects' commit logs and it was just incredibly awful.  Human nature just gets in the way here.

	... and that's even before you get to the issue of "which commit is the first commit to report?"  (Branch comparison doesn't work that well, especially due to cherry-picking and branch merges, even in a normal repo.  Hadoop's is especially jacked since 2.x came from 0.23 and not trunk.)

	In a perfect world... no, wait... in a world where this data can be corrected easily in case of mistakes: this would be the ideal solution.  However, git's model really only supports fail forward without forcing everyone to resync their entire history.  Better commit tooling would probably help here, but that's likely a bigger uphill battle.


Re: RDM, precommit, and Patch Available JIRAs

Posted by suraj acharya <su...@gmail.com>.
One thing I can think of, if the project is very good about entering the
JIRA number in the commit message, is:
* We take the JIRA number from the git log.
* Compare that with the list of JIRAs present in the RDM. Of course, the
git logic has to start from the previous branch and not from the start of
time.
* This can be an advanced topic.
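
A minimal sketch of that comparison, assuming commit subjects contain an issue key such as `HADOOP-1234` and that the set of keys reported by RDM is already in hand. The function names, the key pattern, and the sample data are hypothetical. Commits without a recognizable key are returned separately, since they cannot be mapped to an issue at all.

```python
import re
import subprocess
from typing import Iterable, List, Set, Tuple

# Assumed key format: upper-case project name, dash, number (e.g. HADOOP-1234).
JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def commit_subjects(rev_range: str) -> List[str]:
    """Return commit subject lines for a range such as 'old-branch..HEAD'."""
    out = subprocess.run(
        ["git", "log", "--format=%s", rev_range],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


def keys_in_log(subjects: Iterable[str]) -> Tuple[Set[str], List[str]]:
    """Split subjects into the issue keys found and the subjects with no key."""
    keys: Set[str] = set()
    unkeyed: List[str] = []
    for subject in subjects:
        found = JIRA_KEY.findall(subject)
        if found:
            keys.update(found)
        else:
            unkeyed.append(subject)
    return keys, unkeyed


def compare(subjects: Iterable[str],
            rdm_keys: Set[str]) -> Tuple[List[str], List[str], List[str]]:
    """Return (in git but not RDM, in RDM but not git, subjects with no key)."""
    git_keys, unkeyed = keys_in_log(subjects)
    return sorted(git_keys - rdm_keys), sorted(rdm_keys - git_keys), unkeyed


sample = [
    "HADOOP-100. Fix the frobnicator.",
    "HADOOP-101. Add a widget.",
    "Fix typo in docs",  # no key: cannot be matched to an issue
]
print(compare(sample, {"HADOOP-100", "HADOOP-102"}))
```

In practice the subjects would come from something like `commit_subjects("<previous branch>..HEAD")`. Note that a mistyped issue number still looks like a valid key, so a check like this can only flag missing keys and mismatches, not tell which side is wrong.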

Otherwise, we can add a flag to RDM to include JIRAs in a non-fixed status
and use the fix version as the filter.

Not sure how much of this will apply to Hadoop, but this is something I can
think of.


-Suraj Acharya

On Mon, May 1, 2017 at 5:58 PM, Andrew Wang <an...@cloudera.com>
wrote:

> Pinging this thread, this problem is becoming more acute for Hadoop since
> we have multiple releases in flight.
>
> Any thoughts on the proposal? The goal is that this would allow manually
> triggering test-patch on a JIRA that is not Patch Available. The automatic
> runs would still look only for the PA status.
>

Re: RDM, precommit, and Patch Available JIRAs

Posted by Andrew Wang <an...@cloudera.com>.
Pinging this thread, this problem is becoming more acute for Hadoop since
we have multiple releases in flight.

Any thoughts on the proposal? The goal is that this would allow manually
triggering test-patch on a JIRA that is not Patch Available. The automatic
runs would still look only for the PA status.
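
One way to picture the proposal, as a sketch only: the status check depends on how the run was triggered. `JIRA_STATUS_RE` is the variable named earlier in this thread; the default pattern shown, the `manual` flag, and the function name are assumptions for illustration, not precommit's actual code.

```python
import re

# Assumed default: automatic runs only consider Patch Available issues.
JIRA_STATUS_RE = r"Patch Available"


def should_test(status: str, manual: bool,
                status_re: str = JIRA_STATUS_RE) -> bool:
    """Decide whether to run test-patch for an issue in the given status."""
    if manual:
        # Manually triggered runs skip the status check entirely.
        return True
    return re.fullmatch(status_re, status) is not None


print(should_test("Patch Available", manual=False))
print(should_test("Resolved", manual=False))
print(should_test("Resolved", manual=True))
```

Keeping the automatic path unchanged is what avoids extra runs: only an explicit manual trigger can test a JIRA that has already been resolved.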

On Thu, Jan 5, 2017 at 1:18 PM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi Ajay,
>
> I think you captured it, but I'll explain once more to be sure:
>
> * RDM looks for JIRAs resolved as "Fixed" when making the release notes
> * In the Hadoop dev workflow, sometimes a patch is committed to a branch,
> then there's a delay before it's backported to another branch.
> * We want to run precommit on the backported patch, and the precommit
> script requires the JIRA to be open and in "Patch Available" status.
>
> Thus, until the JIRA is fully backported, it needs to be open and "Patch
> Available". RDM won't pick up these JIRAs for any fix version since they're
> not in "Fixed" state. This means the release notes will be incomplete, and
> we can't do a release.
>
> Getting back to the proposals, I suggested relaxing either RDM or
> precommit to be more permissive as to the JIRA status. Thinking about it
> more, I'd prefer to relax precommit, since relaxing RDM means we
> potentially need to edit a lot of JIRA fix versions. There's also value
> from having a fix version set even for JIRAs not resolved as "Fixed" (e.g.
> "Duplicate").
>
> Filing an additional JIRA for any unclean backport introduces overhead to
> the happy path and complicates tracking, so I'm not a fan.
>
> We've never been able to avoid typos in commit messages, so we don't have
> a reliable mapping from git log -> JIRAs. I don't think RDM wants to be in
> the business of trying to solve this either.
>
> Best,
> Andrew
>

Re: RDM, precommit, and Patch Available JIRAs

Posted by Andrew Wang <an...@cloudera.com>.
Hi Ajay,

I think you captured it, but I'll explain once more to be sure:

* RDM looks for JIRAs resolved as "Fixed" when making the release notes
* In the Hadoop dev workflow, sometimes a patch is committed to a branch,
then there's a delay before it's backported to another branch.
* We want to run precommit on the backported patch, and the precommit
script requires the JIRA to be open and in "Patch Available" status.

Thus, until the JIRA is fully backported, it needs to be open and "Patch
Available". RDM won't pick up these JIRAs for any fix version since they're
not in "Fixed" state. This means the release notes will be incomplete, and
we can't do a release.

Getting back to the proposals, I suggested relaxing either RDM or precommit
to be more permissive as to the JIRA status. Thinking about it more, I'd
prefer to relax precommit, since relaxing RDM means we potentially need to
edit a lot of JIRA fix versions. There's also value from having a fix
version set even for JIRAs not resolved as "Fixed" (e.g. "Duplicate").

Filing an additional JIRA for any unclean backport introduces overhead to
the happy path and complicates tracking, so I'm not a fan.

We've never been able to avoid typos in commit messages, so we don't have a
reliable mapping from git log -> JIRAs. I don't think RDM wants to be in
the business of trying to solve this either.

Best,
Andrew

On Thu, Jan 5, 2017 at 12:35 PM, Ajay Yadava <aj...@gmail.com>
wrote:

> I am not sure if I understand the problem that you are facing or your
> proposal completely, so please feel free to correct me.
>
> I think what Allen has suggested is perhaps the best way to work around
> this problem. However, if for some reason that's not possible/desirable,
> then one option may be to fall back to the git commit log to determine the
> true status of such issues, i.e. if an open JIRA lists other versions in
> its fix-versions field in addition to the target release version, then it
> *may* have been committed to the branch. For such cases, we check the
> commit log of the branch.
>
> However, this requires a way to identify the commit for the JIRA, e.g. by
> assuming a convention of mentioning the JIRA ID in the commit message.
>
> Are you suggesting this change in RDM or proposing another script to
> identify such JIRAs? Or did I misunderstand the issue completely?
>
> Regards
> Ajay Yadava
>

Re: RDM, precommit, and Patch Available JIRAs

Posted by Ajay Yadava <aj...@gmail.com>.
I am not sure if I understand the problem that you are facing or your
proposal completely, so please feel free to correct me.

I think what Allen has suggested is perhaps the best way to work around this
problem. However, if for some reason that's not possible/desirable, then
one option may be to fall back to the git commit log to determine the true
status of such issues, i.e. if an open JIRA lists other versions in its
fix-versions field in addition to the target release version, then it *may*
have been committed to the branch. For such cases, we check the commit log
of the branch.

However, this requires a way to identify the commit for the JIRA, e.g. by
assuming a convention of mentioning the JIRA ID in the commit message.

Are you suggesting this change in RDM or proposing another script to
identify such JIRAs? Or did I misunderstand the issue completely?

Regards
Ajay Yadava

On Wed, Jan 4, 2017 at 4:11 PM Andrew Wang <an...@cloudera.com> wrote:

> On Wed, Jan 4, 2017 at 1:07 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > On Wed, Jan 4, 2017 at 12:58 PM, Allen Wittenauer <
> > aw@effectivemachines.com> wrote:
> >
> > > > On Jan 4, 2017, at 12:00 PM, Andrew Wang <an...@cloudera.com>
> > > > wrote:
> > > >
> > > > Thoughts?
> > >
> > >         If a back port is taking that long, it's probably better to open
> > > another JIRA for it.
> >
> > It's not that any individual backport takes that long, but there are
> > enough patches flying around that there's always at least one JIRA in this
> > state.
> >
> > My goal is for the branch (and JIRA) to always be in a releasable state.
>
> BTW, one example is that I wrote a script to reconcile JIRA information
> with git state. I'd like to turn on an email when the tool detects an
> error, but it's never passed successfully due to the above issue.
>
> https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-trunk-versions/


Re: RDM, precommit, and Patch Available JIRAs

Posted by Andrew Wang <an...@cloudera.com>.
On Wed, Jan 4, 2017 at 1:07 PM, Andrew Wang <an...@cloudera.com>
wrote:

>
> On Wed, Jan 4, 2017 at 12:58 PM, Allen Wittenauer <
> aw@effectivemachines.com> wrote:
>
>>
>> > On Jan 4, 2017, at 12:00 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>> >
>> > Thoughts?
>>
>>         If a back port is taking that long, it's probably better to open
>> another JIRA for it.
>
> It's not that any individual backport takes that long, but there are
> enough patches flying around that there's always at least one JIRA in this
> state.
>
> My goal is for the branch (and JIRA) to always be in a releasable state.
>

BTW, one example is that I wrote a script to reconcile JIRA information
with git state. I'd like to turn on an email when the tool detects an
error, but it's never passed successfully due to the above issue.

https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-trunk-versions/

Re: RDM, precommit, and Patch Available JIRAs

Posted by Andrew Wang <an...@cloudera.com>.
On Wed, Jan 4, 2017 at 12:58 PM, Allen Wittenauer <aw...@effectivemachines.com>
wrote:

>
> > On Jan 4, 2017, at 12:00 PM, Andrew Wang <an...@cloudera.com>
> wrote:
> >
> > Thoughts?
>
>         If a back port is taking that long, it's probably better to open
> another JIRA for it.

It's not that any individual backport takes that long, but there are
enough patches flying around that there's always at least one JIRA in this
state.

My goal is for the branch (and JIRA) to always be in a releasable state.

Re: RDM, precommit, and Patch Available JIRAs

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
> On Jan 4, 2017, at 12:00 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Thoughts?

	If a back port is taking that long, it's probably better to open another JIRA for it.