Posted to builds@apache.org by Ted Yu <yu...@gmail.com> on 2012/01/07 04:20:48 UTC

HBase TRUNK build 2616 hangs

Hi,
I tried to terminate build 2616 but couldn't:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/

Please stop this build.

Thanks

Re: HBase TRUNK build 2616 hangs

Posted by Niklas Gustavsson <ni...@protocol7.com>.
On Mon, Jan 9, 2012 at 10:44 AM, Andreas Veithen
<an...@gmail.com> wrote:
> Updated the issue with a summary of these observations (and some
> additional observations).

Thanks!

/niklas

Re: HBase TRUNK build 2616 hangs

Posted by Andreas Veithen <an...@gmail.com>.
Updated the issue with a summary of these observations (and some
additional observations).

Andreas

On Sat, Jan 7, 2012 at 18:37, Niklas Gustavsson <ni...@protocol7.com> wrote:
> All of this is excellent information, please add anything significant
> on https://issues.jenkins-ci.org/browse/JENKINS-9688
>
> /niklas
>
> On Sat, Jan 7, 2012 at 2:57 PM, Andreas Veithen
> <an...@gmail.com> wrote:
>> In addition, while the build is waiting on the executor to acquire the
>> lock, it is reported as running on the master.
>>
>> E.g. the axis2-1.6 #225 build right now shows:
>>
>> "Started 42 min ago
>> Build is being executed for 42 min on master"
>>
>> Andreas
>>
>> On Sat, Jan 7, 2012 at 14:33, Andreas Veithen <an...@gmail.com> wrote:
>>> BTW, here is a screenshot that shows the problem:
>>>
>>> http://people.apache.org/~veithen/axis2-builds.png
>>>
>>> The three Axis2 builds all use the same lock.
>>>
>>> Andreas
>>>
>>> On Sat, Jan 7, 2012 at 13:38, Andreas Veithen <an...@gmail.com> wrote:
>>>> We've seen similar issues with Axis2 builds in the past. I had the
>>>> impression that this has something to do with the usage of locks. We
>>>> are using a lock for the Axis2 builds (because they use fixed port
>>>> numbers in unit tests and therefore can't be executed concurrently).
>>>> Interestingly, the HBase build also uses a lock.
>>>>
>>>> Speculating further on the possible cause, what we have seen with
>>>> Axis2 was that sometimes two (or more) different builds (for different
>>>> branches) were triggered at the same time by the completion of a
>>>> common upstream build (e.g. Axiom trunk). Since the Axis2 builds use a
>>>> common lock, one would expect that only one starts execution, while
>>>> the others remain in the build queue. However, what happened is that
>>>> sometimes, two builds started execution in parallel, with one waiting
>>>> for the lock (i.e. instead of waiting in the build queue, it was
>>>> assigned to an executor and waiting there). I had the impression that
>>>> this kind of situation increases the probability of ending up in a
>>>> situation where the build is stuck or reported as being assigned to
>>>> master.
>>>>
>>>> I haven't seen this in a while for the Axis2 builds, but this may be
>>>> simply because the dependencies between builds have changed in the
>>>> meantime.
>>>>
>>>> Andreas
>>>>
>>>> On Sat, Jan 7, 2012 at 13:16, sebb <se...@gmail.com> wrote:
>>>>> On 7 January 2012 03:20, Ted Yu <yu...@gmail.com> wrote:
>>>>>> Hi,
>>>>>> I tried to terminate build 2616 but couldn't:
>>>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/
>>>>>
>>>>> Looks like a Jenkins bug - the summary says
>>>>>
>>>>> "Started 10 hr ago
>>>>> Build is being executed for null on master"
>>>>>
>>>>> and the tooltip text in build history says:
>>>>>
>>>>> "Started null ago
>>>>> Estimated remaining time: null"
>>>>>
>>>>>> Please stop this build.
>>>>>>
>>>>>> Thanks

Re: HBase TRUNK build 2616 hangs

Posted by Niklas Gustavsson <ni...@protocol7.com>.
All of this is excellent information, please add anything significant
on https://issues.jenkins-ci.org/browse/JENKINS-9688

/niklas

On Sat, Jan 7, 2012 at 2:57 PM, Andreas Veithen
<an...@gmail.com> wrote:
> In addition, while the build is waiting on the executor to acquire the
> lock, it is reported as running on the master.
>
> E.g. the axis2-1.6 #225 build right now shows:
>
> "Started 42 min ago
> Build is being executed for 42 min on master"
>
> Andreas
>
> On Sat, Jan 7, 2012 at 14:33, Andreas Veithen <an...@gmail.com> wrote:
>> BTW, here is a screenshot that shows the problem:
>>
>> http://people.apache.org/~veithen/axis2-builds.png
>>
>> The three Axis2 builds all use the same lock.
>>
>> Andreas
>>
>> On Sat, Jan 7, 2012 at 13:38, Andreas Veithen <an...@gmail.com> wrote:
>>> We've seen similar issues with Axis2 builds in the past. I had the
>>> impression that this has something to do with the usage of locks. We
>>> are using a lock for the Axis2 builds (because they use fixed port
>>> numbers in unit tests and therefore can't be executed concurrently).
>>> Interestingly, the HBase build also uses a lock.
>>>
>>> Speculating further on the possible cause, what we have seen with
>>> Axis2 was that sometimes two (or more) different builds (for different
>>> branches) were triggered at the same time by the completion of a
>>> common upstream build (e.g. Axiom trunk). Since the Axis2 builds use a
>>> common lock, one would expect that only one starts execution, while
>>> the others remain in the build queue. However, what happened is that
>>> sometimes, two builds started execution in parallel, with one waiting
>>> for the lock (i.e. instead of waiting in the build queue, it was
>>> assigned to an executor and waiting there). I had the impression that
>>> this kind of situation increases the probability of ending up in a
>>> situation where the build is stuck or reported as being assigned to
>>> master.
>>>
>>> I haven't seen this in a while for the Axis2 builds, but this may be
>>> simply because the dependencies between builds have changed in the
>>> meantime.
>>>
>>> Andreas
>>>
>>> On Sat, Jan 7, 2012 at 13:16, sebb <se...@gmail.com> wrote:
>>>> On 7 January 2012 03:20, Ted Yu <yu...@gmail.com> wrote:
>>>>> Hi,
>>>>> I tried to terminate build 2616 but couldn't:
>>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/
>>>>
>>>> Looks like a Jenkins bug - the summary says
>>>>
>>>> "Started 10 hr ago
>>>> Build is being executed for null on master"
>>>>
>>>> and the tooltip text in build history says:
>>>>
>>>> "Started null ago
>>>> Estimated remaining time: null"
>>>>
>>>>> Please stop this build.
>>>>>
>>>>> Thanks

Re: HBase TRUNK build 2616 hangs

Posted by Andreas Veithen <an...@gmail.com>.
In addition, while the build is waiting on the executor to acquire the
lock, it is reported as running on the master.

E.g. the axis2-1.6 #225 build right now shows:

"Started 42 min ago
Build is being executed for 42 min on master"

Andreas

On Sat, Jan 7, 2012 at 14:33, Andreas Veithen <an...@gmail.com> wrote:
> BTW, here is a screenshot that shows the problem:
>
> http://people.apache.org/~veithen/axis2-builds.png
>
> The three Axis2 builds all use the same lock.
>
> Andreas
>
> On Sat, Jan 7, 2012 at 13:38, Andreas Veithen <an...@gmail.com> wrote:
>> We've seen similar issues with Axis2 builds in the past. I had the
>> impression that this has something to do with the usage of locks. We
>> are using a lock for the Axis2 builds (because they use fixed port
>> numbers in unit tests and therefore can't be executed concurrently).
>> Interestingly, the HBase build also uses a lock.
>>
>> Speculating further on the possible cause, what we have seen with
>> Axis2 was that sometimes two (or more) different builds (for different
>> branches) were triggered at the same time by the completion of a
>> common upstream build (e.g. Axiom trunk). Since the Axis2 builds use a
>> common lock, one would expect that only one starts execution, while
>> the others remain in the build queue. However, what happened is that
>> sometimes, two builds started execution in parallel, with one waiting
>> for the lock (i.e. instead of waiting in the build queue, it was
>> assigned to an executor and waiting there). I had the impression that
>> this kind of situation increases the probability of ending up in a
>> situation where the build is stuck or reported as being assigned to
>> master.
>>
>> I haven't seen this in a while for the Axis2 builds, but this may be
>> simply because the dependencies between builds have changed in the
>> meantime.
>>
>> Andreas
>>
>> On Sat, Jan 7, 2012 at 13:16, sebb <se...@gmail.com> wrote:
>>> On 7 January 2012 03:20, Ted Yu <yu...@gmail.com> wrote:
>>>> Hi,
>>>> I tried to terminate build 2616 but couldn't:
>>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/
>>>
>>> Looks like a Jenkins bug - the summary says
>>>
>>> "Started 10 hr ago
>>> Build is being executed for null on master"
>>>
>>> and the tooltip text in build history says:
>>>
>>> "Started null ago
>>> Estimated remaining time: null"
>>>
>>>> Please stop this build.
>>>>
>>>> Thanks

Re: HBase TRUNK build 2616 hangs

Posted by Andreas Veithen <an...@gmail.com>.
BTW, here is a screenshot that shows the problem:

http://people.apache.org/~veithen/axis2-builds.png

The three Axis2 builds all use the same lock.

Andreas

On Sat, Jan 7, 2012 at 13:38, Andreas Veithen <an...@gmail.com> wrote:
> We've seen similar issues with Axis2 builds in the past. I had the
> impression that this has something to do with the usage of locks. We
> are using a lock for the Axis2 builds (because they use fixed port
> numbers in unit tests and therefore can't be executed concurrently).
> Interestingly, the HBase build also uses a lock.
>
> Speculating further on the possible cause, what we have seen with
> Axis2 was that sometimes two (or more) different builds (for different
> branches) were triggered at the same time by the completion of a
> common upstream build (e.g. Axiom trunk). Since the Axis2 builds use a
> common lock, one would expect that only one starts execution, while
> the others remain in the build queue. However, what happened is that
> sometimes, two builds started execution in parallel, with one waiting
> for the lock (i.e. instead of waiting in the build queue, it was
> assigned to an executor and waiting there). I had the impression that
> this kind of situation increases the probability of ending up in a
> situation where the build is stuck or reported as being assigned to
> master.
>
> I haven't seen this in a while for the Axis2 builds, but this may be
> simply because the dependencies between builds have changed in the
> meantime.
>
> Andreas
>
> On Sat, Jan 7, 2012 at 13:16, sebb <se...@gmail.com> wrote:
>> On 7 January 2012 03:20, Ted Yu <yu...@gmail.com> wrote:
>>> Hi,
>>> I tried to terminate build 2616 but couldn't:
>>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/
>>
>> Looks like a Jenkins bug - the summary says
>>
>> "Started 10 hr ago
>> Build is being executed for null on master"
>>
>> and the tooltip text in build history says:
>>
>> "Started null ago
>> Estimated remaining time: null"
>>
>>> Please stop this build.
>>>
>>> Thanks

Re: HBase TRUNK build 2616 hangs

Posted by Andreas Veithen <an...@gmail.com>.
We've seen similar issues with Axis2 builds in the past. I had the
impression that this has something to do with the usage of locks. We
are using a lock for the Axis2 builds (because they use fixed port
numbers in unit tests and therefore can't be executed concurrently).
Interestingly, the HBase build also uses a lock.
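
To illustrate why fixed port numbers rule out concurrent execution, here
is a minimal Java sketch. It is not taken from the Axis2 test suite, and
the class and method names are made up for illustration. It binds an
arbitrary free port and then tries to bind the same port a second time,
which is roughly what happens when a second build on the same machine
starts a test that listens on the same fixed port:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical demo class; not part of Axis2, HBase, or Jenkins.
class FixedPortConflictSketch {

    // Returns true if a second listener on an already-bound port is rejected.
    static boolean secondBindFails() {
        // Port 0 asks the OS for any free port; this stands in for the
        // "fixed" port that the first build's unit test is listening on.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();
            // The second build's test tries to listen on the same port.
            try (ServerSocket second = new ServerSocket(port)) {
                return false; // both listeners coexist: no conflict
            } catch (BindException e) {
                return true;  // port already in use: the second test fails
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind fails: " + secondBindFails());
    }
}
```

A shared lock avoids the conflict by never letting two such builds run
their tests at the same time.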

Speculating further on the possible cause, what we have seen with
Axis2 was that sometimes two (or more) different builds (for different
branches) were triggered at the same time by the completion of a
common upstream build (e.g. Axiom trunk). Since the Axis2 builds use a
common lock, one would expect that only one starts execution, while
the others remain in the build queue. However, what happened is that
sometimes, two builds started execution in parallel, with one waiting
for the lock (i.e. instead of waiting in the build queue, it was
assigned to an executor and waiting there). I had the impression that
this kind of situation increases the probability of ending up in a
situation where the build is stuck or reported as being assigned to
master.
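
The difference between the expected and the observed behaviour can be
modelled roughly as follows. This is only a toy model in Java, not
Jenkins or lock plugin code, and every name in it is an assumption made
for illustration. It counts how many executors end up occupied when
several builds that share one lock are triggered at the same moment,
depending on whether the lock is checked before a build leaves the queue:

```java
// Hypothetical toy model; it does not reflect how Jenkins is implemented.
class LockDispatchSketch {

    // Number of executors occupied right after the builds are triggered.
    static int occupiedExecutors(boolean checkLockBeforeDispatch,
                                 int executors, int triggeredBuilds) {
        boolean lockHeld = false;
        int occupied = 0;
        for (int build = 0; build < triggeredBuilds; build++) {
            if (occupied == executors) {
                break; // no free executor left: remaining builds stay queued
            }
            if (lockHeld && checkLockBeforeDispatch) {
                continue; // expected behaviour: build waits in the queue
            }
            // The build is assigned to an executor. The first one takes the
            // lock; any later one sits on its executor waiting for the lock.
            occupied++;
            lockHeld = true;
        }
        return occupied;
    }

    public static void main(String[] args) {
        System.out.println("lock checked in queue: "
                + occupiedExecutors(true, 3, 3) + " executor(s) occupied");
        System.out.println("lock checked on executor: "
                + occupiedExecutors(false, 3, 3) + " executor(s) occupied");
    }
}
```

In the second case the extra executors are held by builds that do no
work, which matches what we saw with the Axis2 builds.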

I haven't seen this in a while for the Axis2 builds, but this may be
simply because the dependencies between builds have changed in the
meantime.

Andreas

On Sat, Jan 7, 2012 at 13:16, sebb <se...@gmail.com> wrote:
> On 7 January 2012 03:20, Ted Yu <yu...@gmail.com> wrote:
>> Hi,
>> I tried to terminate build 2616 but couldn't:
>> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/
>
> Looks like a Jenkins bug - the summary says
>
> "Started 10 hr ago
> Build is being executed for null on master"
>
> and the tooltip text in build history says:
>
> "Started null ago
> Estimated remaining time: null"
>
>> Please stop this build.
>>
>> Thanks

Re: HBase TRUNK build 2616 hangs

Posted by sebb <se...@gmail.com>.
On 7 January 2012 03:20, Ted Yu <yu...@gmail.com> wrote:
> Hi,
> I tried to terminate build 2616 but couldn't:
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/

Looks like a Jenkins bug - the summary says

"Started 10 hr ago
Build is being executed for null on master"

and the tooltip text in build history says:

"Started null ago
Estimated remaining time: null"

> Please stop this build.
>
> Thanks

Re: HBase TRUNK build 2616 hangs

Posted by Ted Yu <yu...@gmail.com>.
It has been hanging for 4 days.

FYI

On Fri, Jan 6, 2012 at 7:20 PM, Ted Yu <yu...@gmail.com> wrote:

> Hi,
> I tried to terminate build 2616 but couldn't:
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2616/
>
> Please stop this build.
>
> Thanks
>