Posted to dev@harmony.apache.org by Naveen Neelakantam <ne...@uiuc.edu> on 2007/03/01 00:25:02 UTC

Re: [drlvm] Fwd: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4 update 4

On Feb 28, 2007, at 3:40 PM, Weldon Washburn wrote:

> Naveen,
>
> 1)
> I tried Mix.java.load.patch on my 2-way Linux box.  It hangs  
> consistently.
> 2)
> I tried the svn HEAD version of Mix.java.  I can't get it to fail.   
> Can you
> do an "svn update" and see if the baseline Mix.java causes your  
> regression
> tests to hang?

I used "repeat 100 java -cp . stress.Mix" and the test still hangs  
for me.
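
The "repeat 100" loop above stalls on the first hung run. A small harness that bounds each run can count hangs instead of blocking on them. What follows is only a sketch under stated assumptions: the class name RepeatWithTimeout, the countHangs helper, and the 120-second limit are hypothetical and not part of the Harmony tree, and the code relies on Java 9+ APIs (List.of, ProcessBuilder.Redirect.DISCARD).

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class RepeatWithTimeout {
    /** Runs the command `runs` times; returns how many runs exceeded the timeout. */
    static int countHangs(List<String> command, int runs, long timeoutSeconds)
            throws IOException, InterruptedException {
        int hangs = 0;
        for (int i = 0; i < runs; i++) {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    // Discard output so a full pipe buffer cannot look like a hang.
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                hangs++;                 // run did not finish in time
                p.destroyForcibly();     // kill it so the loop can continue
                p.waitFor();
            }
        }
        return hangs;
    }

    public static void main(String[] args) throws Exception {
        // Same command line as in the message above; the timeout is an assumption.
        int hangs = countHangs(List.of("java", "-cp", ".", "stress.Mix"), 100, 120);
        System.out.println("hung runs: " + hangs + " / 100");
    }
}
```

A run that is merely slow will also be counted as a hang, so the timeout should be well above the normal duration of the test.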

> My guess is that Mix.java.load.patch is somehow triggering  
> classloader/vm
> synchronization bugs.  I worry that we will spend a bunch of time  
> fixing
> legitimate bugs with no specific focus on getting chosen enterprise
> workloads running.  Unless an enterprise app exhibits a very high  
> rate of
> class loading, I'd like to set this bug aside for now.  What do you  
> think?

I think you are correct.  We should make sure some app demands
high-rate class loading before chasing this bug down.

> The worst case is that you would have to put stress.Mix on your local
> excludes list.  Another worst case scenario is that we replace
> stress.Mix with regression tests that are influenced by enterprise
> workloads we want to
> try to run.

Replacing stress.Mix makes sense.  A simple way to defer the problem
would be to remove the "load" case from the test.  I've tried doing  
this and stress.Mix then passes consistently for me.  I'd prefer this  
option, because removing the current incarnation of stress.Mix from  
the excludes list would cause my CruiseControl setup to hang.
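
For experimenting with concurrent class loading outside the full test, a standalone load-only stress can be sketched as below. This is not the code from Mix.java or from Mix.java.load.patch, whose contents are not shown in this thread. The names LoadOnlyStress, Payload, and FreshLoader are hypothetical, and the approach (re-defining one trivial class in a fresh class loader per iteration, from several threads) is an assumption about what a "load" case could look like.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadOnlyStress {
    /** Trivial class whose bytes are re-defined in a fresh loader on every iteration. */
    public static class Payload { }

    /** Defines Payload itself instead of delegating, so each instance forces a new load. */
    static class FreshLoader extends ClassLoader {
        private final byte[] bytes;

        FreshLoader(byte[] bytes) {
            super(LoadOnlyStress.class.getClassLoader());
            this.bytes = bytes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Payload.class.getName())) {
                return super.loadClass(name, resolve); // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) c = defineClass(name, bytes, 0, bytes.length);
                if (resolve) resolveClass(c);
                return c;
            }
        }
    }

    /** Reads the compiled bytes of Payload from the class path. */
    static byte[] payloadBytes() throws IOException {
        String res = Payload.class.getName().replace('.', '/') + ".class";
        try (InputStream in = LoadOnlyStress.class.getClassLoader().getResourceAsStream(res)) {
            if (in == null) throw new IOException("class bytes not found: " + res);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return out.toByteArray();
        }
    }

    /** Returns the number of successful loads, or -1 if the workers did not finish in time. */
    static int run(int threads, int loadsPerThread, long timeoutSeconds) throws Exception {
        byte[] bytes = payloadBytes();
        AtomicInteger loaded = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                try {
                    for (int i = 0; i < loadsPerThread; i++) {
                        Class<?> c = Class.forName(Payload.class.getName(), true, new FreshLoader(bytes));
                        // A freshly defined copy is a different Class object than the original.
                        if (c != Payload.class) loaded.incrementAndGet();
                    }
                } catch (ClassNotFoundException e) {
                    throw new IllegalStateException(e);
                } finally {
                    done.countDown();
                }
            });
            worker.setDaemon(true); // let the JVM exit even if a worker is stuck
            worker.start();
        }
        return done.await(timeoutSeconds, TimeUnit.SECONDS) ? loaded.get() : -1;
    }

    public static void main(String[] args) throws Exception {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 8;
        int result = run(threads, 1000, 120);
        System.out.println(result < 0 ? "timed out (possible hang)" : "loads completed: " + result);
    }
}
```

Passing the thread count on the command line makes it easy to compare a single-threaded run against a heavily threaded one. The sketch assumes the compiled Payload class file is reachable as a class path resource.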

Naveen

>
>
>
> On 2/25/07, Naveen Neelakantam <ne...@uiuc.edu> wrote:
>>
>> I tried running the test with only 1 thread and it doesn't hang (I
>> even upped the "period" variable in Mix.java to stress further class
>> loading), so I believe that threading/synchronization is at least
>> involved.
>>
>> I also tried running the test on a single CPU box and it hangs there
>> too.
>>
>> Strangely enough, I tried increasing the number of threads spawned to
>> 1000 and the test no longer seems to hang.
>>
>> Naveen
>>
>> On Feb 25, 2007, at 9:04 PM, Weldon Washburn wrote:
>>
>> > Naveen,
>> >
>> > I looked at your Mix.java.load.patch and have some questions.  Do you
>> > know if the problems you are observing are related to the repetitive
>> > class loading or to basic threading/sync problems?  Or something
>> > completely different?
>> >
>> >    Thanks
>> >         Weldon
>> >
>> > On 2/23/07, Naveen Neelakantam <ne...@uiuc.edu> wrote:
>> >>
>> >> FYI, stress.Mix is still hanging for me.  It looks like there is
>> >> another issue (i.e. patch from HARMONY-2963 does not completely
>> >> solve this issue).
>> >>
>> >> Naveen
>> >>
>> >> Begin forwarded message:
>> >>
>> >> > From: "Naveen Neelakantam (JIRA)" <ji...@apache.org>
>> >> > Date: February 23, 2007 6:36:05 PM CST
>> >> > To: neelakan@uiuc.edu
>> >> > Subject: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4
>> >> > update 4
>> >> >
>> >> >
>> >> >      [ https://issues.apache.org/jira/browse/HARMONY-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>> >> >
>> >> > Naveen Neelakantam updated HARMONY-2803:
>> >> > ----------------------------------------
>> >> >
>> >> >     Attachment: Mix.java.load.patch
>> >> >
>> >> > I took a play out of the Washburn playbook:
>> >> >
>> >> > The issue can be reproduced by patching Mix.java so that it only
>> >> > "load"s.  The attached patch (Mix.java.load.patch) does the trick.
>> >> >
>> >> > The patched stress.Mix will still only hang intermittently.
>> >> >
>> >> >> stress.Mix hangs on RHEL4 update 4
>> >> >> ----------------------------------
>> >> >>
>> >> >>                 Key: HARMONY-2803
>> >> >>                 URL: https://issues.apache.org/jira/browse/HARMONY-2803
>> >> >>             Project: Harmony
>> >> >>          Issue Type: Bug
>> >> >>          Components: build - test - ci
>> >> >>         Environment: RHEL4 update 4, gcc 3.4.6, core2 (i386)
>> >> >>            Reporter: Naveen Neelakantam
>> >> >>         Assigned To: weldon washburn
>> >> >>            Priority: Critical
>> >> >>         Attachments: MegaSpawn.java, Mix.java.load.patch
>> >> >>
>> >> >>
>> >> >> This test consistently hangs on RHEL4 update 4 for i386.  It has
>> >> >> been preventing Cruise Control from making any progress.
>> >> >
>> >> > --
>> >> > This message is automatically generated by JIRA.
>> >> > -
>> >> > You can reply to this email to add a comment to the issue online.
>> >> >
>> >>
>> >>
>> >
>> >
>> > --
>> > Weldon Washburn
>> > Intel Enterprise Solutions Software Division
>>
>>
>
>
> -- 
> Weldon Washburn
> Intel Enterprise Solutions Software Division


Re: [drlvm] Fwd: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4 update 4

Posted by Elena Semukhina <el...@gmail.com>.
On 3/12/07, Elena Semukhina <el...@gmail.com> wrote:
>
>
>
>  On 3/11/07, Naveen Neelakantam <ne...@uiuc.edu> wrote:
> >
> >
> > On Mar 2, 2007, at 12:43 PM, Geir Magnusson Jr. wrote:
> >
> > >
> > > On Feb 28, 2007, at 3:25 PM, Naveen Neelakantam wrote:
> > >
> > >>
> > >> On Feb 28, 2007, at 3:40 PM, Weldon Washburn wrote:
> > >>
> > >>> Naveen,
> > >>>
> > >>> 1)
> > >>> I tried Mix.java.load.patch on my 2-way Linux box.  It hangs
> > >>> consistently.
> > >>> 2)
> > >>> I tried the svn HEAD version of Mix.java.  I can't get it to
> > >>> fail.  Can you
> > >>> do an "svn update" and see if the baseline Mix.java causes your
> > >>> regression
> > >>> tests to hang?
> > >>
> > >> I used "repeat 100 java -cp . stress.Mix" and the test still hangs
> > >> for me.
> > >>
> > >>> My guess is that Mix.java.load.patch is somehow triggering
> > >>> classloader/vm
> > >>> synchronization bugs.  I worry that we will spend a bunch of time
> > >>> fixing
> > >>> legitimate bugs with no specific focus on getting chosen enterprise
> > >>> workloads running.  Unless an enterprise app exhibits a very high
> > >>> rate of
> > >>> class loading, I'd like to set this bug aside for now.  What do
> > >>> you think?
> > >>
> > >> I think you are correct.  We should make sure some app demands
> > >> high-rate class loading before chasing this bug down.
> > >
> > > I don't agree.  I think we should at least understand it (that
> > > doesn't mean fix it if it's hard), but understand why it's
> > > happening.  Even a good faith "I'll spend 2 days trying to figure
> > > this out, but no more" might well be worth it.
> > >
> > > Nothing scares me more than sweeping unaccountable failures under
> > > the rug.
> >
> > Thankfully, the patches from HARMONY-2982 resolved the remaining issues.
>
>
> Does this mean that we can remove stress.Mix from the excluded list on
> linux x86? IIRC, it was excluded just because it failed on RHEL4 for Naveen.
> It passes for me on SLES x86. Does it fail on Ubuntu?
>

Naveen suggested the same on another thread:)


> Thanks,
> Elena
>
> >
> > > geir
> > >
> >
> >
>
>
> --
> Thanks,
> Elena




-- 
Thanks,
Elena

Re: [drlvm] Fwd: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4 update 4

Posted by Elena Semukhina <el...@gmail.com>.
On 3/11/07, Naveen Neelakantam <ne...@uiuc.edu> wrote:
>
>
> On Mar 2, 2007, at 12:43 PM, Geir Magnusson Jr. wrote:
>
> >
> > On Feb 28, 2007, at 3:25 PM, Naveen Neelakantam wrote:
> >
> >>
> >> On Feb 28, 2007, at 3:40 PM, Weldon Washburn wrote:
> >>
> >>> Naveen,
> >>>
> >>> 1)
> >>> I tried Mix.java.load.patch on my 2-way Linux box.  It hangs
> >>> consistently.
> >>> 2)
> >>> I tried the svn HEAD version of Mix.java.  I can't get it to
> >>> fail.  Can you
> >>> do an "svn update" and see if the baseline Mix.java causes your
> >>> regression
> >>> tests to hang?
> >>
> >> I used "repeat 100 java -cp . stress.Mix" and the test still hangs
> >> for me.
> >>
> >>> My guess is that Mix.java.load.patch is somehow triggering
> >>> classloader/vm
> >>> synchronization bugs.  I worry that we will spend a bunch of time
> >>> fixing
> >>> legitimate bugs with no specific focus on getting chosen enterprise
> >>> workloads running.  Unless an enterprise app exhibits a very high
> >>> rate of
> >>> class loading, I'd like to set this bug aside for now.  What do
> >>> you think?
> >>
> >> I think you are correct.  We should make sure some app demands
> >> high-rate class loading before chasing this bug down.
> >
> > I don't agree.  I think we should at least understand it (that
> > doesn't mean fix it if it's hard), but understand why it's
> > happening.  Even a good faith "I'll spend 2 days trying to figure
> > this out, but no more" might well be worth it.
> >
> > Nothing scares me more than sweeping unaccountable failures under
> > the rug.
>
> Thankfully, the patches from HARMONY-2982 resolved the remaining issues.


Does this mean that we can remove stress.Mix from the excluded list on linux
x86? IIRC, it was excluded just because it failed on RHEL4 for Naveen. It
passes for me on SLES x86. Does it fail on Ubuntu?

Thanks,
Elena

>
> > geir
> >
>
>


-- 
Thanks,
Elena

Re: [drlvm] Fwd: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4 update 4

Posted by Naveen Neelakantam <ne...@uiuc.edu>.
On Mar 2, 2007, at 12:43 PM, Geir Magnusson Jr. wrote:

>
> On Feb 28, 2007, at 3:25 PM, Naveen Neelakantam wrote:
>
>>
>> On Feb 28, 2007, at 3:40 PM, Weldon Washburn wrote:
>>
>>> Naveen,
>>>
>>> 1)
>>> I tried Mix.java.load.patch on my 2-way Linux box.  It hangs  
>>> consistently.
>>> 2)
>>> I tried the svn HEAD version of Mix.java.  I can't get it to  
>>> fail.  Can you
>>> do an "svn update" and see if the baseline Mix.java causes your  
>>> regression
>>> tests to hang?
>>
>> I used "repeat 100 java -cp . stress.Mix" and the test still hangs  
>> for me.
>>
>>> My guess is that Mix.java.load.patch is somehow triggering  
>>> classloader/vm
>>> synchronization bugs.  I worry that we will spend a bunch of time  
>>> fixing
>>> legitimate bugs with no specific focus on getting chosen enterprise
>>> workloads running.  Unless an enterprise app exhibits a very high  
>>> rate of
>>> class loading, I'd like to set this bug aside for now.  What do  
>>> you think?
>>
>> I think you are correct.  We should make sure some app demands  
>> high-rate class loading before chasing this bug down.
>
> I don't agree.  I think we should at least understand it (that  
> doesn't mean fix it if it's hard), but understand why it's  
> happening.  Even a good faith "I'll spend 2 days trying to figure  
> this out, but no more" might well be worth it.
>
> Nothing scares me more than sweeping unaccountable failures under  
> the rug.

Thankfully, the patches from HARMONY-2982 resolved the remaining issues.

>
> geir
>


Re: [drlvm] Fwd: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4 update 4

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.
On Mar 13, 2007, at 1:24 AM, Weldon Washburn wrote:

> On 3/2/07, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>>
>>
>> On Feb 28, 2007, at 3:25 PM, Naveen Neelakantam wrote:
>>
>> >
>> > On Feb 28, 2007, at 3:40 PM, Weldon Washburn wrote:
>> >
>> >> Naveen,
>> >>
>> >> 1)
>> >> I tried Mix.java.load.patch on my 2-way Linux box.  It hangs
>> >> consistently.
>> >> 2)
>> >> I tried the svn HEAD version of Mix.java.  I can't get it to
>> >> fail.  Can you
>> >> do an "svn update" and see if the baseline Mix.java causes your
>> >> regression
>> >> tests to hang?
>> >
>> > I used "repeat 100 java -cp . stress.Mix" and the test still hangs
>> > for me.
>> >
>> >> My guess is that Mix.java.load.patch is somehow triggering
>> >> classloader/vm
>> >> synchronization bugs.  I worry that we will spend a bunch of time
>> >> fixing
>> >> legitimate bugs with no specific focus on getting chosen enterprise
>> >> workloads running.  Unless an enterprise app exhibits a very high
>> >> rate of
>> >> class loading, I'd like to set this bug aside for now.  What do
>> >> you think?
>> >
>> > I think you are correct.  We should make sure some app demands
>> > high-rate class loading before chasing this bug down.
>>
>> I don't agree.  I think we should at least understand it (that
>> doesn't mean fix it if it's hard), but understand why it's
>> happening.  Even a good faith "I'll spend 2 days trying to figure
>> this out, but no more" might well be worth it.
>
>
> Geir, as they say in Apache-land, "thank you for volunteering".  I have
> looked and satisfied my own curiosity.  It would be great if you do the
> same and report back.

Why don't you tell us what you found?  Do you understand what the  
problem is beyond some unknown bugs?

I can also tell you that they are being caused by unknown bugs.

>
>> Nothing scares me more than sweeping unaccountable failures under the
>> rug.
>
>
> Geir, with all due respect, these are the kinds of problems one observes
> with early stage systems software.  DRLVM is a construction zone.  It's
> unreasonable to expect to fix roofing problems when we are still working
> on the foundation.

Do you really think that core synchronization issues are the  
construction-equivalent to hanging curtains or choosing floor tiles?

I don't.

I think that this kind of thing is core to the foundation.  Better  
done earlier than later.

geir

>  Believe me, even if the problem were "swept under the
> rug", it will come bouncing back when we try to run advanced workloads.
> Nothing is being hidden here.  If you want to follow where the threading
> focus is right now, take a look at H3288 and H3289.

>
>
>
>
>> geir
>>
>>
>
>
> -- 
> Weldon Washburn
> Intel Enterprise Solutions Software Division


Re: [drlvm] Fwd: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4 update 4

Posted by Weldon Washburn <we...@gmail.com>.
On 3/2/07, Geir Magnusson Jr. <ge...@pobox.com> wrote:
>
>
> On Feb 28, 2007, at 3:25 PM, Naveen Neelakantam wrote:
>
> >
> > On Feb 28, 2007, at 3:40 PM, Weldon Washburn wrote:
> >
> >> Naveen,
> >>
> >> 1)
> >> I tried Mix.java.load.patch on my 2-way Linux box.  It hangs
> >> consistently.
> >> 2)
> >> I tried the svn HEAD version of Mix.java.  I can't get it to
> >> fail.  Can you
> >> do an "svn update" and see if the baseline Mix.java causes your
> >> regression
> >> tests to hang?
> >
> > I used "repeat 100 java -cp . stress.Mix" and the test still hangs
> > for me.
> >
> >> My guess is that Mix.java.load.patch is somehow triggering
> >> classloader/vm
> >> synchronization bugs.  I worry that we will spend a bunch of time
> >> fixing
> >> legitimate bugs with no specific focus on getting chosen enterprise
> >> workloads running.  Unless an enterprise app exhibits a very high
> >> rate of
> >> class loading, I'd like to set this bug aside for now.  What do
> >> you think?
> >
> > I think you are correct.  We should make sure some app demands
> > high-rate class loading before chasing this bug down.
>
> I don't agree.  I think we should at least understand it (that
> doesn't mean fix it if it's hard), but understand why it's
> happening.  Even a good faith "I'll spend 2 days trying to figure
> this out, but no more" might well be worth it.


Geir, as they say in Apache-land, "thank you for volunteering".  I have looked
and satisfied my own curiosity.  It would be great if you do the same and
report back.

> Nothing scares me more than sweeping unaccountable failures under the
> rug.


Geir, with all due respect, these are the kinds of problems one observes with
early stage systems software.  DRLVM is a construction zone.  It's
unreasonable to expect to fix roofing problems when we are still working on
the foundation.  Believe me, even if the problem were "swept under the
rug", it will come bouncing back when we try to run advanced workloads.
Nothing is being hidden here.  If you want to follow where the threading
focus is right now, take a look at H3288 and H3289.




> geir
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm] Fwd: [jira] Updated: (HARMONY-2803) stress.Mix hangs on RHEL4 update 4

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.
On Feb 28, 2007, at 3:25 PM, Naveen Neelakantam wrote:

>
> On Feb 28, 2007, at 3:40 PM, Weldon Washburn wrote:
>
>> Naveen,
>>
>> 1)
>> I tried Mix.java.load.patch on my 2-way Linux box.  It hangs  
>> consistently.
>> 2)
>> I tried the svn HEAD version of Mix.java.  I can't get it to  
>> fail.  Can you
>> do an "svn update" and see if the baseline Mix.java causes your  
>> regression
>> tests to hang?
>
> I used "repeat 100 java -cp . stress.Mix" and the test still hangs  
> for me.
>
>> My guess is that Mix.java.load.patch is somehow triggering  
>> classloader/vm
>> synchronization bugs.  I worry that we will spend a bunch of time  
>> fixing
>> legitimate bugs with no specific focus on getting chosen enterprise
>> workloads running.  Unless an enterprise app exhibits a very high  
>> rate of
>> class loading, I'd like to set this bug aside for now.  What do  
>> you think?
>
> I think you are correct.  We should make sure some app demands
> high-rate class loading before chasing this bug down.

I don't agree.  I think we should at least understand it (that  
doesn't mean fix it if it's hard), but understand why it's  
happening.  Even a good faith "I'll spend 2 days trying to figure  
this out, but no more" might well be worth it.

Nothing scares me more than sweeping unaccountable failures under the  
rug.

geir