You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ant.apache.org by Steve Loughran <st...@apache.org> on 2007/10/04 13:27:24 UTC

race conditions in Ant


I've been running the new build and havent seen any more loops; I think 
race conditions are gone.

Incidentally, given we didnt see any way that the thing could loop, 
given we were using threadlocal to store a per-thread datastructure, and 
given that threadlocals are implicitly thread safe, I'm still not sure 
where the problem arose. but I have been pointed at some bugs in ThreadLocal


http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6550283
"ThreadLocal.initialValue() may be called multiple times in some cases"

There is a bit of a race condition on initialisation, *across all 
threadlocal classes in use in that thread*. if you are only using your 
own classes, you need to sync off something common (like the current 
thread). But you are still vulnerable to any other class using Thread 
local storage making an operation that increases the size of the hash 
table of threadlocal values, in which case you are stuffed.

This is a WONTFIX for Java<1.6.

Accordingly, you can't reliably use ThreadLocal on a multicpu system for 
Java <=1.5 as you cannot be sure that other classes wont stamp on you. 
This is pretty serious, as the reason you would use TLS is to avoid 
concurrency problems.

-Steve

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: race conditions in Ant

Posted by Steve Loughran <st...@apache.org>.
Peter Reilly wrote:
> On 10/5/07, Steve Loughran <st...@apache.org> wrote:
>> Matt Benson wrote:
>>> --- Steve Loughran <st...@apache.org> wrote:
>>>
>>>> Steve Loughran wrote:
>>>>> I've been running the new build and havent seen
>>>> any more loops; I think
>>>>> race conditions are gone.
>>>>>
>>>>> Incidentally, given we didnt see any way that the
>>>> thing could loop,
>>>>> given we were using threadlocal to store a
>>>> per-thread datastructure, and
>>>>> given that threadlocals are implicitly thread
>>>> safe, I'm still not sure
>>>>> where the problem arose. but I have been pointed
>>>> at some bugs in
>>>>> ThreadLocal
>>>>>
>>>>>
>>>>>
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6550283
>>>>> "ThreadLocal.initialValue() may be called multiple
>>>> times in some cases"
>>>>> There is a bit of a race condition on
>>>> initialisation, *across all
>>>>> threadlocal classes in use in that thread*. if you
>>>> are only using your
>>>>> own classes, you need to sync off something common
>>>> (like the current
>>>>> thread). But you are still vulnerable to any other
>>>> class using Thread
>>>>> local storage making an operation that increases
>>>> the size of the hash
>>>>> table of threadlocal values, in which case you are
>>>> stuffed.
>>>>> This is a WONTFIX for Java<1.6.
>>>>>
>>>>> Accordingly, you can't reliably use ThreadLocal on
>>>> a multicpu system for
>>>>> Java <=1.5 as you cannot be sure that other
>>>> classes wont stamp on you.
>>>>> This is pretty serious, as the reason you would
>>>> use TLS is to avoid
>>>>> concurrency problems.
>>>> followup based on looking at the code.
>>> Which code, Steve?
>> Any code that uses threadlocal needs to be looked at. You mustnt
>> subclass it and override the initialvalue method with one that itself
>> creates objects that may use threadlocal. If you leave it with a null
>> init value all is well, but once you start subclassing, you run a risk
>> of same-thread reentrant access to datastructures that, while thread
>> safe, are not locked against re-entrant operations
>>
>> oh, lots of ambuiguity about ThreadLocal cleanup too: stuff in a
>> ThreadLocal can hang around for the life of a thread
> 
> Yes, this is the issue with IVY.


we could do a trick here under <subant>: start a new thread for each run 
and have the original thread block until it exits. It wouldnt be 
parallel, but it may clean up

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: race conditions in Ant

Posted by Steve Loughran <st...@apache.org>.
Peter Reilly wrote:
> On 10/5/07, Steve Loughran <st...@apache.org> wrote:

>> oh, lots of ambuiguity about ThreadLocal cleanup too: stuff in a
>> ThreadLocal can hang around for the life of a thread
> 
> Yes, this is the issue with IVY.
> 
> 
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6209042
>>
>> Which means you ought to clean them up when you are finished.
> 
> At least for threadlocals associated with long-lived threads. ThreadLocals
> associated with threads that have ended should be GCed. (However
> there may be bugs with this).
> 
> Peter
> 
>> I've filed this as todo items @work
>> http://jira.smartfrog.org/jira/browse/SFOS-469
>> http://jira.smartfrog.org/jira/browse/SFOS-470
>>
>> but not looked through ant's code, not yet


only place we use it in SVN_HEAD is in definer, which componentdef, 
taskdef and typedef extend...this creates a stack...this stack would 
leak but its cost is fairly low.

Now, we could add some lifecycle callback; a list of things to invoke at 
the end of a build, definer would register something there on startup, 
and perhaps ivy could add a registration too. Make it harmless to invoke 
more than once, and give the option of adding a weakref as well as a 
hard reference, in case people only want weak lifecycle callbacks. These 
would be pretty low-level events; all errors get caught and logged but 
not propagated.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: race conditions in Ant

Posted by Peter Reilly <pe...@gmail.com>.
On 10/5/07, Steve Loughran <st...@apache.org> wrote:
> Matt Benson wrote:
> > --- Steve Loughran <st...@apache.org> wrote:
> >
> >> Steve Loughran wrote:
> >>>
> >>> I've been running the new build and havent seen
> >> any more loops; I think
> >>> race conditions are gone.
> >>>
> >>> Incidentally, given we didnt see any way that the
> >> thing could loop,
> >>> given we were using threadlocal to store a
> >> per-thread datastructure, and
> >>> given that threadlocals are implicitly thread
> >> safe, I'm still not sure
> >>> where the problem arose. but I have been pointed
> >> at some bugs in
> >>> ThreadLocal
> >>>
> >>>
> >>>
> > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6550283
> >>> "ThreadLocal.initialValue() may be called multiple
> >> times in some cases"
> >>> There is a bit of a race condition on
> >> initialisation, *across all
> >>> threadlocal classes in use in that thread*. if you
> >> are only using your
> >>> own classes, you need to sync off something common
> >> (like the current
> >>> thread). But you are still vulnerable to any other
> >> class using Thread
> >>> local storage making an operation that increases
> >> the size of the hash
> >>> table of threadlocal values, in which case you are
> >> stuffed.
> >>> This is a WONTFIX for Java<1.6.
> >>>
> >>> Accordingly, you can't reliably use ThreadLocal on
> >> a multicpu system for
> >>> Java <=1.5 as you cannot be sure that other
> >> classes wont stamp on you.
> >>> This is pretty serious, as the reason you would
> >> use TLS is to avoid
> >>> concurrency problems.
> >> followup based on looking at the code.
> >
> > Which code, Steve?
>
> Any code that uses threadlocal needs to be looked at. You mustnt
> subclass it and override the initialvalue method with one that itself
> creates objects that may use threadlocal. If you leave it with a null
> init value all is well, but once you start subclassing, you run a risk
> of same-thread reentrant access to datastructures that, while thread
> safe, are not locked against re-entrant operations
>
> oh, lots of ambuiguity about ThreadLocal cleanup too: stuff in a
> ThreadLocal can hang around for the life of a thread

Yes, this is the issue with IVY.


>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6209042
>
> Which means you ought to clean them up when you are finished.

At least for threadlocals associated with long-lived threads. ThreadLocals
associated with threads that have ended should be GCed. (However
there may be bugs with this).

Peter

>
> I've filed this as todo items @work
> http://jira.smartfrog.org/jira/browse/SFOS-469
> http://jira.smartfrog.org/jira/browse/SFOS-470
>
> but not looked through ant's code, not yet
> -Steve
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
> For additional commands, e-mail: dev-help@ant.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: race conditions in Ant

Posted by Matt Benson <gu...@yahoo.com>.
--- Steve Loughran <st...@apache.org> wrote:

> Matt Benson wrote:
> > --- Steve Loughran <st...@apache.org> wrote:
> > 
> >> Steve Loughran wrote:
> >>>
> >>> I've been running the new build and havent seen
> >> any more loops; I think 
> >>> race conditions are gone.
> >>>
> >>> Incidentally, given we didnt see any way that
> the
> >> thing could loop, 
> >>> given we were using threadlocal to store a
> >> per-thread datastructure, and 
> >>> given that threadlocals are implicitly thread
> >> safe, I'm still not sure 
> >>> where the problem arose. but I have been pointed
> >> at some bugs in 
> >>> ThreadLocal
> >>>
> >>>
> >>>
> >
>
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6550283
> >>> "ThreadLocal.initialValue() may be called
> multiple
> >> times in some cases"
> >>> There is a bit of a race condition on
> >> initialisation, *across all 
> >>> threadlocal classes in use in that thread*. if
> you
> >> are only using your 
> >>> own classes, you need to sync off something
> common
> >> (like the current 
> >>> thread). But you are still vulnerable to any
> other
> >> class using Thread 
> >>> local storage making an operation that increases
> >> the size of the hash 
> >>> table of threadlocal values, in which case you
> are
> >> stuffed.
> >>> This is a WONTFIX for Java<1.6.
> >>>
> >>> Accordingly, you can't reliably use ThreadLocal
> on
> >> a multicpu system for 
> >>> Java <=1.5 as you cannot be sure that other
> >> classes wont stamp on you. 
> >>> This is pretty serious, as the reason you would
> >> use TLS is to avoid 
> >>> concurrency problems.
> >> followup based on looking at the code.
> > 
> > Which code, Steve?
> 
> Any code that uses threadlocal needs to be looked
> at.

Oh--for some reason I had gotten the impression from
your previous message that there was some particular
piece of code at which you had already looked.

-Matt

> You mustnt 
> subclass it and override the initialvalue method
> with one that itself 
> creates objects that may use threadlocal. If you
> leave it with a null 
> init value all is well, but once you start
> subclassing, you run a risk 
> of same-thread reentrant access to datastructures
> that, while thread 
> safe, are not locked against re-entrant operations
> 
> oh, lots of ambuiguity about ThreadLocal cleanup
> too: stuff in a 
> ThreadLocal can hang around for the life of a thread
> 
>
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6209042
> 
> Which means you ought to clean them up when you are
> finished.
> 
> I've filed this as todo items @work
> http://jira.smartfrog.org/jira/browse/SFOS-469
> http://jira.smartfrog.org/jira/browse/SFOS-470
> 
> but not looked through ant's code, not yet
> -Steve
> 
> 
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> dev-unsubscribe@ant.apache.org
> For additional commands, e-mail:
> dev-help@ant.apache.org
> 
> 



       
____________________________________________________________________________________
Moody friends. Drama queens. Your life? Nope! - their life, your story. Play Sims Stories at Yahoo! Games.
http://sims.yahoo.com/  

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: race conditions in Ant

Posted by Steve Loughran <st...@apache.org>.
Matt Benson wrote:
> --- Steve Loughran <st...@apache.org> wrote:
> 
>> Steve Loughran wrote:
>>>
>>> I've been running the new build and havent seen
>> any more loops; I think 
>>> race conditions are gone.
>>>
>>> Incidentally, given we didnt see any way that the
>> thing could loop, 
>>> given we were using threadlocal to store a
>> per-thread datastructure, and 
>>> given that threadlocals are implicitly thread
>> safe, I'm still not sure 
>>> where the problem arose. but I have been pointed
>> at some bugs in 
>>> ThreadLocal
>>>
>>>
>>>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6550283
>>> "ThreadLocal.initialValue() may be called multiple
>> times in some cases"
>>> There is a bit of a race condition on
>> initialisation, *across all 
>>> threadlocal classes in use in that thread*. if you
>> are only using your 
>>> own classes, you need to sync off something common
>> (like the current 
>>> thread). But you are still vulnerable to any other
>> class using Thread 
>>> local storage making an operation that increases
>> the size of the hash 
>>> table of threadlocal values, in which case you are
>> stuffed.
>>> This is a WONTFIX for Java<1.6.
>>>
>>> Accordingly, you can't reliably use ThreadLocal on
>> a multicpu system for 
>>> Java <=1.5 as you cannot be sure that other
>> classes wont stamp on you. 
>>> This is pretty serious, as the reason you would
>> use TLS is to avoid 
>>> concurrency problems.
>> followup based on looking at the code.
> 
> Which code, Steve?

Any code that uses threadlocal needs to be looked at. You mustnt 
subclass it and override the initialvalue method with one that itself 
creates objects that may use threadlocal. If you leave it with a null 
init value all is well, but once you start subclassing, you run a risk 
of same-thread reentrant access to datastructures that, while thread 
safe, are not locked against re-entrant operations

oh, lots of ambuiguity about ThreadLocal cleanup too: stuff in a 
ThreadLocal can hang around for the life of a thread

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6209042

Which means you ought to clean them up when you are finished.

I've filed this as todo items @work
http://jira.smartfrog.org/jira/browse/SFOS-469
http://jira.smartfrog.org/jira/browse/SFOS-470

but not looked through ant's code, not yet
-Steve



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: race conditions in Ant

Posted by Matt Benson <gu...@yahoo.com>.
--- Steve Loughran <st...@apache.org> wrote:

> Steve Loughran wrote:
> > 
> > 
> > I've been running the new build and havent seen
> any more loops; I think 
> > race conditions are gone.
> > 
> > Incidentally, given we didnt see any way that the
> thing could loop, 
> > given we were using threadlocal to store a
> per-thread datastructure, and 
> > given that threadlocals are implicitly thread
> safe, I'm still not sure 
> > where the problem arose. but I have been pointed
> at some bugs in 
> > ThreadLocal
> > 
> > 
> >
>
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6550283
> > "ThreadLocal.initialValue() may be called multiple
> times in some cases"
> > 
> > There is a bit of a race condition on
> initialisation, *across all 
> > threadlocal classes in use in that thread*. if you
> are only using your 
> > own classes, you need to sync off something common
> (like the current 
> > thread). But you are still vulnerable to any other
> class using Thread 
> > local storage making an operation that increases
> the size of the hash 
> > table of threadlocal values, in which case you are
> stuffed.
> > 
> > This is a WONTFIX for Java<1.6.
> > 
> > Accordingly, you can't reliably use ThreadLocal on
> a multicpu system for 
> > Java <=1.5 as you cannot be sure that other
> classes wont stamp on you. 
> > This is pretty serious, as the reason you would
> use TLS is to avoid 
> > concurrency problems.
> 
> followup based on looking at the code.

Which code, Steve?

Thanks,
Matt

> 
> Its not a race condition, so much as a re-entrancy
> bug. the code here is 
> thread safe, but doesnt lock out the per-thread
> hashmap  while 
> initialising a ThreadLocal instance. If the thing
> being initialised is 
> subclassed to make a complex initial value, and that
> initial value 
> itself creates ThreadLocal content, then you are in
> trouble. If you dont 
> do that, you should be ok.
> 
> This is probably not the cause of the looping I saw.
> 
> -steve
> 
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> dev-unsubscribe@ant.apache.org
> For additional commands, e-mail:
> dev-help@ant.apache.org
> 
> 



       
____________________________________________________________________________________
Pinpoint customers who are looking for what you sell. 
http://searchmarketing.yahoo.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


Re: race conditions in Ant

Posted by Steve Loughran <st...@apache.org>.
Steve Loughran wrote:
> 
> 
> I've been running the new build and havent seen any more loops; I think 
> race conditions are gone.
> 
> Incidentally, given we didnt see any way that the thing could loop, 
> given we were using threadlocal to store a per-thread datastructure, and 
> given that threadlocals are implicitly thread safe, I'm still not sure 
> where the problem arose. but I have been pointed at some bugs in 
> ThreadLocal
> 
> 
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6550283
> "ThreadLocal.initialValue() may be called multiple times in some cases"
> 
> There is a bit of a race condition on initialisation, *across all 
> threadlocal classes in use in that thread*. if you are only using your 
> own classes, you need to sync off something common (like the current 
> thread). But you are still vulnerable to any other class using Thread 
> local storage making an operation that increases the size of the hash 
> table of threadlocal values, in which case you are stuffed.
> 
> This is a WONTFIX for Java<1.6.
> 
> Accordingly, you can't reliably use ThreadLocal on a multicpu system for 
> Java <=1.5 as you cannot be sure that other classes wont stamp on you. 
> This is pretty serious, as the reason you would use TLS is to avoid 
> concurrency problems.

followup based on looking at the code.

Its not a race condition, so much as a re-entrancy bug. the code here is 
thread safe, but doesnt lock out the per-thread hashmap  while 
initialising a ThreadLocal instance. If the thing being initialised is 
subclassed to make a complex initial value, and that initial value 
itself creates ThreadLocal content, then you are in trouble. If you dont 
do that, you should be ok.

This is probably not the cause of the looping I saw.

-steve


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org