Posted to dev@harmony.apache.org by Mikhail Fursov <mi...@gmail.com> on 2006/10/20 12:15:02 UTC

[drlvm][threadmanager] Fast thread local data access

All,

The TM architecture requires any component that needs to store thread-local
data to reserve a slot:
extern HY_CFUNC IDATA VMCALL hythread_tls_alloc
PROTOTYPE((hythread_tls_key_t* handle));

The key can be used to get/set the allocated memory until it is freed.

The problem is that the JIT must know the final offset of that memory inside
the thread's TLS struct to emit optimized code. Examples include BBP
(back-branch polling), the escape analyzer, fast helpers...
We can solve this by adding the following method to the TM interface:

size_t hythread_tls_get_storage_offset(hythread_tls_key_t handle);

Is it OK?
If yes, may I do this, or does anyone else want to do it?
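
A minimal sketch of how such an offset query could work, assuming the TM keeps
a fixed array of TLS slots inside its per-thread structure. Every name with a
sketch_ prefix is hypothetical, including the struct layout and slot count;
this is not the actual hythread implementation.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_TLS_SLOTS 16          /* hypothetical slot count */

typedef size_t sketch_tls_key_t;     /* hypothetical stand-in for hythread_tls_key_t */

/* Hypothetical per-thread structure: a TM field followed by the TLS slots. */
typedef struct sketch_thread {
    int   suspend_request;
    void *tls[SKETCH_TLS_SLOTS];
} sketch_thread;

static size_t sketch_next_key = 0;

/* Reserve a slot; returns 0 on success, -1 if all slots are taken. */
static int sketch_tls_alloc(sketch_tls_key_t *handle) {
    if (sketch_next_key >= SKETCH_TLS_SLOTS) return -1;
    *handle = sketch_next_key++;
    return 0;
}

/* Byte offset of the slot for `key` from the start of the thread structure.
 * The value is constant once the key is allocated, so a JIT can embed it. */
static size_t sketch_tls_get_storage_offset(sketch_tls_key_t key) {
    return offsetof(sketch_thread, tls) + key * sizeof(void *);
}

/* What JIT-emitted code would effectively do: thread base + constant offset. */
static void **sketch_slot_at(sketch_thread *self, size_t offset) {
    assert(offset + sizeof(void *) <= sizeof(sketch_thread));
    return (void **)((char *)self + offset);
}
```

With this layout the offset is known as soon as the key is allocated, so the
compiled code never needs to call back into the TM to find its slot.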

-- 
Mikhail Fursov

Re: [drlvm][threadmanager] Fast thread local data access

Posted by Ivan Volosyuk <iv...@gmail.com>.
Good idea.

We can use it in the fast path of GC allocation: fast TLS base access
compiled into managed code, plus a constant, already-known offset of the
TLS fields used by the GC. This could be a significant speedup over the
current scheme. One thing to take care of is how the offsets of the TLS
values used by the GC are communicated between components.
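
A rough sketch of that fast path, assuming a bump-pointer allocator whose
"free" and "ceiling" pointers live at constant offsets from the TLS base. The
struct layout, field names and 8-byte alignment are assumptions made for
illustration, not the actual GC interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical GC-owned thread-local fields. */
typedef struct sketch_gc_tls {
    char *free;      /* next free byte in the thread-local allocation area */
    char *ceiling;   /* end of the thread-local allocation area */
} sketch_gc_tls;

/* Hypothetical per-thread structure embedding the GC fields. */
typedef struct sketch_thread {
    int           suspend_request;
    sketch_gc_tls gc;
} sketch_thread;

/* Constant offsets that the GC would have to communicate to the JIT. */
#define SKETCH_GC_FREE_OFFSET \
    (offsetof(sketch_thread, gc) + offsetof(sketch_gc_tls, free))
#define SKETCH_GC_CEILING_OFFSET \
    (offsetof(sketch_thread, gc) + offsetof(sketch_gc_tls, ceiling))

/* Bump allocation through base + constant offset, as inlined code would do.
 * Returns NULL when the area is exhausted (the caller takes the slow path). */
static void *sketch_fast_alloc(void *tls_base, size_t size) {
    assert(tls_base != NULL);
    char **free_p    = (char **)((char *)tls_base + SKETCH_GC_FREE_OFFSET);
    char **ceiling_p = (char **)((char *)tls_base + SKETCH_GC_CEILING_OFFSET);

    size = (size + 7) & ~(size_t)7;               /* round up to 8 bytes */
    if ((size_t)(*ceiling_p - *free_p) < size) return NULL;

    void *obj = *free_p;
    *free_p += size;
    return obj;
}
```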
--
Ivan


---------------------------------------------------------------------
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
For additional commands, e-mail: harmony-dev-help@incubator.apache.org


Re: [drlvm][threadmanager] Fast thread local data access

Posted by Mikhail Fursov <mi...@gmail.com>.
Artem, I was wrong. There is no need to answer the question about
hythread_get_tls_suspend_request_offset.
'suspend_request' is a TM term, so your proposal to call the TM from the JIT
was right.

-- 
Mikhail Fursov

Re: [drlvm][threadmanager] Fast thread local data access

Posted by Mikhail Fursov <mi...@gmail.com>.
Just a small update.
On 10/20/06, Mikhail Fursov <mi...@gmail.com> wrote:
>
> 1. Ok, it means that JIT reserves a key at startup
>

The VM and the JIT must share the key, so the VM allocates the key and shares
it with all JITs.
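
One way to picture this, as a sketch only: the VM allocates the key once at
startup and hands the same key to every JIT during initialization. All types
and functions below are hypothetical and stand in for whatever the real VM-JIT
interface provides.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t sketch_tls_key_t;   /* hypothetical stand-in for hythread_tls_key_t */

/* Hypothetical JIT instance that remembers the key the VM gave it. */
typedef struct sketch_jit {
    const char      *name;
    sketch_tls_key_t shared_key;
    int              initialized;
} sketch_jit;

static size_t sketch_next_key = 0;

/* Hypothetical allocator: hands out consecutive keys, returns 0 on success. */
static int sketch_tls_alloc(sketch_tls_key_t *handle) {
    *handle = sketch_next_key++;
    return 0;
}

/* JIT side: store the key received from the VM instead of allocating one. */
static void sketch_jit_init(sketch_jit *jit, sketch_tls_key_t key) {
    jit->shared_key  = key;
    jit->initialized = 1;
}

/* VM side: allocate the key once, then share it with every JIT. */
static sketch_tls_key_t sketch_vm_startup(sketch_jit *jits, size_t count) {
    sketch_tls_key_t key;
    int rc = sketch_tls_alloc(&key);
    assert(rc == 0);
    (void)rc;
    for (size_t i = 0; i < count; i++) {
        sketch_jit_init(&jits[i], key);
    }
    return key;
}
```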
-- 
Mikhail Fursov

Re: [drlvm][threadmanager] Fast thread local data access

Posted by Mikhail Fursov <mi...@gmail.com>.
On 10/20/06, Artem Aliev <ar...@gmail.com> wrote:
>
> It would also be good to put TLS usage in order:
>
> 1. Refactor HyThread->jit_private_data to be ordinary TLS data.
> 2. Add a hythread_get_tls_suspend_request_offset function for JIT
> back-branch polling.
> 3. Refactor vm_get_gc_thread_local() to use hythread TLS directly.
>
> I'm looking forward to a patch from you.


Artem, I need some clarifications:
1. OK, so this means the JIT reserves a key at startup.
2. Why do we need this function in the TM interface when we have a key? Can't
we request the constant offset for this key right after we allocate it?
3. Yes, the VM-JIT BBP interface will change slightly.


-- 
Mikhail Fursov

Re: [drlvm][threadmanager] Fast thread local data access

Posted by Slava Shakin <vy...@gmail.com>.
"Mikhail Fursov" <mi...@gmail.com> wrote in message 
news:bc79dd600610200656u460e5ddej799d726abd943362@mail.gmail.com...
> On 10/20/06, Slava Shakin <vy...@gmail.com> wrote:
>>
>> I think a generic encoding interface exposed by a component could return,
>> for a given helper, a mask of affected registers and a description of input
>> and return parameters, and the helper should not produce any other side
>> effects, including VM calls or exception throwing. BTW, implementing a
>> similar description mechanism for traditional VM helpers would also help
>> modularity. Instead of helpers, we could think about an extensible magics
>> framework where magics are expanded/encoded by the responsible components.
>
>
> Slava,
> The functionality you described could be reasonable only for very low-level,
> platform- and OS-specific magics such as TLS access.
> The unboxed Java magics are, IMO, much more flexible because they are
> converted to HIR by the JIT translator.

Yes, sure.

>
> -- 
> Mikhail Fursov
> 






Re: [drlvm][threadmanager] Fast thread local data access

Posted by Mikhail Fursov <mi...@gmail.com>.
On 10/20/06, Slava Shakin <vy...@gmail.com> wrote:
>
> I think a generic encoding interface exposed by a component could return,
> for a given helper, a mask of affected registers and a description of input
> and return parameters, and the helper should not produce any other side
> effects, including VM calls or exception throwing. BTW, implementing a
> similar description mechanism for traditional VM helpers would also help
> modularity. Instead of helpers, we could think about an extensible magics
> framework where magics are expanded/encoded by the responsible components.


Slava,
The functionality you described could be reasonable only for very low-level,
platform- and OS-specific magics such as TLS access.
The unboxed Java magics are, IMO, much more flexible because they are
converted to HIR by the JIT translator.

-- 
Mikhail Fursov

Re: [drlvm][threadmanager] Fast thread local data access

Posted by Salikh Zakirov <Sa...@Intel.com>.
Mikhail Fursov wrote:
> The patch with the results of the discussion is in
> http://issues.apache.org/jira/browse/HARMONY-1942
> Please review if interested.

Looks great!


Re: [drlvm][threadmanager] Fast thread local data access

Posted by Mikhail Fursov <mi...@gmail.com>.
The patch with the results of the discussion is in
http://issues.apache.org/jira/browse/HARMONY-1942
Please review if interested.


-- 
Mikhail Fursov

Re: [drlvm][threadmanager] Fast thread local data access

Posted by Slava Shakin <vy...@gmail.com>.
I like the proposal to delegate encoding to another responsible component
(TM in this case). Or shall we call it binary inlining of VM helpers?
It seems to fix the remaining minor modularity issues we failed to solve
with magics.

We could introduce something like TLSLoadMacroInst in CG, whose encoding
would be delegated to TM.

My minor concern is that TM imposes an additional register constraint (EAX)
on the TLSLoadMacroInst receiver operand, but this makes the generated code
only very slightly worse in the case of the fs:[14h] load, and all the
alternatives I thought about are more complicated.

I think a generic encoding interface exposed by a component could return,
for a given helper, a mask of affected registers and a description of input
and return parameters, and the helper should not produce any other side
effects, including VM calls or exception throwing. BTW, implementing a
similar description mechanism for traditional VM helpers would also help
modularity. Instead of helpers, we could think about an extensible magics
framework where magics are expanded/encoded by the responsible components.

Overall, we could probably start with the TLS-specific solution, provided
that gen_hythread_self_helper does not change any register other than EAX
on any OS.
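
To make the generic interface idea more concrete, here is a sketch of what a
helper descriptor might look like. The struct, the register mask constants
and the code-generator function are all hypothetical; only the six emitted
bytes are taken from the FS14_TLS_USE branch of gen_hythread_self_helper
quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register mask bits. */
enum {
    SKETCH_REG_EAX = 1 << 0,
    SKETCH_REG_ECX = 1 << 1,
    SKETCH_REG_EDX = 1 << 2
};

/* Hypothetical descriptor a component exposes for one inlinable helper. */
typedef struct sketch_helper_desc {
    uint32_t clobbered_regs;   /* registers the encoded code may change */
    uint32_t result_reg;       /* register holding the result */
    size_t   max_code_size;    /* upper bound on emitted bytes */
    unsigned char *(*encode)(unsigned char *out);  /* emits code, returns end */
} sketch_helper_desc;

/* TM side: emits mov eax, fs:[0x14] (bytes from the FS14_TLS_USE branch). */
static unsigned char *sketch_encode_tls_self(unsigned char *out) {
    static const unsigned char code[] = { 0x64, 0xa1, 0x14, 0x00, 0x00, 0x00 };
    memcpy(out, code, sizeof code);
    return out + sizeof code;
}

static const sketch_helper_desc sketch_tls_self_desc = {
    SKETCH_REG_EAX, SKETCH_REG_EAX, 6, sketch_encode_tls_self
};

/* CG side: inline the helper and report which live registers need saving.
 * Returns the number of bytes emitted, or 0 if the buffer is too small. */
static size_t sketch_cg_inline_helper(const sketch_helper_desc *desc,
                                      unsigned char *buf, size_t buf_size,
                                      uint32_t live_regs, uint32_t *must_save) {
    assert(desc != NULL && buf != NULL && must_save != NULL);
    if (buf_size < desc->max_code_size) return 0;
    *must_save = live_regs & desc->clobbered_regs;
    return (size_t)(desc->encode(buf) - buf);
}
```

The point of the descriptor is that the code generator only consults data
(masks and sizes) and never needs to know how the owning component encodes
the helper.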

--
Slava Shakin


Re: [drlvm][threadmanager] Fast thread local data access

Posted by Artem Aliev <ar...@gmail.com>.
Mikhail,

Right now the TM uses the VM helper approach for fast TLS access.
See vm/thread/src/thread_helpers.cpp:
fast_tls_func* get_tls_helper(hythread_tls_key_t key);

For performance, it looks reasonable to divide it into two parts,
get_tls_pointer_accesor() and get_tls_storage_offset().

The first one is already implemented but not yet added to the interface:

/**
 *  Generates a tmn_self() call.
 *  The generated code must not contain a safepoint.
 *  The generated code uses the eax register and does not restore it.
 *
 *  @return tm_self() in the eax register
 */
char* gen_hythread_self_helper(char *ss) {
#ifdef FS14_TLS_USE
    // Hand-encoded equivalent of:
    //ss = mov(ss,  eax_opnd,  M_Base_Opnd(fs_reg, 0x14));
    *ss++ = (char)0x64;  // FS segment-override prefix
    *ss++ = (char)0xa1;  // MOV EAX, moffs32
    *ss++ = (char)0x14;  // 32-bit address 0x00000014, little-endian
    *ss++ = (char)0x00;
    *ss++ = (char)0x00;
    *ss++ = (char)0x00;
#else
    ss = call(ss, (char *)hythread_self);
#endif
    return ss;
}
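
A sketch of how the two parts could combine on 32-bit x86: first load the
thread pointer into eax as above, then load the TLS slot at a constant
displacement from it. The function names and the 12-byte size check are
hypothetical; the bytes are only written to a buffer here, not executed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emits: mov eax, fs:[0x14]  (64 A1 14 00 00 00), as in the FS14_TLS_USE branch. */
static unsigned char *sketch_emit_thread_self(unsigned char *p) {
    *p++ = 0x64;                 /* FS segment-override prefix */
    *p++ = 0xA1;                 /* MOV EAX, moffs32 */
    *p++ = 0x14; *p++ = 0x00; *p++ = 0x00; *p++ = 0x00;
    return p;
}

/* Emits: mov eax, [eax + disp32]  (8B 80 followed by disp32, little-endian). */
static unsigned char *sketch_emit_load_eax_disp32(unsigned char *p, uint32_t disp) {
    *p++ = 0x8B;                 /* MOV r32, r/m32 */
    *p++ = 0x80;                 /* ModRM: mod=10 (disp32), reg=EAX, rm=EAX */
    for (int i = 0; i < 4; i++) {
        *p++ = (unsigned char)(disp >> (8 * i));
    }
    return p;
}

/* Pointer accessor + storage offset: leaves the TLS slot value in eax.
 * Returns the number of bytes written. */
static size_t sketch_emit_tls_load(unsigned char *buf, size_t buf_size,
                                   uint32_t storage_offset) {
    assert(buf_size >= 12);
    unsigned char *p = sketch_emit_thread_self(buf);
    p = sketch_emit_load_eax_disp32(p, storage_offset);
    return (size_t)(p - buf);
}
```

The storage offset is baked into the instruction stream as an immediate
displacement, which is why the JIT needs it to be constant after the key is
allocated.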


It would also be good to put TLS usage in order:

1. Refactor HyThread->jit_private_data to be ordinary TLS data.
2. Add a hythread_get_tls_suspend_request_offset function for JIT
back-branch polling.
3. Refactor vm_get_gc_thread_local() to use hythread TLS directly.
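
For item 2, a sketch of what back-branch polling against a known offset might
look like. The struct layout, the safepoint routine and the offset macro are
hypothetical; a real safepoint would block the thread until it is resumed
rather than simply clearing the flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread structure holding the TM's suspend flag. */
typedef struct sketch_thread {
    void        *tls[4];
    volatile int suspend_request;
} sketch_thread;

/* Stand-in for the value hythread_get_tls_suspend_request_offset would return. */
#define SKETCH_SUSPEND_REQUEST_OFFSET offsetof(sketch_thread, suspend_request)

static int sketch_safepoint_count = 0;

/* Simplified safepoint: counts the visit and acknowledges the request. */
static void sketch_safepoint(sketch_thread *self) {
    sketch_safepoint_count++;
    self->suspend_request = 0;
}

/* The check a JIT would emit on loop back edges: one load at a constant
 * offset from the thread pointer, plus a rarely taken branch.
 * Returns 1 if the safepoint was entered, 0 otherwise. */
static int sketch_back_branch_poll(sketch_thread *self) {
    assert(self != NULL);
    volatile int *flag =
        (volatile int *)((char *)self + SKETCH_SUSPEND_REQUEST_OFFSET);
    if (*flag == 0) return 0;
    sketch_safepoint(self);
    return 1;
}

/* A loop with the poll placed at its back edge. */
static long sketch_sum_with_polling(sketch_thread *self, int n) {
    long sum = 0;
    for (int i = 0; i < n; i++) {
        sum += i;
        (void)sketch_back_branch_poll(self);
    }
    return sum;
}
```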

I'm looking forward to a patch from you.

Thanks
Artem
