Posted to dev@harmony.apache.org by Nathan Beyer <nd...@apache.org> on 2007/04/06 20:56:47 UTC

[native] mfence alternative WAS: [general] What platforms do we support?

I found this interesting article about the Linux kernel's approach to
using SSE2 fence operations [1] when available. Perhaps we can use a
similar technique to implement the memory barriers in the static
native code.

Does anyone have thoughts or comments about this? Can this technique be
ported to Windows platforms?
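
For illustration, here is a minimal sketch of the run-time selection
idea in C. It does not patch code the way the kernel's alternatives
mechanism does; it swaps in a simpler technique, a function pointer
chosen once at startup. The names barrier_mfence, barrier_locked_add,
cpu_has_sse2 and select_full_barrier are hypothetical, and the sketch
assumes GCC or Clang on x86 (other targets fall back to
__sync_synchronize).

```c
#include <assert.h>
#include <stddef.h>

typedef void (*barrier_fn)(void);

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* SSE2 full fence; must only run on CPUs that report SSE2. */
static void barrier_mfence(void)
{
    __asm__ __volatile__("mfence" ::: "memory");
}

/* Pre-SSE2 fallback: a locked no-op add on the stack orders ordinary
 * cached loads and stores and is available on a P3. */
static void barrier_locked_add(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#else
    __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#endif
}

/* Ask the compiler's CPU-detection builtins whether SSE2 is present. */
static int cpu_has_sse2(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
}

#else

/* Non-x86 or non-GNU builds: let the compiler emit a full barrier. */
static void barrier_mfence(void)     { __sync_synchronize(); }
static void barrier_locked_add(void) { __sync_synchronize(); }
static int cpu_has_sse2(void)        { return 0; }

#endif

/* Barrier used by the static native code; set once at startup. */
static barrier_fn full_barrier = NULL;

/* Pick the barrier for this CPU and remember it in full_barrier. */
static barrier_fn select_full_barrier(void)
{
    full_barrier = cpu_has_sse2() ? barrier_mfence : barrier_locked_add;
    assert(full_barrier != NULL);
    return full_barrier;
}
```

A caller would run select_full_barrier() once during VM startup and
then call full_barrier() wherever a fence is needed. The indirect call
costs more than a patched-in instruction, which is the trade-off
against the kernel's approach.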

-Nathan

[1] http://lwn.net/Articles/164121/

---------- Forwarded message ----------
From: Nathan Beyer <nd...@apache.org>
Date: Apr 5, 2007 7:09 PM
Subject: Re: [general] What platforms do we support?
To: dev@harmony.apache.org


On 4/5/07, Rana Dasgupta <rd...@gmail.com> wrote:
> On 4/5/07, Gregory Shimansky <gs...@gmail.com> wrote:
> > On Thursday 05 April 2007 02:25 Nathan Beyer wrote:
> > > Yes, I believe what we want to say is: code to the lowest common
> > > instruction set for static code in the VM, at least for each distinct
> > > architecture (x86 32-bit, IPF, etc.). For x86 32-bit, the
> > > instructions used must be available on at least a P3.
> >
> > I don't quite understand why code patching doesn't help here. Classlib hythr
> > code does code patching to remove lock prefixes from some instructions (see
> > files in modules/portlib/src/main/native/thread/windows/windows.x86 and
> > modules/portlib/src/main/native/common/windows/lock386.c) on uniprocessors;
> > the same could be done to patch away mfence and sfence.
> >
> >
>
> Technically, there is no reason why it cannot be dynamically patched
> as you describe. But if tomorrow there is a new helper that uses a
> fast platform library for memcpy that uses some SSE2 (as, e.g., AMD's
> fast memcpy does), you will need to patch that too. If we want to
> support PIII, probably the simplest way is to restrict ourselves to
> the PIII instruction set and only let the JIT detect the platform and
> generate more advanced instruction sequences. There is no performance
> loss in this because, in any case, we want the applications to spend
> almost all their time in jitted code.
>

This would be the approach I think is most appropriate. We want to
minimize the static native code anyway. Additionally, just coding to
the lowest common instruction set would keep the code much cleaner and
easier to maintain.

-Nathan

Re: [native] mfence alternative WAS: [general] What platforms do we support?

Posted by Gregory Shimansky <gs...@gmail.com>.

On Friday 06 April 2007 22:56 Nathan Beyer wrote:
> I found this interesting article about the Linux kernel's approach to
> using SSE2 fence operations [1] when available. Perhaps we can use a
> similar technique to implement the memory barriers in the static
> native code.
>
> Does anyone have thoughts or comments about this? Can this technique be
> ported to Windows platforms?

I was thinking about something like this, probably not as powerful, but in any
case I think it requires parsing ELF (or COFF on Windows) to find the
addresses where every alternative symbol has been placed in memory. In the
DRLVM case the additional problem is that the process consists of multiple
relocated objects such as harmonyvm, hythr and others.
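
As a rough sketch of one way the site addresses could be collected
without an object-file parser: with a GNU toolchain on ELF, each site
can record its address in a named section, and GNU ld provides
__start_<section> and __stop_<section> symbols that bound it. This is
an assumption about the toolchain, not something DRLVM does today. It
retargets function-pointer slots rather than patching instructions,
and it does not address COFF or the multiple-object problem. The
section name barrier_sites and the names gc_barrier, monitor_barrier
and retarget_barrier_sites are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*barrier_fn)(void);

/* Portable stand-ins for two hypothetical barrier variants. */
static void barrier_generic(void) { __sync_synchronize(); }
static void barrier_fast(void)    { __sync_synchronize(); }

/* Declare a call-site slot and record its address in the
 * "barrier_sites" section (hypothetical name). */
#define BARRIER_SITE(name)                                        \
    static barrier_fn name = barrier_generic;                     \
    static barrier_fn *name##_entry                               \
        __attribute__((used, section("barrier_sites"))) = &name

/* GNU ld defines these for sections named like C identifiers. */
extern barrier_fn *__start_barrier_sites[];
extern barrier_fn *__stop_barrier_sites[];

/* Two example sites (hypothetical names). */
BARRIER_SITE(gc_barrier);
BARRIER_SITE(monitor_barrier);

/* Point every recorded slot at impl; returns the number updated. */
static size_t retarget_barrier_sites(barrier_fn impl)
{
    size_t n = 0;
    barrier_fn **p;

    assert(impl != NULL);
    for (p = __start_barrier_sites; p < __stop_barrier_sites; ++p) {
        **p = impl;   /* *p is the address of one site's slot */
        ++n;
    }
    return n;
}
```

After startup code calls retarget_barrier_sites(barrier_fast), each
site such as gc_barrier() dispatches to the chosen variant.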

Anyway, I think we can just get rid of mfence at this point, as I wrote
in the "What platforms do we support?" thread, since it seems that it is
simply not needed.


-- 
Gregory