You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucy.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2013/07/14 02:22:11 UTC

[lucy-dev] Member vars fragile ABI fix

Greets,

The IVARS fix that Nick proposed for the fragile ABI problem posed by member
variables has been implemented, on branch ivars-wip1.

If you define an instantiable `class Foo` with the parcel "Stuff", Clownfish
will generate a `stuff_FooIVARS` struct definition and a `stuff_Foo_IVARS`
inline function to get at it.

    Foo*
    Foo_init(Foo *self, int count, float score) {
        FooIVARS *const ivars = Foo_IVARS(self);
        ivars->count = count;
        ivars->score = score;
        return self;
    }

    int
    Foo_get_count(Foo *self) {
        return Foo_IVARS(self)->count;
    }

Implementation detail: the inline function adds the value of a per-class,
parcel-scoped `size_t` variable named e.g. `stuff_Foo_IVARS_OFFSET` to the top
of the object.

Member variables are now parcel scoped.  With regards to visibility of member
vars within a parcel, I made a judgment call that the IVARS structs for
subclasses would include variables declared in superclasses if those
superclasses are in the same parcel.  For example: the TermQueryIVARS struct
definition includes the `boost` member that is declared in Query.

Classes within the Clownfish runtime still have members defined within their
object structs and do not use IVARS.  I'm of two minds as to whether we should
go and update them to use IVARS for the sake of consistency, pound-defining
the IVARS_OFFSET symbols as integer constants by default but allowing for the
possibility of an override.  I'm not sure there's any use case where we'd need
a variable-sized object head.

Marvin Humphrey

Re: [lucy-dev] Member vars fragile ABI fix

Posted by Nick Wellnhofer <we...@aevum.de>.
On 16/07/2013 03:47, Marvin Humphrey wrote:
> What's interesting about those results is that on the modern CPU the
> micro-benchmark yields essentially identical results for C++ style fixed
> offset vtable dispatch, Clownfish-style variable offset vtable dispatch, or a
> saved raw function pointer, while on the older CPU the Clownfish-style
> variable offset dispatch performs slightly worse.

32 vs 64 bit might also play a role here.

>> OK, but this will mean to add MethodSpec structs for every method of a
>> class. I think it's best to use separate structs for novel, overridden, and
>> inherited methods then.
>
> I realize that may be a lot of code but until load-time latency becomes a
> problem, I think it's an acceptable implementation strategy.
>
> Would you like to work on this, or would you like me to take it on?

It won't be much additional code. I'll work on it.

Nick


Re: [lucy-dev] Member vars fragile ABI fix

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Jul 15, 2013 at 4:50 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> Micro-benchmarks are always a bit dangerous, but it seems that on modern
> CPUs our current implementation of method dispatch is really fast and
> probably hard to beat. I think a loop with a single method call took about
> 6-7 cycles per iteration (Ivy Bridge, x64). This is surprising since
> computing the method address alone requires three memory loads with a
> latency of 4-5 cycles each (the third load depending on the first two). But
> because of branch prediction, the CPU can take a speculative branch
> immediately without having to wait for the memory loads. The branch target
> is validated later, so the loads can be pipelined.

Here's an Intel Xeon E5430 from a few years ago running 32-bit CentOS 5.0:

    http://en.wikipedia.org/wiki/Xeon#5400-series_.22Harpertown.22

    $ make -f Makefile.linux
    LD_LIBRARY_PATH=. ./exe
    cycles/call with method ptr loop: 11.226522
    cycles/call with wrapper loop: 14.781987
    cycles/call with fixed offset wrapper loop: 10.538704
    cycles/call with wrapper: 19.622476
    cycles/call with simulated inline: 7.702859

Here's a more recent Intel Xeon E5620 running 64-bit CentOS 5.5:

    http://en.wikipedia.org/wiki/Xeon#3600.2F5600-series_.22Gulftown.22_.26_.22Westmere-EP.22

    $ make -f Makefile.linux
    LD_LIBRARY_PATH=. ./exe
    cycles/call with method ptr loop: 7.014678
    cycles/call with wrapper loop: 7.016887
    cycles/call with fixed offset wrapper loop: 7.014423
    cycles/call with wrapper: 10.520168
    cycles/call with simulated inline: 2.339327

What's interesting about those results is that on the modern CPU the
micro-benchmark yields essentially identical results for C++ style fixed
offset vtable dispatch, Clownfish-style variable offset vtable dispatch, or a
saved raw function pointer, while on the older CPU the Clownfish-style
variable offset dispatch performs slightly worse.

For what it's worth, the Clownfish "inside-out vtable" design is similar to
the techniques described by Dachuan Yu et al in 2002 -- benchmarks here:

  https://www.usenix.org/legacy/events/javavm02/yu/yu_html/node29.html

> My guess is that most of the time this happens, there will be another
> non-compatible API change anyway.

I would argue that there is still a large benefit in user interface simplicity
by making it impossible to break the ABI without also breaking the API.
Eliminating this last quirk is a big deal because it substantially reduces the
knowledge and mental effort required to write ABI-compatible code.

> OK, but this will mean to add MethodSpec structs for every method of a
> class. I think it's best to use separate structs for novel, overridden, and
> inherited methods then.

I realize that may be a lot of code but until load-time latency becomes a
problem, I think it's an acceptable implementation strategy.

Would you like to work on this, or would you like me to take it on?

Marvin Humphrey

Re: [lucy-dev] Member vars fragile ABI fix

Posted by Nick Wellnhofer <we...@aevum.de>.
On 14/07/2013 21:27, Marvin Humphrey wrote:
> I'm still tempted to do it for the sake of consistency.  It would be nice to
> be able to say "Clownfish object struct types are completely opaque.  Instance
> variables may only be accessed via IVARS."

We should say that anyway. Even if it is possible to get at the struct 
members of a Clownfish superclass, no other parcel should ever try that.

> Haha, I feel your irritation.  We've done all that work on thunks, which was
> largely motivated by the space issue!  By the way, your work on the thunk
> benchmarking has never come up on the dev list (branch
> `method-dispatch-benchmark`).  It was very illuminating!

Micro-benchmarks are always a bit dangerous, but it seems that on modern 
CPUs our current implementation of method dispatch is really fast and 
probably hard to beat. I think a loop with a single method call took 
about 6-7 cycles per iteration (Ivy Bridge, x64). This is surprising 
since computing the method address alone requires three memory loads 
with a latency of 4-5 cycles each (the third load depending on the first 
two). But because of branch prediction, the CPU can take a speculative 
branch immediately without having to wait for the memory loads. The 
branch target is validated later, so the loads can be pipelined.

In my experiments, precomputing the method address only saved a single 
cycle and sometimes wasn't faster at all. This is partly due to loop 
hoisting which might not be possible in real-life situations, but still 
surprising.

> I don't think the use case of moving a novel method to a superclass is all
> that rare.  It may not happen every day, but I'll bet it will happen multiple
> times over the lifetime of any project of significant size.

My guess is that most of the time this happens, there will be another 
non-compatible API change anyway.

> I think it's important that we not burden developers with having to understand
> that kind of edge case detail.  IMO, we have some serious work to do paring
> down the existing Clownfish header language, removing some experimental
> misfeatures that are no longer justified, and we should be reluctant to add
> any API complexity right now.  I think it's premature to compromise the API
> for the sake of a space issue which probably won't affect performance.  If
> you're amenable, can we please go with what we have now, with the
> understanding that we'll keep trying to find better implementations in the
> future?

OK, but this will mean to add MethodSpec structs for every method of a 
class. I think it's best to use separate structs for novel, overridden, 
and inherited methods then.

Nick


Re: [lucy-dev] Member vars fragile ABI fix

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Sun, Jul 14, 2013 at 4:41 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> On Jul 14, 2013, at 02:22 , Marvin Humphrey <ma...@rectangular.com> wrote:
>> Classes within the Clownfish runtime still have members defined within
>> their object structs and do not use IVARS.  I'm of two minds as to whether
>> we should go and update them to use IVARS for the sake of consistency,
>> pound-defining the IVARS_OFFSET symbols as integer constants by default but
>> allowing for the possibility of an override.  I'm not sure there's any use
>> case where we'd need a variable-sized object head.
>
> I can't think of a use case neither.

Yeah.  It would theoretically remove a barrier to using the same runtime
shared object with multiple hosts, but I don't think we ever want to do that.
Among other things, a full implementation of that would require changing our
dynamic method invocation functions to locate the vtable pointer at a variable
offset rather than an offset that's fixed at compile time.  And of course we
want the host to have full control over installation, so that you can upgrade
Clownfish in the Python 3 install tree and not affect your Perl install.

I'm still tempted to do it for the sake of consistency.  It would be nice to
be able to say "Clownfish object struct types are completely opaque.  Instance
variables may only be accessed via IVARS."

> Now it's time to finish the dynamic method offsets. I held it off because I
> hoped we could find a way to avoid the need for offset vars for every
> method. Having a single offset var per novel method would simplify the
> initialization of method offsets. I think the only downside is the thing
> with moving novel methods to a superclass. Maybe we should simply require
> manual intervention to support this use case? It should be rarely needed in
> practice.

Haha, I feel your irritation.  We've done all that work on thunks, which was
largely motivated by the space issue!  By the way, your work on the thunk
benchmarking has never come up on the dev list (branch
`method-dispatch-benchmark`).  It was very illuminating!

I don't think the use case of moving a novel method to a superclass is all
that rare.  It may not happen every day, but I'll bet it will happen multiple
times over the lifetime of any project of significant size.

I think it's important that we not burden developers with having to understand
that kind of edge case detail.  IMO, we have some serious work to do paring
down the existing Clownfish header language, removing some experimental
misfeatures that are no longer justified, and we should be reluctant to add
any API complexity right now.  I think it's premature to compromise the API
for the sake of a space issue which probably won't affect performance.  If
you're amenable, can we please go with what we have now, with the
understanding that we'll keep trying to find better implementations in the
future?

Marvin Humphrey

Re: [lucy-dev] Member vars fragile ABI fix

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jul 14, 2013, at 02:22 , Marvin Humphrey <ma...@rectangular.com> wrote:

> The IVARS fix that Nick proposed for the fragile ABI problem posed by member
> variables has been implemented, on branch ivars-wip1.

Great stuff!

> Member variables are now parcel scoped.  With regards to visibility of member
> vars within a parcel, I made a judgment call that the IVARS structs for
> subclasses would include variables declared in superclasses if those
> superclasses are in the same parcel.  For example: the TermQueryIVARS struct
> definition includes the `boost` member that is declared in Query.

+1, this should safe a couple of accesses to IVARS structs.

> Classes within the Clownfish runtime still have members defined within their
> object structs and do not use IVARS.  I'm of two minds as to whether we should
> go and update them to use IVARS for the sake of consistency, pound-defining
> the IVARS_OFFSET symbols as integer constants by default but allowing for the
> possibility of an override.  I'm not sure there's any use case where we'd need
> a variable-sized object head.

I can't think of a use case neither.

Now it's time to finish the dynamic method offsets. I held it off because I hoped we could find a way to avoid the need for offset vars for every method. Having a single offset var per novel method would simplify the initialization of method offsets. I think the only downside is the thing with moving novel methods to a superclass. Maybe we should simply require manual intervention to support this use case? It should be rarely needed in practice.

Nick