Posted to dev@lucy.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2013/10/10 06:03:38 UTC

[lucy-dev] Extensibility of Clownfish runtime core

On Sat, Sep 14, 2013 at 12:29 PM,  <nw...@apache.org> wrote:
> Replace *_Dec_RefCount with DECREF

> Project: http://git-wip-us.apache.org/repos/asf/lucy/repo
> Commit: http://git-wip-us.apache.org/repos/asf/lucy/commit/6f48dc35
> Tree: http://git-wip-us.apache.org/repos/asf/lucy/tree/6f48dc35
> Diff: http://git-wip-us.apache.org/repos/asf/lucy/diff/6f48dc35

This change reminds me of Python, where you call `str(obj)` instead of
invoking `obj.__str__()` directly.  It's an elegant system, and it might be
nice to borrow some other aspects of it.

    $ python3
    Python 3.3.0 (default, Oct  3 2012, 17:10:41)
    [GCC 4.2.1 Compatible Apple Clang 4.0 ((tags/Apple/clang-421.0.60))] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from pprint import pprint
    >>> obj = object()
    >>> pprint(dir(obj))
    ['__class__',
     '__delattr__',
     '__dir__',
     '__doc__',
     '__eq__',
     '__format__',
     '__ge__',
     '__getattribute__',
     '__gt__',
     '__hash__',
     '__init__',
     '__le__',
     '__lt__',
     '__ne__',
     '__new__',
     '__reduce__',
     '__reduce_ex__',
     '__repr__',
     '__setattr__',
     '__sizeof__',
     '__str__',
     '__subclasshook__']
    >>>

For what it's worth, DECREF and INCREF perform NULL-checks, while invoking the
methods directly does not.  In all the cases where Dec_RefCount() or
Inc_RefCount() had been invoked directly, the objects were guaranteed to be
non-NULL.  However, the cost of the extra NULL check is surely negligible and
so there's no harm in standardizing on DECREF and INCREF.
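
To illustrate the difference, here is a minimal C sketch.  The struct layout
and the names `Obj_new`, `Obj_inc_refcount`, `Obj_dec_refcount`, `incref` and
`decref` are hypothetical stand-ins chosen for this example; they are not the
actual Clownfish definitions.  The only point is that the wrappers tolerate
NULL while the direct calls do not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object; not the real layout. */
typedef struct Obj {
    size_t refcount;
} Obj;

/* Allocate an object that starts with one reference. */
static Obj *Obj_new(void) {
    Obj *obj = malloc(sizeof(Obj));
    if (obj != NULL) {
        obj->refcount = 1;
    }
    return obj;
}

/* Direct calls: the caller must guarantee that obj is non-NULL. */
static Obj *Obj_inc_refcount(Obj *obj) {
    obj->refcount++;
    return obj;
}

static size_t Obj_dec_refcount(Obj *obj) {
    assert(obj->refcount > 0);
    size_t remaining = --obj->refcount;
    if (remaining == 0) {
        free(obj);
    }
    return remaining;
}

/* NULL-tolerant wrappers, in the spirit of INCREF and DECREF. */
static Obj *incref(Obj *obj) {
    return obj != NULL ? Obj_inc_refcount(obj) : NULL;
}

static size_t decref(Obj *obj) {
    return obj != NULL ? Obj_dec_refcount(obj) : 0;
}
```

With these definitions, `decref(NULL)` is a harmless no-op that returns 0,
whereas `Obj_dec_refcount(NULL)` would dereference a NULL pointer.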

Something to consider is that there's little value in making the Clownfish
core runtime classes extensible.  Lucy subclasses Hash, but it's a nasty hack
and we could have just written our own hash table implementation from scratch
anyway.

What do you think of making all the core runtime classes final except for Obj?
There are advantages to limiting ourselves to a single carefully designed
extension point.

Marvin Humphrey

Re: [lucy-dev] Extensibility of Clownfish runtime core

Posted by Nick Wellnhofer <we...@aevum.de>.
On 27/12/2013 04:28, Marvin Humphrey wrote:
> Let's consider the costs and benefits of extendability for String.
>
> Allowing the String class to be extended enables certain kinds of problem
> solving approaches.  For example, the build tool Rake monkey patches the
> Ruby String class to add the `ext` method, a feature which we use in our
> Rakefiles:
>
>      http://rake.rubyforge.org/classes/String.html
>
>      filepath_o = filepath_c.ext(".o")
>
> What we're contemplating for Clownfish is actually less powerful than what
> Ruby facilitates, because we'd only allow subclassed instances to run the
> additional methods, while Ruby allows you to add methods to the original
> class.  In any case, I don't think that use case is very compelling because
> a perfectly plausible alternative is available: instead of extending string
> with new instance methods, provide the same functionality via inert
> functions in string manipulation libraries.
>
>      filepath_o = MyPathTools.ext(filepath_c, ".o")

Yes, that should be good enough.

> The problem we're up against is that Clownfish does some inherently
> ambitious things with object creation and destruction.  It has to be
> compatible with the memory management regimes of multiple host languages.
> It has to provide conversion routines to and from the host language data
> types which operate at the C/host boundary.  Round-tripping through the
> conversion routines is already complicated enough, because the mapping from
> Clownfish data types to host data types is always imperfect.  I don't see
> how you implement a sane conversion routine which preserves subclassing for
> core data types.

I agree that this wouldn't work. Subclassing is also restricted by the 
fact that we don't allow access to instance variables from other parcels.

But extending the core classes can still be useful at the C level. Just 
look at NoCloneHash and ZombieKeyedHash.

> So, if we're going case-by-case, I'd advocate that we work hard to design
> the following types for extensibility by users...
>
>      Obj
>      Err
>
> ... and work to close everything else off.
>
> Marvin Humphrey
>

+0 for making the core classes final.

We shouldn't encourage users to extend the core classes but it might 
sometimes be useful if you know what you're doing.

Nick

Re: [lucy-dev] Extensibility of Clownfish runtime core

Posted by Marvin Humphrey <ma...@rectangular.com>.
Getting back to this issue...

On Wed, Oct 16, 2013 at 7:44 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> On 10/10/2013 06:03, Marvin Humphrey wrote:
>> Something to consider is that there's little value in making the Clownfish
>> core runtime classes extensible.  Lucy subclasses Hash, but it's a nasty hack
>> and we could have just written our own hash table implementation from scratch
>> anyway.
>>
>> What do you think of making all the core runtime classes final except for Obj?
>> There are advantages to limiting ourselves to a single carefully designed
>> extension point.
>
> This should probably be decided on a case-by-case basis. In
> Clownfish::String I would make every method final but it might still be
> useful to inherit from String and add new methods.

Let's consider the costs and benefits of extendability for String.

Allowing the String class to be extended enables certain kinds of problem
solving approaches.  For example, the build tool Rake monkey patches the
Ruby String class to add the `ext` method, a feature which we use in our
Rakefiles:

    http://rake.rubyforge.org/classes/String.html

    filepath_o = filepath_c.ext(".o")

What we're contemplating for Clownfish is actually less powerful than what
Ruby facilitates, because we'd only allow subclassed instances to run the
additional methods, while Ruby allows you to add methods to the original
class.  In any case, I don't think that use case is very compelling because
a perfectly plausible alternative is available: instead of extending string
with new instance methods, provide the same functionality via inert
functions in string manipulation libraries.

    filepath_o = MyPathTools.ext(filepath_c, ".o")
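
At the C level, such an inert function might look like the sketch below.
The name `path_tools_ext` and its exact behavior are assumptions made for
illustration (it works on plain C strings, treats a leading dot in the file
name as part of the name rather than an extension, and is not meant to
reproduce Rake's `ext` precisely).  Nothing here requires subclassing a
string type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of `path` with its
 * file extension replaced by `new_ext`, or with `new_ext` appended if the
 * file name has no extension.  The caller frees the result.
 * Returns NULL if allocation fails. */
static char *path_tools_ext(const char *path, const char *new_ext) {
    assert(path != NULL && new_ext != NULL);

    /* Only look for a dot within the final path component. */
    const char *slash = strrchr(path, '/');
    const char *base  = slash != NULL ? slash + 1 : path;
    const char *dot   = strrchr(base, '.');

    /* A dot at the start of the name (".profile") is not an extension. */
    size_t stem_len = (dot != NULL && dot != base)
                      ? (size_t)(dot - path)
                      : strlen(path);
    size_t ext_len  = strlen(new_ext);

    char *result = malloc(stem_len + ext_len + 1);
    if (result == NULL) {
        return NULL;
    }
    memcpy(result, path, stem_len);
    memcpy(result + stem_len, new_ext, ext_len + 1); /* copies the NUL too */
    return result;
}
```

A call such as `path_tools_ext(filepath_c, ".o")` then plays the role of
`filepath_c.ext(".o")` without adding any method to the string class.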

Another possible use case for extending String might be to allow a subclass
to add a marker property, as contemplated in this blog post:

    http://www.nofluffjuststuff.com/blog/joe_walker/2007/04/java_7_idea_extensible_strings

However, adding markers is made more difficult by a core feature of
Clownfish: automatic conversion of basic data types at the C/host border.
Since you can't count on your subclass being preserved through that
conversion process, the marker is easily lost.

Consider the principle that Joshua Bloch articulates in his book, _Effective
Java_:

    Item 17: Design and document for inheritance or else prohibit it

    Item 16 alerted you to the dangers of subclassing a “foreign” class that
    was not designed and documented for inheritance. So what does it mean for
    a class to be designed and documented for inheritance?

    First, the class must document precisely the effects of overriding any
    method. In other words, the class must document its self-use of
    overridable methods. For each public or protected method or constructor,
    the documentation must indicate which overridable methods the method or
    constructor invokes, in what sequence, and how the results of each
    invocation affect subsequent processing. (By overridable, we mean nonfinal
    and either public or protected.) More generally, a class must document any
    circumstances under which it might invoke an overridable method. For
    example, invocations might come from background threads or static
    initializers.
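
The "self-use" hazard described in that passage can be reproduced in C with
a hand-rolled vtable.  The sketch below is an illustration only: `Counter`,
its vtable, and all function names are hypothetical and have nothing to do
with the Clownfish sources.  The base `add_all` is implemented in terms of
the overridable `add`, and a "subclass" that overrides both in order to
count elements ends up counting each one twice.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Counter Counter;

/* Hypothetical vtable with two overridable methods. */
typedef struct {
    void (*add)(Counter *self, int value);
    void (*add_all)(Counter *self, const int *values, size_t n);
} CounterVTable;

struct Counter {
    const CounterVTable *vtable;
    int    sum;
    size_t adds_seen;  /* maintained only by the counting "subclass" */
};

static void base_add(Counter *self, int value) {
    self->sum += value;
}

/* Undocumented self-use: add_all dispatches to the overridable add. */
static void base_add_all(Counter *self, const int *values, size_t n) {
    for (size_t i = 0; i < n; i++) {
        self->vtable->add(self, values[i]);
    }
}

/* The "subclass" overrides both methods to count added elements. */
static void counting_add(Counter *self, int value) {
    self->adds_seen++;
    base_add(self, value);
}

static void counting_add_all(Counter *self, const int *values, size_t n) {
    self->adds_seen += n;
    base_add_all(self, values, n);  /* dispatches back into counting_add */
}

static const CounterVTable BASE_VTABLE     = { base_add, base_add_all };
static const CounterVTable COUNTING_VTABLE = { counting_add, counting_add_all };
```

Adding three values through `counting_add_all` leaves `adds_seen` at 6
rather than 3.  The subclass author can only avoid this by knowing that the
base `add_all` calls `add`, which is exactly the kind of detail the passage
says must be documented.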

The problem we're up against is that Clownfish does some inherently
ambitious things with object creation and destruction.  It has to be
compatible with the memory management regimes of multiple host languages.
It has to provide conversion routines to and from the host language data
types which operate at the C/host boundary.  Round-tripping through the
conversion routines is already complicated enough, because the mapping from
Clownfish data types to host data types is always imperfect.  I don't see
how you implement a sane conversion routine which preserves subclassing for
core data types.

It's hard to "design and document for inheritance" under the best of
circumstances.  Other object systems close off core data types, notably
Java, so it's perfectly legitimate for Clownfish to do so as well.  While I
can see why one might appreciate Ruby-style extensibility, I think
prohibiting it for most core data types is a more appropriate choice for
Clownfish.  Not only do we save ourselves the time and trouble it would take
to implement extensibility well in the first place, the simplification
allows us to pursue Clownfish's distinguishing features more effectively.

So, if we're going case-by-case, I'd advocate that we work hard to design
the following types for extensibility by users...

    Obj
    Err

... and work to close everything else off.

Marvin Humphrey

Re: [lucy-dev] Extensibility of Clownfish runtime core

Posted by Nick Wellnhofer <we...@aevum.de>.
On 10/10/2013 06:03, Marvin Humphrey wrote:
> Something to consider is that there's little value in making the Clownfish
> core runtime classes extensible.  Lucy subclasses Hash, but it's a nasty hack
> and we could have just written our own hash table implementation from scratch
> anyway.
>
> What do you think of making all the core runtime classes final except for Obj?
> There are advantages to limiting ourselves to a single carefully designed
> extension point.

This should probably be decided on a case-by-case basis. In 
Clownfish::String I would make every method final but it might still be 
useful to inherit from String and add new methods.

Nick