Posted to dev@lucy.apache.org by Nick Wellnhofer <we...@aevum.de> on 2017/01/02 16:43:36 UTC

Re: [lucy-dev] Clownfish interfaces

On 28/12/2016 03:39, Marvin Humphrey wrote:
> Ruminating on my biases... I have found "fat pointers" hard to get used to
> because, honestly, I'm accustomed to casting of objects and containers without
> reallocation,

Even if fat pointers are two-element structs, they can be passed by value. 
There's no need for dynamic allocation.
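
To illustrate, here is a minimal sketch of a two-word fat pointer in C. All
names (LengthITable, LengthRef, Blob) are made up for this example and are
not part of Clownfish. The point is only that building and passing the
struct copies two words and never touches the heap.

```c
#include <stddef.h>

/* Hypothetical interface table: one function pointer per interface method. */
typedef struct {
    size_t (*length)(void *self);
} LengthITable;

/* The fat pointer: an object pointer plus an interface table pointer. */
typedef struct {
    void               *obj;
    const LengthITable *itable;
} LengthRef;

/* Hypothetical concrete type that implements the interface. */
typedef struct {
    size_t size;
} Blob;

size_t
Blob_length(void *self) {
    return ((Blob*)self)->size;
}

static const LengthITable BLOB_LENGTH_ITABLE = { Blob_length };

/* Building the fat pointer fills in two words; nothing is allocated. */
LengthRef
Blob_as_length(Blob *blob) {
    LengthRef ref = { blob, &BLOB_LENGTH_ITABLE };
    return ref;
}

/* The fat pointer is received by value and dispatched through its table. */
size_t
Length_of(LengthRef ref) {
    return ref.itable->length(ref.obj);
}
```

A caller would write `Length_of(Blob_as_length(&blob))`, with the LengthRef
living in registers or on the stack.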

> which imposes the requirement that objects be pointers and that
> all pointers be the same size.  Being able to cast `Obj**` to `Query**`
> without cost or with only CPU cost for run-time type checking is something
> that seems very natural.
>
> By now, I've done enough Go programming that the construct isn't as foreign.
> Still, the constraint that native Clownfish objects are struct pointers seems
> reasonable.

I still like the idea of fat pointers as an additional optimization. But it 
would be confusing to require an asterisk for normal objects (Obj*) but not 
for interface objects.

> This overview should be helpful:
>
>     https://en.wikipedia.org/wiki/Java_performance

This page seems to indicate that HotSpot uses a linear search over an array:

     https://wiki.openjdk.java.net/display/HotSpot/InterfaceCalls

But this probably isn't a performance problem for JVMs because of inline caching.
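
Roughly, the combination looks like the following sketch. This is not
HotSpot's actual implementation, only an illustration of the idea; Klass,
ITableEntry and CallSiteCache are hypothetical names. The slow path scans the
class's interface array linearly, and a per-call-site cache skips the scan as
long as the receiver class stays the same.

```c
#include <stddef.h>

typedef struct {
    const char *name;
} Interface;

/* One entry per interface that a class implements. */
typedef struct {
    const Interface *iface;
    const void      *itable;
} ITableEntry;

typedef struct {
    const ITableEntry *entries;
    size_t             num_entries;
} Klass;

/* Slow path: linear search over the class's interface array. */
const void*
Klass_find_itable(const Klass *klass, const Interface *iface) {
    for (size_t i = 0; i < klass->num_entries; i++) {
        if (klass->entries[i].iface == iface) {
            return klass->entries[i].itable;
        }
    }
    return NULL;
}

/* One cache per call site. A given call site always asks for the same
 * interface, so caching on the receiver class alone is enough. */
typedef struct {
    const Klass   *klass;
    const void    *itable;
    unsigned long  misses;
} CallSiteCache;

/* Fast path: reuse the cached itable while the receiver class is unchanged. */
const void*
CallSite_lookup(CallSiteCache *cache, const Klass *klass,
                const Interface *iface) {
    if (cache->klass != klass) {
        cache->klass  = klass;
        cache->itable = Klass_find_itable(klass, iface);
        cache->misses++;
    }
    return cache->itable;
}
```

With a monomorphic call site, the linear search runs once and every later
call costs a single pointer comparison.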

> The first approach seems closer to what Clownfish does now.  I think that's
> what you're leaning towards, right?  +1

I'd prefer the second approach, but to accommodate languages like Rust, we'll
probably have to go with the first option.

> Regarding backwards compatibility for Lucy, there is a certain amount of
> functionality which currently requires subclassing and overriding of methods.
> Off the top of my head, QueryParser, IndexManager, Schema, FieldType and
> Similarity all have such methods.

We should start with a list of classes that we officially allow to be 
subclassed now and that need to be reworked. In addition to the ones you 
mentioned, there are Analyzer, Highlighter, and Query/Compiler/Matcher.

> A general technique to solve this problem is to use composition: instead of
> expecting people to subclass IndexManager and override Recycle(), create a
> SegmentRecycler interface and allow customization through supplying a custom
> SegmentRecycler to IndexManager.
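
In C terms, the composition approach could look something like the sketch
below. The struct layouts, the should_recycle method and the
Manager_count_recyclable helper are assumptions for illustration; they are
simplified stand-ins and not Lucy's real IndexManager API.

```c
#include <stddef.h>

/* Hypothetical interface: a policy object instead of an overridden method. */
typedef struct SegmentRecycler SegmentRecycler;
struct SegmentRecycler {
    /* Returns nonzero if a segment with `doc_count` docs should be merged. */
    int (*should_recycle)(SegmentRecycler *self, size_t doc_count);
};

/* Simplified stand-in for IndexManager: it holds a recycler rather than
 * expecting users to subclass it. */
typedef struct {
    SegmentRecycler *recycler;
} IndexManager;

/* Count the segments that the configured recycler selects. */
size_t
Manager_count_recyclable(IndexManager *manager, const size_t *doc_counts,
                         size_t num_segs) {
    size_t count = 0;
    for (size_t i = 0; i < num_segs; i++) {
        SegmentRecycler *recycler = manager->recycler;
        if (recycler->should_recycle(recycler, doc_counts[i])) {
            count++;
        }
    }
    return count;
}

/* A custom policy. The interface struct is the first member, so a
 * SegmentRecycler* can be cast back to the concrete type. */
typedef struct {
    SegmentRecycler base;
    size_t          max_docs;
} SmallSegRecycler;

int
SmallSegRecycler_should_recycle(SegmentRecycler *self, size_t doc_count) {
    SmallSegRecycler *small = (SmallSegRecycler*)self;
    return doc_count <= small->max_docs;
}
```

Customizing the behavior then means filling in a SmallSegRecycler and
storing a pointer to its `base` member in the manager; no subclass of
IndexManager is involved.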

Another problem is that Deserialize and Load currently work on a blank object 
created with Class_Make_Obj. This would break with host-language interface 
objects but could be solved elegantly with class methods, i.e., methods that
dynamically dispatch on a Class object. Instead of

     Query *query = (Query*)Class_Make_Obj(query_class);
     query = Query_Deserialize(query, instream);

we would write

     Query *query = Query_Deserialize(query_class, instream);
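
A minimal sketch of how such a class method could be dispatched in plain C
follows. The struct layouts, the `deserialize` slot, the toy InStream and the
one-byte "boost" format are all assumptions made for this example and do not
reflect the current Clownfish runtime. The `instream` argument is carried over
from the existing two-step code.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct Class Class;
typedef struct Query Query;

/* Toy input stream over an in-memory buffer. */
typedef struct {
    const unsigned char *buf;
    size_t               len;
    size_t               pos;
} InStream;

struct Class {
    const char *name;
    /* Class method slot: builds a complete object from a stream. */
    Query* (*deserialize)(const Class *klass, InStream *instream);
};

struct Query {
    const Class *klass;
    unsigned     boost;
};

/* Dispatches on the Class object itself, so no blank instance from
 * Class_Make_Obj is needed. */
Query*
Query_Deserialize(const Class *klass, InStream *instream) {
    return klass->deserialize(klass, instream);
}

/* Hypothetical implementation for one concrete class: reads a single byte
 * and stores it as the boost. Returns NULL on end of input. */
Query*
TermQuery_deserialize(const Class *klass, InStream *instream) {
    if (instream->pos >= instream->len) {
        return NULL;
    }
    Query *query = malloc(sizeof(Query));
    if (query == NULL) {
        return NULL;
    }
    query->klass = klass;
    query->boost = instream->buf[instream->pos++];
    return query;
}

const Class TERMQUERY = { "TermQuery", TermQuery_deserialize };
```

Because the implementation creates the object itself, a host-language
binding could return whatever object representation it likes from the
`deserialize` slot.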

Many dynamic languages like Smalltalk, Objective-C, Perl, Python, or Ruby 
support class methods, and it shouldn't be hard to support them in the 
Clownfish core. For languages like Go or Rust, we could create a separate 
interface hierarchy for class methods that are dispatched on singleton class 
objects. See this commit, which is also useful for conversion from Clownfish
to Go:

     https://github.com/nwellnhof/lucy-clownfish/commit/30ed13800d10a3ff551a9f23ec288a04d5516911

Nick