Posted to dev@lucy.apache.org by Nick Wellnhofer <we...@aevum.de> on 2013/06/04 16:10:01 UTC

Re: [lucy-dev] C library documentation

On May 11, 2013, at 18:13 , Nick Wellnhofer <we...@aevum.de> wrote:

> There are some more classes, methods, and constructors which I think should be made public to be included in the documentation. See branch 'clownfish-public' for my suggestions.

Another thing I noticed is that the DocuComments for the Perl constructors are taken from the 'init' functions, not the 'new' functions. What's the rationale behind this? For the C library, we have to document primarily the 'new' functions. I could add a special case to the code that generates the C documentation, but it would make more sense to me to move the DocuComments from 'init' to 'new'.

Nick


Re: [lucy-dev] C library documentation

Posted by Peter Karman <pe...@peknet.com>.
Nick Wellnhofer wrote on 6/7/13 5:15 PM:
> But from a practical point of view, most host languages
> will use 'new' as constructor. So it would simplify things if we moved the
> constructor's documentation to the 'new' function and use an alias only for
> languages that have constructors with different names.
> 

+1



-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [lucy-dev] C library documentation

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jun 9, 2013, at 08:41 , Marvin Humphrey <ma...@rectangular.com> wrote:

> If other folks are not satisfied with having docs attached to `init`, but I'm
> not satisfied with having them attached to `new`, can we please keep trying to
> find consensus for a little while longer?

OK, then what about writing individual documentation for all the 'new' and 'init' functions?

> I'm not enthusiastic about making exact duplicates of the documentation for
> every constructor, though. :(  That's the kind of ugliness they have to accept
> in Java because of signature overloading, but it would be nice if we could
> avoid it.

Generally, I can see three options:

    * Repeat all params in the 'new' and 'init' docs
    * Let the doc for 'new' refer to the 'init' params
    * Let the doc for 'init' refer to the 'new' params

I don't have a problem with duplicating the parameter descriptions. Redundancy in documentation can be a good thing, IMO. But I'm fine with any solution. We only need some docs for the C constructors, even if it's simply:

   "Constructor. See `init` for a description of the parameters."

Nick


Re: [lucy-dev] C library documentation

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Fri, Jun 7, 2013 at 3:15 PM, Nick Wellnhofer <we...@aevum.de> wrote:
> It does make sense. But from a practical point of view, most host languages
> will use 'new' as constructor. So it would simplify things if we moved the
> constructor's documentation to the 'new' function and use an alias only for
> languages that have constructors with different names.

There are some practical problems with the approach of moving the constructor
documentation from `init` to `new`.

First, abstract classes, e.g. Lucy::Search::Query, don't *have* `new` -- they
only have `init`.  We could add a `new` function to such abstract classes, but
calling it would just result in an unavoidable runtime exception.  I question
whether it's for the best to increase library size and create "attractive
nuisance" functions which trick people into writing code that compiles but
inevitably crashes out, solely because we need something to attach our
documentation to.  (I know that's not your intent, either -- it's just a side
effect of the proposal.)

Second, Clownfish doesn't currently enforce the relationship between `new` and
`init`, so it's possible that the documentation may not be sync'd.  The Perl
bindings will provide a constructor called `new` -- because naming
constructors `new` is idiomatic for Perl -- but when you call Foo->new from
Perl, behind the scenes Foo_init() will be invoked, *not* Foo_new().  The same
will be true for Ruby and Python -- because Foo_new() does not support
subclassing, while Foo_init() does.  (Incidentally, Python constructors don't
use `new`, they use the class object as a function: `x = MyClass()` -- see
<http://docs.python.org/3/tutorial/classes.html#class-objects>.)

If other folks are not satisfied with having docs attached to `init`, but I'm
not satisfied with having them attached to `new`, can we please keep trying to
find consensus for a little while longer?

> I think it would also be more consistent for classes with multiple
> constructors.  In this case, we have to document the additional constructors
> because there aren't corresponding 'init' functions.

At the core level, Clownfish doesn't currently differentiate between inert
functions, whether they're named `new`, `init`, `decode_utf8_char`, `freeze`,
or whatever.  (The Perl level treats `init` specially, though.)

I think it makes sense to make `init` special -- like constructors in C++ or
Java, like `initialize` in Ruby and like `__init__` in Python.  Maybe we
should be trying to figure out some syntax, keyword, or capitalization scheme
to make *multiple* constructors special?  (They achieve that in Java etc. by
using the class name in conjunction with signature overloading.)

> But it's no problem to work with the current system. I can simply copy the
> 'init' documentation if 'new' doesn't have one.

I'm not enthusiastic about making exact duplicates of the documentation for
every constructor, though. :(  That's the kind of ugliness they have to accept
in Java because of signature overloading, but it would be nice if we could
avoid it.

Marvin Humphrey

Re: [lucy-dev] C library documentation

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jun 5, 2013, at 02:29 , Marvin Humphrey <ma...@rectangular.com> wrote:

> There are a few odd cases which make the situation a little more complicated:
> 
> *   Abstract classes define `init` but not `new`.  (At the C level, at least.
>    The Perl bindings are different.)
> *   Some classes have no constructors: BoolNum, HashTombstone.
> *   Some classes need many custom constructors: CharBuf, Err.
> *   Several classes present constructors (named "open" by convention) which
>    attempt to return NULL and set an error variable on failure rather than
>    throw exceptions.
> 
> However, I don't think those oddities spoil the rationale.
> 
> Does that make sense?

It does make sense. But from a practical point of view, most host languages will use 'new' as constructor. So it would simplify things if we moved the constructor's documentation to the 'new' function and use an alias only for languages that have constructors with different names.

I think it would also be more consistent for classes with multiple constructors. In this case, we have to document the additional constructors because there aren't corresponding 'init' functions.

But it's no problem to work with the current system. I can simply copy the 'init' documentation if 'new' doesn't have one.

Nick


Re: [lucy-dev] C library documentation

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Tue, Jun 4, 2013 at 7:10 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> Another thing I noticed is that the DocuComments for the Perl constructors
> are taken from the 'init' functions, not the 'new' functions. What's the
> rationale behind this?

Our Perl constructors, which are named `new` by default, actually wrap `init`
rather than `new`.

The `init` functions allow us to supply objects blessed into arbitrary
classes at runtime.  In contrast, the vast majority of `new` functions defined
in core C code are convenience wrappers which allocate a blank object and then
immediately invoke `init` with the arguments which were passed in.  They are
shorter, but less flexible.

    // Equivalent (distinct names so both lines compile in one scope):
    Hash *hash_a = Hash_init((Hash*)VTable_Make_Obj(HASH), 0);
    Hash *hash_b = Hash_new(0);
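
To illustrate the relationship described above, here is a minimal sketch in plain C. All names in it (`Counter`, `Counter_alloc`, `class_name`) are hypothetical stand-ins and not part of Clownfish or Lucy; the string tag only plays the role that the VTable plays in the real code. The point is that `new` fixes the class, while `init` accepts an object of any class the caller has already allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in type; not the real Clownfish object layout. */
typedef struct Counter {
    const char *class_name;  /* stands in for the VTable pointer */
    int         start;
} Counter;

/* Allocate a blank object tagged with a class name.  This plays the
 * role that VTable_Make_Obj plays in the example above. */
static Counter*
Counter_alloc(const char *class_name) {
    Counter *self = (Counter*)calloc(1, sizeof(Counter));
    if (self != NULL) {
        self->class_name = class_name;
    }
    return self;
}

/* init: fills in an object that already exists, whatever its class.
 * Allocation failure is treated as fatal in this sketch. */
Counter*
Counter_init(Counter *self, int start) {
    assert(self != NULL);
    self->start = start;
    return self;
}

/* new: convenience wrapper that always produces the base class. */
Counter*
Counter_new(int start) {
    return Counter_init(Counter_alloc("Counter"), start);
}
```

A host-language binding that wants a subclass instance would call `Counter_init(Counter_alloc("MyCounter"), 3)` directly; `Counter_new(3)` cannot do that, because it hard-codes the class.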

Other languages have special initialization constructors with essentially the
same behavior as our `init`: Ruby's `initialize`, Python's `__init__`, etc.

    http://ruby.about.com/od/oo/ss/Instantiation-And-The-Initialize-Method.htm
    http://docs.python.org/3/reference/datamodel.html?highlight=__init__#object.__init__

> For the C library, we have to document primarily the
> 'new' functions. I could add a special case to the code that generates the C
> documentation, but it would make more sense to me to move the DocuComments
> from 'init' to 'new'.

Now that you've brought this up and forced me to think it through... Perhaps
we should consider instead formalizing our commitment to `init` and keeping
the docs there.  In addition, maybe we should start autogenerating `new`
implicitly, allowing us to delete a few lines each across a broad number of
files.

There are a few odd cases which make the situation a little more complicated:

*   Abstract classes define `init` but not `new`.  (At the C level, at least.
    The Perl bindings are different.)
*   Some classes have no constructors: BoolNum, HashTombstone.
*   Some classes need many custom constructors: CharBuf, Err.
*   Several classes present constructors (named "open" by convention) which
    attempt to return NULL and set an error variable on failure rather than
    throw exceptions.
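
The "open" convention in the last bullet can be sketched roughly as follows. This is only an illustration in plain C; `Reader`, `Reader_open`, and `Reader_last_error` are hypothetical names chosen for the example and are not the actual Lucy API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type used only for this sketch. */
typedef struct Reader {
    int fd;
} Reader;

/* Hypothetical error variable, set on failure and cleared on success. */
static const char *last_error = NULL;

const char*
Reader_last_error(void) {
    return last_error;
}

/* "open"-style constructor: returns NULL and records an error message
 * instead of throwing. */
Reader*
Reader_open(int fd) {
    if (fd < 0) {
        last_error = "invalid file descriptor";
        return NULL;
    }
    Reader *self = (Reader*)malloc(sizeof(Reader));
    if (self == NULL) {
        last_error = "out of memory";
        return NULL;
    }
    self->fd = fd;
    last_error = NULL;
    return self;
}
```

The caller checks the return value for NULL and consults the error variable, rather than relying on an exception to unwind the stack.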

However, I don't think those oddities spoil the rationale.

Does that make sense?

Marvin Humphrey