Posted to dev@lucy.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2013/01/16 17:11:43 UTC

[lucy-dev] Fwd: [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer

Greets,

Dag Lem proposes to add delete_by_doc_id() to Indexer.  Since this means
a change to the public API, I'm forwarding the message to the Lucy dev list.

Deleting by doc id is something of an expert feature.  I'm not clear on what
the specific use case is and there may be workarounds, but I can certainly
imagine how it would come up from time to time.  I doubt that the underlying
implementation is likely to need changing any time soon, so I don't think this
addition limits our flexibility much.

Indexer is a high-profile public class and we would like to keep its API
small, so another approach would be to expose the DeletionsWriter
subcomponent.  However, IMO document deletion is central enough to Indexer's
purpose to justify top-level convenience methods.
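
To make the trade-off concrete, here is a minimal C sketch of a top-level
convenience method that simply forwards to a deletions subcomponent.  Every
name in it (ToyIndexer, ToyDeletionsWriter, the toy_* functions) is
hypothetical, and both the bitmap representation and the rule that valid doc
ids run from 1 to doc_max are assumptions made for illustration.  It is not a
description of Lucy's actual Indexer.c or of the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a deletions subcomponent: one bit per doc id. */
typedef struct {
    uint8_t *bits;     /* bit N set means doc id N is marked deleted */
    int32_t  doc_max;  /* highest valid doc id (assumed: ids run 1..doc_max) */
} ToyDeletionsWriter;

/* Hypothetical stand-in for the high-profile class owning the subcomponent. */
typedef struct {
    ToyDeletionsWriter *del_writer;
} ToyIndexer;

/* Allocate a writer with no deletions recorded; returns NULL on failure. */
ToyDeletionsWriter *toy_delwriter_new(int32_t doc_max) {
    assert(doc_max >= 0);
    ToyDeletionsWriter *dw = malloc(sizeof *dw);
    if (dw == NULL) { return NULL; }
    dw->bits = calloc((size_t)doc_max / 8 + 1, 1);
    if (dw->bits == NULL) { free(dw); return NULL; }
    dw->doc_max = doc_max;
    return dw;
}

void toy_delwriter_free(ToyDeletionsWriter *dw) {
    if (dw != NULL) { free(dw->bits); free(dw); }
}

/* Mark one document as deleted; returns false if the id is out of range. */
bool toy_delwriter_delete_by_doc_id(ToyDeletionsWriter *dw, int32_t doc_id) {
    assert(dw != NULL);
    if (doc_id < 1 || doc_id > dw->doc_max) { return false; }
    dw->bits[doc_id / 8] |= (uint8_t)(1u << (doc_id % 8));
    return true;
}

bool toy_delwriter_is_deleted(const ToyDeletionsWriter *dw, int32_t doc_id) {
    assert(dw != NULL);
    if (doc_id < 1 || doc_id > dw->doc_max) { return false; }
    return (dw->bits[doc_id / 8] >> (doc_id % 8)) & 1u;
}

/* The convenience method: the indexer adds no logic, it only delegates. */
bool toy_indexer_delete_by_doc_id(ToyIndexer *indexer, int32_t doc_id) {
    assert(indexer != NULL && indexer->del_writer != NULL);
    return toy_delwriter_delete_by_doc_id(indexer->del_writer, doc_id);
}
```

In this sketch the caller only ever touches the indexer, so the subcomponent
can stay private at the cost of one extra method on the top-level class.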

+1 from me for adding delete_by_doc_id() to Indexer.

Marvin Humphrey

---------- Forwarded message ----------
From: Dag Lem <da...@nimrod.no>
Date: Wed, Jan 16, 2013 at 7:14 AM
Subject: [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer
To: user@lucy.apache.org


Hi,

While attempting to modify an index I found that I missed a function
to delete a document by its ID through Lucy::Index::Indexer (only
delete_by_term and delete_by_query are available).

Please find attached a patch - is this OK for inclusion?

--
Best regards,

Dag Lem

Re: [lucy-dev] Fwd: [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer

Posted by Peter Karman <pe...@peknet.com>.
Marvin Humphrey wrote on 1/16/13 10:11 AM:

> +1 from me for adding delete_by_doc_id() to Indexer.
> 

+1


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com

Re: [lucy-dev] [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer

Posted by Dan Markham <dm...@gmail.com>.
+1

On Jan 16, 2013, at 8:11 AM, Marvin Humphrey <ma...@rectangular.com> wrote:

> +1 from me for adding delete_by_doc_id() to Indexer.


Re: [lucy-dev] [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Wed, Jan 16, 2013 at 9:13 AM, Dan Markham <dm...@gmail.com> wrote:
> Would a change like this to the public API cause every host binding
> to *have* to be updated at the same time to support it?

Clownfish was recently changed to generate host bindings for all methods by
default, so long as it can figure out the type mapping.  The
delete_by_doc_id() method signature is simple, so as soon as we add it to
Indexer.cfh/Indexer.c it will automatically become usable from Perl (albeit
hidden).  In contrast, methods whose signatures include non-mappable types
such as `void*`, `int32_t*`, and `char*` must have bindings coded manually.
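
As a rough illustration of that rule, the C sketch below decides whether a
method could get an automatic binding by checking each parameter type against
a list of non-mappable types.  The function names and the deny-list approach
are hypothetical, and the list holds only the three examples named above.
The real Clownfish type-mapping logic is more involved than this, so treat it
as a simplification rather than as how the compiler works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The three non-mappable examples from the text; illustrative only,
 * not an exhaustive list. */
static const char *const TOY_UNMAPPABLE[] = { "void*", "int32_t*", "char*" };

/* A type counts as mappable here unless it appears in the deny-list. */
bool toy_type_is_mappable(const char *type) {
    assert(type != NULL);
    size_t n = sizeof TOY_UNMAPPABLE / sizeof TOY_UNMAPPABLE[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(type, TOY_UNMAPPABLE[i]) == 0) { return false; }
    }
    return true;
}

/* A method can be bound automatically only if every parameter type maps. */
bool toy_can_autobind(const char *const *param_types, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (!toy_type_is_mappable(param_types[i])) { return false; }
    }
    return true;
}
```

Under these assumptions a method taking a single `int32_t` would be bound
automatically, while one that also takes a `char*` would need a hand-written
binding.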

However, Clownfish will not generate host **documentation** by default for any
class or method -- because it's better to accidentally omit something from a
public API than it is to accidentally expose something.  Every symbol to be
documented must be whitelisted explicitly for each host.
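
The opt-in policy can be sketched in C as a lookup that defaults to "not
documented".  The table layout and the names ToyDocEntry, TOY_DOC_WHITELIST
and toy_is_documented are hypothetical, chosen only to show the default-deny
behaviour.  This is not how Clownfish actually stores its documentation
settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One whitelist entry: a symbol documented for one specific host. */
typedef struct {
    const char *host;
    const char *symbol;
} ToyDocEntry;

/* Hypothetical whitelist: only what is listed here gets documented. */
static const ToyDocEntry TOY_DOC_WHITELIST[] = {
    { "perl", "delete_by_term" },
    { "perl", "delete_by_query" },
};

/* Default-deny: a symbol is documented only if listed for that host. */
bool toy_is_documented(const char *host, const char *symbol) {
    assert(host != NULL && symbol != NULL);
    size_t n = sizeof TOY_DOC_WHITELIST / sizeof TOY_DOC_WHITELIST[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(TOY_DOC_WHITELIST[i].host, host) == 0
            && strcmp(TOY_DOC_WHITELIST[i].symbol, symbol) == 0) {
            return true;
        }
    }
    return false;
}
```

With this table a newly added method such as delete_by_doc_id stays
undocumented for every host until someone adds an entry for it.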

This allows us to consider at leisure whether there might be a better host API
for a given method, rather than forcing us to consider the consequences of
making an API public across all hosts simultaneously the moment a method is
committed to core.

There is actually a lot of functionality in Lucy which is ostensibly public
at the Clownfish level but which is not exposed via Perl.  Most of the time
this is because a class is a work-in-progress and we're abusing the fact that
documentation will not be generated.  It would be good to clean those up so
that only stuff which really is public gets labeled as `public`.

Marvin Humphrey

Re: [lucy-dev] [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer

Posted by Dan Markham <dm...@gmail.com>.
Silly question.

Would a change like this to the public API cause every host binding to *have* to be updated at the same time to support it?


-Dan


On Jan 16, 2013, at 8:11 AM, Marvin Humphrey <ma...@rectangular.com> wrote:

> Greets,
> 
> Dag Lem proposes to add delete_by_doc_id() to Indexer.  Since this means
> a change to the public API, I'm forwarding the message to the Lucy dev list.
> 
> Deleting by doc id is something of an expert feature.  I'm not clear on what
> the specific use case is and there may be workarounds, but I can certainly
> imagine how it would come up from time to time.  I doubt that the underlying
> implementation is likely to need changing any time soon, so I don't think this
> addition limits our flexibility much.
> 
> Indexer is a high-profile public class and we would like to keep its API
> small, so another approach would be to expose the DeletionsWriter
> subcomponent.  However, IMO document deletion is central enough to Indexer's
> purpose to justify top-level convenience methods.
> 
> +1 from me for adding delete_by_doc_id() to Indexer.
> 
> Marvin Humphrey
> 
> ---------- Forwarded message ----------
> From: Dag Lem <da...@nimrod.no>
> Date: Wed, Jan 16, 2013 at 7:14 AM
> Subject: [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer
> To: user@lucy.apache.org
> 
> 
> Hi,
> 
> While attempting to modify an index I found that I missed a function
> to delete a document by its ID through Lucy::Index::Indexer (only
> delete_by_term and delete_by_query are available).
> 
> Please find attached a patch - is this OK for inclusion?
> 
> --
> Best regards,
> 
> Dag Lem
> <Lucy-0.3.2-delete_by_doc_id.patch>


Re: [lucy-dev] Fwd: [lucy-user] Add delete_by_doc_id to Lucy::Index::Indexer

Posted by Nick Wellnhofer <we...@aevum.de>.
On Jan 16, 2013, at 17:11 , Marvin Humphrey <ma...@rectangular.com> wrote:

> +1 from me for adding delete_by_doc_id() to Indexer.

+1 from me, too.

Nick