Posted to dev@lucy.apache.org by Nathan Kurz <na...@verse.com> on 2011/11/21 23:28:33 UTC

[lucy-dev] Scorers and Formats and Indexes, Oh-My!

A problem that I keep coming back to is how to allow custom Scorers to
work efficiently with custom Index formats. For efficiency, you want
to provide direct access to the underlying data rather than requiring
multiple function calls per match, but you don't want to have to
subclass each Scorer for each Index. Ideally, you want every custom
Scorer to work with every new Index out of the box.

One solution is to come up with a common data format that each Scorer
uses, and have the Index be capable of producing that format and
making it available to the Scorer. I thought this article did a good
job of explaining this approach:
http://fgiesen.wordpress.com/2011/11/21/buffer-centric-io/

The article describes essentially what I was envisioning, and it also
includes some "tricks" that allow for easier error handling. It's not
directly applicable to Lucy, but it is in C, and I thought it might be
a good starting point for defining terms and thinking about approaches.
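
To make the idea a bit more concrete, here is a rough sketch in C of
what a shared buffer between an Index and a Scorer might look like.
None of these names exist in Lucy: Match, MatchBuffer, ArrayIndex and
sum_term_freq are all hypothetical, and the record layout (doc id plus
term frequency) is only an assumption for illustration. The point is
just that the Scorer reads records directly from memory and makes one
function call per refill rather than one or more per match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical common record that every index format agrees to produce. */
typedef struct {
    uint32_t doc_id;
    uint32_t term_freq;
} Match;

/* Hypothetical buffer handed to scorers: [cursor, end) is readable. */
typedef struct MatchBuffer MatchBuffer;
struct MatchBuffer {
    const Match *cursor;
    const Match *end;
    int (*refill)(MatchBuffer *self);  /* returns 0 once exhausted */
    void *index_state;                 /* private to the index format */
};

/* One toy "index format": postings in an array, served in chunks. */
typedef struct {
    const Match *postings;
    size_t count;
    size_t pos;
    size_t chunk;
} ArrayIndex;

/* Expose the next chunk of postings through the shared buffer. */
static int array_index_refill(MatchBuffer *buf) {
    ArrayIndex *ix = buf->index_state;
    size_t left = ix->count - ix->pos;
    size_t n = left < ix->chunk ? left : ix->chunk;
    buf->cursor = ix->postings + ix->pos;
    buf->end = buf->cursor + n;
    ix->pos += n;
    return n > 0;
}

/* Attach the index to a buffer; the buffer starts out empty. */
static void array_index_open(ArrayIndex *ix, MatchBuffer *buf) {
    assert(ix->chunk > 0);
    ix->pos = 0;
    buf->cursor = buf->end = ix->postings;
    buf->refill = array_index_refill;
    buf->index_state = ix;
}

/* A toy scorer that knows only MatchBuffer, not the index behind it. */
static uint64_t sum_term_freq(MatchBuffer *buf) {
    uint64_t total = 0;
    for (;;) {
        if (buf->cursor == buf->end && !buf->refill(buf)) {
            break;
        }
        /* Tight loop over raw records, no function call per match. */
        for (const Match *m = buf->cursor; m != buf->end; m++) {
            total += m->term_freq;
        }
        buf->cursor = buf->end;
    }
    return total;
}
```

A different Index format would supply its own refill function and
private state, and sum_term_freq would work with it unchanged. Error
handling is left out of this sketch entirely.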

--nate