Posted to dev@lucy.apache.org by Andreas Altergott <al...@mira-consulting.net> on 2009/03/13 22:55:28 UTC

Porting the Lucene benchmarking suite

Hi,

First, thank you, Marvin, for taking the time and giving me some good
options to start supporting Lucy.

Porting the Lucene benchmark suite would probably be a good starting
point.  That way I will get a good overview of the project and, as a
little bonus, KinoSearch will also benefit from the work.

I will focus mainly on this task and port the classes
org.apache.lucene.benchmark.* to C and get them to work with
Lucy/KinoSearch.

I will keep you up to date about the development status with patches on
the mailing list.

But I'll also spend some time porting the test suite from Perl to C.  I
will probably get back to you about this soon with a few questions.  I
have already taken a look at the work done so far and at what still
remains to be done.

Thank you for the suggestions. :-)


Regards,
Andreas
-- 
MIRA Consulting GmbH
Bruckwiesenstr. 1
D 72336 Balingen

T (+49) 7433 907231.0
F (+49) 7433 907231.18
http://www.mira-consulting.net
Registered office: Balingen
HRB 411184, Stuttgart District Court (Amtsgericht Stuttgart)
Managing Director: Petra Hauschke


Re: Porting the Lucene benchmarking suite

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Sat, Mar 14, 2009 at 04:15:46PM +0100, Andreas Altergott wrote:

> This test case has nothing to do with the benchmark suite, does it?  

That's right.  It's just a smaller task, and directly applicable to ongoing
discussions between Nate, Mike, and me on this list.

> If it doesn't, then why is it preferable to write it in Perl?

It might not be.  :)  

Either would be fine. 

> Later on we'll have to port it to C anyway.  

Within trunk/perl/t in the KinoSearch repository, you'll see the following.

  * A trunk/perl/t/core/ directory, with about 30 test files in it.  These
    are done; the .t files are just stubs.
  * A trunk/perl/t/binding directory, with a handful of files in it.  These
    are also done; their purpose is to test the Perl binding.
  * About 100 test files in t/ itself.  These files need to migrate to either
    t/core or t/binding.

I was thinking that this would be a "binding" test, but you're right --
pluggable deletions need to be tested in core.  We might also write tests to
verify that the binding works (and doesn't leak memory, etc), but that's more
superficial.

Marvin Humphrey


Re: Porting the Lucene benchmarking suite

Posted by Andreas Altergott <al...@mira-consulting.net>.
Hi,

Marvin Humphrey wrote:
> Might you also be interested in starting off with a pure-Perl test case to
> verify that the pluggable deletions apparatus works as it should?  If so, I'll
> try to write up the necessary docs.

Just to verify that I'm not getting this wrong: this test case has nothing
to do with the benchmark suite, does it?  If it doesn't, then why is it
preferable to write it in Perl?  Later on we'll have to port it to C
anyway.  I don't mind doing it in Perl, though. :-)

I'll be glad to do it when you finish the docs.

> FYI, I had intended to use Perl, because this is mostly a scripting task and I
> wanted to get things done reasonably quickly.  C would give us the advantage
> of being able to test multiple bindings, but it's a lot of work up front.

Yes, developing in Perl will be much faster than in C.  We can still port
the benchmark test to C later on if there is an urgent need for it.


Andreas


Re: Porting the Lucene benchmarking suite

Posted by Marvin Humphrey <ma...@rectangular.com>.
Hello Andreas,

Welcome.  :)

> First, thank you, Marvin, for taking the time and giving me some good
> options to start supporting Lucy.

To put this in context, Andreas asked me off-list about options for
contributing to Lucy and/or KinoSearch.  (He's new to the library and doesn't
have any particular itches yet.)  I mentioned porting the KS test suite and
the Lucene benchmarking suite and recommended that we discuss the matter on
lucy-dev.

> Porting the Lucene benchmark suite would probably be a good starting
> point.  That way I will get a good overview of the project and, as a
> little bonus, KinoSearch will also benefit from the work.

Being able to benchmark Lucy (eventually) and KinoSearch would be very handy.
Just this week the discussion on Matcher ran into problems because we lack
search-time benchmarking capabilities.

However, please be aware that taking on the benchmarker is a bit of an
ambitious task, particularly since KinoSearch and Lucene have many API
differences.

Might you also be interested in starting off with a pure-Perl test case to
verify that the pluggable deletions apparatus works as it should?  If so, I'll
try to write up the necessary docs.

> I will focus mainly on this task and port the classes
> org.apache.lucene.benchmark.* to C and get them to work with
> Lucy/KinoSearch.

FYI, I had intended to use Perl, because this is mostly a scripting task and I
wanted to get things done reasonably quickly.  C would give us the advantage
of being able to test multiple bindings, but it's a lot of work up front.

If we do go with C, I think we should only worry about GCC on modern Unixen
for the time being.  

> But I'll also spend some time porting the test suite from Perl to C.  I will
> probably get back to you about this soon with a few questions.  I have already
> taken a look at the work done so far and at what still remains to be done.

Core tests live in trunk/core/KinoSearch/Test/ and generate TAP output like
ordinary Perl tests. [1]

The testing infrastructure lives in trunk/charmonizer/src/Charmonizer/Test.*
and trunk/core/KinoSearch/Test.bp.  Documentation is lacking; however, there's
not that much to it.  (Writing a simple TAP-producing test harness isn't very
difficult.)
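
For illustration, a minimal sketch of a TAP-producing harness in C might look
something like the following.  All of the names here (TapBatch, tap_plan,
tap_ok, tap_passed, run_example_batch) are hypothetical and are not taken from
Charmonizer or KinoSearch; the sketch only shows the general shape of the
protocol: a plan line ("1..N") followed by one "ok" or "not ok" line per test.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch; NOT the Charmonizer or KinoSearch test API. */
typedef struct {
    FILE    *out;      /* stream that receives the TAP lines */
    unsigned planned;  /* number of tests announced in the plan */
    unsigned run;      /* tests executed so far */
    unsigned failed;   /* tests whose condition was false */
} TapBatch;

/* Initialize the batch and emit the plan line, e.g. "1..3". */
void tap_plan(TapBatch *batch, FILE *out, unsigned planned) {
    assert(batch != NULL && out != NULL);
    batch->out     = out;
    batch->planned = planned;
    batch->run     = 0;
    batch->failed  = 0;
    fprintf(out, "1..%u\n", planned);
}

/* Emit "ok N - desc" or "not ok N - desc"; return the condition. */
int tap_ok(TapBatch *batch, int cond, const char *desc) {
    batch->run++;
    if (!cond) {
        batch->failed++;
    }
    fprintf(batch->out, "%sok %u - %s\n", cond ? "" : "not ",
            batch->run, desc);
    return cond;
}

/* A batch passes if nothing failed and the plan was honored. */
int tap_passed(const TapBatch *batch) {
    return batch->failed == 0 && batch->run == batch->planned;
}

/* Example batch with three trivial tests; returns 1 on success. */
int run_example_batch(FILE *out) {
    TapBatch batch;
    tap_plan(&batch, out, 3);
    tap_ok(&batch, 1 + 1 == 2, "integer addition");
    tap_ok(&batch, strlen("TAP") == 3, "strlen of a literal");
    tap_ok(&batch, "abc"[1] == 'b', "string indexing");
    return tap_passed(&batch);
}
```

To try it, add a main() that does "return run_example_batch(stdout) ? 0 : 1;".
The output should be a plan line "1..3" followed by three "ok" lines, which any
TAP consumer (for instance Perl's prove) can parse.  Writing to a FILE* rather
than directly to stdout is just a design choice that makes the harness itself
easy to test.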

To run a core test using the Perl bindings, you need an entry in
trunk/perl/lib/KinoSearch/Test.pm and an individual test file a la
trunk/perl/t/core/013-bit_vector.t.  

Marvin Humphrey

[1] Test Anything Protocol: <http://testanything.org/>