You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Tom Burton-West <tb...@umich.edu> on 2013/06/18 18:48:03 UTC

TestGrouping.Java seems to combine multiple tests into one huge test

Hello,

I'm trying to understand BlockGroupingCollector.   I thought I would start
by running the tests in the debugger.  However the only test I can find is
lucene/grouping/src/test/org/apache/lucene/search/grouping/TestGrouping.java


In TestGrouping.java,  in the second test, "testRandom" it seems like a
fairly large number of things are tested, including a test of
BlockGroupingCollector.

I'm having a hard time understanding the tests and wonder if perhaps
TestGrouping.java might be refactored to make it easier to understand.

Is there some reason that separate tests are not broken out?
I also could not find setup and teardown methods.  Is there some reason
they are not being used?
Would it make sense to open a JIRA issue to refactor  TestGrouping.java to
make a clearer separation between different things being tested?
In particular, perhaps BlockGroupingCollector could have a separate test?

Tom

Re: TestGrouping.Java seems to combine multiple tests into one huge test

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Thu, Jun 20, 2013 at 7:09 PM, Tom Burton-West <tb...@umich.edu> wrote:
> Sine my objective is to undersdand BlockGroupingCollector, I thought I
> would extract a test for BlockGroupingCollector, maybe in another file as
> Robert suggested.

I think that makes sense!

> I was looking at testRandom because that is where BlockGroupingCollector is
> tested, but it looks to me like some trickery is involved creating the
> block join type index  near the comment where it says" // Build 2nd index,
> where docs are added in blocks by group, so we can use single pass
> collector "
>
> In testBasic, I can see the call to indexWriter.addDocuments(), but I don't
> see that happening anywhere in testRandom.

It's in the confusingly named getDocBlocksReader method.  That method
takes the GroupDoc array and uses IW.addDocuments to add each group as
a block of documents.

> Should I just be looking at testBasic for a simple/understandable way to
> create a block-join index for later testing of the BlockGroupingCollector?

Hmm but testBasic seems not to test BlockGroupingCollector.  Maybe the
first step is to write testBasic in your new TestBlockGrouping...

> Also I'm a bit confused by the use of Doc Values.  Is that necessary to use
> the BlockGroupingCollector, or are they in there in order to test some
> feature related to Doc Values and the Collectors?

Not necessary; doc values are mixed into that test because we can do
non-block grouping by doc values.

Mike McCandless

http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: TestGrouping.Java seems to combine multiple tests into one huge test

Posted by Tom Burton-West <tb...@umich.edu>.
Sine my objective is to undersdand BlockGroupingCollector, I thought I
would extract a test for BlockGroupingCollector, maybe in another file as
Robert suggested.

I was looking at testRandom because that is where BlockGroupingCollector is
tested, but it looks to me like some trickery is involved creating the
block join type index  near the comment where it says" // Build 2nd index,
where docs are added in blocks by group, so we can use single pass
collector "

In testBasic, I can see the call to indexWriter.addDocuments(), but I don't
see that happening anywhere in testRandom.

Should I just be looking at testBasic for a simple/understandable way to
create a block-join index for later testing of the BlockGroupingCollector?

Also I'm a bit confused by the use of Doc Values.  Is that necessary to use
the BlockGroupingCollector, or are they in there in order to test some
feature related to Doc Values and the Collectors?

Tom


On Tue, Jun 18, 2013 at 1:14 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> +1 to somehow refactor this scary test to make it more understandable!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Jun 18, 2013 at 12:48 PM, Tom Burton-West <tb...@umich.edu>
> wrote:
> > Hello,
> >
> > I'm trying to understand BlockGroupingCollector.   I thought I would
> start
> > by running the tests in the debugger.  However the only test I can find
> is
> >
> lucene/grouping/src/test/org/apache/lucene/search/grouping/TestGrouping.java
> >
> >
> > In TestGrouping.java,  in the second test, "testRandom" it seems like a
> > fairly large number of things are tested, including a test of
> > BlockGroupingCollector.
> >
> > I'm having a hard time understanding the tests and wonder if perhaps
> > TestGrouping.java might be refactored to make it easier to understand.
> >
> > Is there some reason that separate tests are not broken out?
> > I also could not find setup and teardown methods.  Is there some reason
> > they are not being used?
> > Would it make sense to open a JIRA issue to refactor  TestGrouping.java
> to
> > make a clearer separation between different things being tested?
> > In particular, perhaps BlockGroupingCollector could have a separate test?
> >
> > Tom
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: TestGrouping.Java seems to combine multiple tests into one huge test

Posted by Tom Burton-West <tb...@umich.edu>.
Thanks Mike and Robert,

>>Refactoring this test would be fantastic, but I wouldn't want to take it
on
:)

>>Maybe an easier step would be to rename this test something like
TestRandomGrouping, and add some brand new very simple-easy-to-understand
tests to a new file(s).

I opened LUCENE-5065.  I'll try your "easier step. Since the reason I'm
looking at the test is that I'm trying to understand
BlockGroupingCollector, I'll start by trying to just pull out as simple a
test as I can figure out for BlockGroupingCollector and add that.   It may
take me a while, since I need to walk through the code a lot more before I
understand it enough to feel confident that I can pull out an appropriate
test.

Tom



On Tue, Jun 18, 2013 at 1:14 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> +1 to somehow refactor this scary test to make it more understandable!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Jun 18, 2013 at 12:48 PM, Tom Burton-West <tb...@umich.edu>
> wrote:
> > Hello,
> >
> > I'm trying to understand BlockGroupingCollector.   I thought I would
> start
> > by running the tests in the debugger.  However the only test I can find
> is
> >
> lucene/grouping/src/test/org/apache/lucene/search/grouping/TestGrouping.java
> >
> >
> > In TestGrouping.java,  in the second test, "testRandom" it seems like a
> > fairly large number of things are tested, including a test of
> > BlockGroupingCollector.
> >
> > I'm having a hard time understanding the tests and wonder if perhaps
> > TestGrouping.java might be refactored to make it easier to understand.
> >
> > Is there some reason that separate tests are not broken out?
> > I also could not find setup and teardown methods.  Is there some reason
> > they are not being used?
> > Would it make sense to open a JIRA issue to refactor  TestGrouping.java
> to
> > make a clearer separation between different things being tested?
> > In particular, perhaps BlockGroupingCollector could have a separate test?
> >
> > Tom
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: TestGrouping.Java seems to combine multiple tests into one huge test

Posted by Michael McCandless <lu...@mikemccandless.com>.
+1 to somehow refactor this scary test to make it more understandable!

Mike McCandless

http://blog.mikemccandless.com


On Tue, Jun 18, 2013 at 12:48 PM, Tom Burton-West <tb...@umich.edu> wrote:
> Hello,
>
> I'm trying to understand BlockGroupingCollector.   I thought I would start
> by running the tests in the debugger.  However the only test I can find is
> lucene/grouping/src/test/org/apache/lucene/search/grouping/TestGrouping.java
>
>
> In TestGrouping.java,  in the second test, "testRandom" it seems like a
> fairly large number of things are tested, including a test of
> BlockGroupingCollector.
>
> I'm having a hard time understanding the tests and wonder if perhaps
> TestGrouping.java might be refactored to make it easier to understand.
>
> Is there some reason that separate tests are not broken out?
> I also could not find setup and teardown methods.  Is there some reason
> they are not being used?
> Would it make sense to open a JIRA issue to refactor  TestGrouping.java to
> make a clearer separation between different things being tested?
> In particular, perhaps BlockGroupingCollector could have a separate test?
>
> Tom

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: TestGrouping.Java seems to combine multiple tests into one huge test

Posted by Robert Muir <rc...@gmail.com>.
On Tue, Jun 18, 2013 at 9:48 AM, Tom Burton-West <tb...@umich.edu> wrote:

> Hello,
>
> I'm trying to understand BlockGroupingCollector.   I thought I would start
> by running the tests in the debugger.  However the only test I can find is
>
> lucene/grouping/src/test/org/apache/lucene/search/grouping/TestGrouping.java
>
>
> In TestGrouping.java,  in the second test, "testRandom" it seems like a
> fairly large number of things are tested, including a test of
> BlockGroupingCollector.
>
> I'm having a hard time understanding the tests and wonder if perhaps
> TestGrouping.java might be refactored to make it easier to understand.
>
> Is there some reason that separate tests are not broken out?
> I also could not find setup and teardown methods.  Is there some reason
> they are not being used?
> Would it make sense to open a JIRA issue to refactor  TestGrouping.java to
> make a clearer separation between different things being tested?
> In particular, perhaps BlockGroupingCollector could have a separate test?
>

+1, I think it would be great to simplify this. This test seems to be
testing the entire kitchen sink: grouping by function, simulating
distributed grouping, grouping by docvalues, etc.

Refactoring this test would be fantastic, but I wouldn't want to take it on
:)

Maybe an easier step would be to rename this test something like
TestRandomGrouping, and add some brand new very simple-easy-to-understand
tests to a new file(s).

Something that looks more like
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/core/src/test/org/apache/lucene/search/TestSort.java:)