
Posted to dev@lucene.apache.org by Ryan Ernst <ry...@iernst.net> on 2014/08/07 23:35:57 UTC

Test iterations

Only in the last month or so did I learn that -Dtests.iters doesn't
really "work" as far as randomization is concerned.  Each iteration
currently is *exactly* the same (each iteration uses the same master
seed).  Because of this, I understand that different people have their
own "beasting" scripts that run the test essentially N times from a
shell to force a different seed in each iteration.

Why not create a different seed for each iteration when -Dtests.iters
is used?  This way the test would still spit out a reproducible run
line for a specific iteration, but each iteration would have good
randomization (so trying to hit a rare bug could be done with
-Dtests.iters).
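
As a rough illustration of the idea (a hypothetical Python sketch, not
how the test runner is implemented; the function name and the seed
format are assumptions), every iteration's seed can be derived from a
single master seed, so each iteration differs but the whole run stays
reproducible:

```python
import random


def iteration_seeds(master_seed, iters):
    """Derive one 64-bit seed per iteration from a single master seed."""
    rng = random.Random(master_seed)
    return [rng.getrandbits(64) for _ in range(iters)]


# Same master seed -> same list of per-iteration seeds, so a failing
# iteration can be reported and replayed by its own seed.
for i, seed in enumerate(iteration_seeds(0xDEADBEEF, 3)):
    print(f"iteration {i}: seed {seed:016X}")
```

Re-running with the same master seed yields the same per-iteration
seeds, which is what keeps a single failing iteration reproducible.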

I'm curious if there is history here as to why tests.iters is done this
way, or what people's opinions are on moving towards the approach I
suggested above.

Thanks!
Ryan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

RE: Test iterations

Posted by Uwe Schindler <uw...@thetaphi.de>.

I opened https://issues.apache.org/jira/browse/LUCENE-5881

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de





RE: Test iterations

Posted by Uwe Schindler <uw...@thetaphi.de>.

Slightly improved patch:
Forbidding tests.iters is not needed: it still makes sense to beast 20 rounds with each test also repeated 20 times per round (with the same static class seed), giving 400 repetitions. The loop is also more Groovy-like now.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de



RE: Test iterations

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi,

attached you will find the beaster:

- Only modifies common-build.xml, so the target is not propagated down from the top level (that would make no sense anyway, as you would never run "ant beast-test" from the top level). So you have to go to the correct submodule and run "ant beast-test -Dbeast.iters=n -Dtestcase=..." from there.
- Uses "antcall" in a loop, invoking the internal dependency-less "-test" target. My first implementation used the test-macro directly, but this did not work, because test-macro sets non-local properties, which are then still set in the second round, causing errors or always using the same seed. Antcall creates a new project each time and runs the tests.
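
The second point can be modelled with a toy Python sketch (this is not the attached patch; the Project class and run_tests function are hypothetical stand-ins for an Ant project and the test macro, and "tests.seed" is used only as an illustrative property name):

```python
import random


class Project:
    """Toy stand-in for an Ant project: a property keeps its first value."""

    def __init__(self):
        self.properties = {}

    def set_property(self, name, value):
        # Write-once semantics: later assignments are silently ignored.
        self.properties.setdefault(name, value)


def run_tests(project, rng):
    # Hypothetical stand-in for the test macro: pick a seed unless one is set.
    project.set_property("tests.seed", f"{rng.getrandbits(64):016X}")
    return project.properties["tests.seed"]


rng = random.Random(1)

# Reusing one project: the seed chosen in round 1 sticks for every round.
shared = Project()
reused = [run_tests(shared, rng) for _ in range(3)]

# A fresh project per round: each round picks its own seed.
fresh = [run_tests(Project(), rng) for _ in range(3)]

print("distinct seeds, shared project:", len(set(reused)))
print("distinct seeds, fresh projects:", len(set(fresh)))
```

In this model the shared project keeps reporting the first seed, while the fresh projects each get a new one.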

I can open an issue or just commit this :-)

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de



Re: Test iterations

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.

I know we could fork separate class loaders, Uwe. But I had exactly
the same kind of concerns you already so accurately pinpointed; if no
real gain is to be had I typically vote for simplicity.

Mike -- Ant's overhead is indeed a problem. Uwe's solution with
antcontrib (which I already mentioned a while ago I believe) will make
it more palatable, but it's still a half-way thing because if we had
"real" seed reiteration in the runner then it could also run multiple
concurrent copies of the same class (with different master seeds) in
the forked JVMs. This would nicely play with what's already in the
code.

I will get down to it and look at the possibilities and problems
again. Thanks for bringing it up, Ryan. I admit it's been on my queue
for a long time but I was hesitant to open this particular can of
worms...

Dawid


On Fri, Aug 8, 2014 at 8:13 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi,
>
> I will look into that as a Groovy script. The main problem is: you cannot simply use <antcall/> in a loop, because this would also execute the dependencies on each run.
>
> My idea is to do the following:
> - maybe subclass antcall Task with Groovy (not sure if this is needed)
> - instantiate it with current project
> - execute dependent targets
> - execute the inner target multiple times: store the project properties first and restore them after execution. This is needed because Ant properties can only be set *once*. If you don't give a fixed test seed, each run then picks a new one (because the project properties are reset, so the seed from the previous execution is gone).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
>> -----Original Message-----
>> From: Ryan Ernst [mailto:ryan@iernst.net]
>> Sent: Friday, August 08, 2014 5:08 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Test iterations
>>
>> Thanks for the extremely thorough answer, Dawid!  Entertaining as always. :)
>>
>> > Should we provide this "beaster" in common-build?
>>
>> I would use it! It sounds like there is a lot of work involved in making
>> tests.iters work better with LuceneTestCase.  In the mean time, this sounds
>> like a quick solution that might not be as efficient (multiple JVMs), but still
>> better than having to come up with a bash script?
>>
>> On Fri, Aug 8, 2014 at 7:28 AM, Michael McCandless
>> <lu...@mikemccandless.com> wrote:
>> > +1, this sounds awesome?
>> >
>> > Mike McCandless
>> >
>> > http://blog.mikemccandless.com
>> >
>> >
>> > On Fri, Aug 8, 2014 at 10:06 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> >> Hi,
>> >>
>> >> We could emulate the same thing (the repeating beaster) with pure Ant:
>> >>
>> >> Just repeat the "test" target, which can be done using ant-contrib's "for"
>> >> task or (much simpler) a Groovy script using antcall on the test target.
>> >> Should we provide this "beaster" in common-build?
>> >>
>> >> "ant beast-tests -Dbeast.iter=100 -Dtestcase=..."
>> >>
>> >> Very easy to implement and makes it easier to use for the python haters -
>> >> and comes embedded...
>> >>
>> >> Uwe
>> >>
>> >> -----
>> >> Uwe Schindler
>> >> H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> http://www.thetaphi.de
>> >> eMail: uwe@thetaphi.de
>> >>
>> >>
>> >>> -----Original Message-----
>> >>> From: Michael McCandless [mailto:lucene@mikemccandless.com]
>> >>> Sent: Friday, August 08, 2014 3:48 PM
>> >>> To: Lucene/Solr dev
>> >>> Subject: Re: Test iterations
>> >>>
>> >>> On Fri, Aug 8, 2014 at 9:35 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> >>> > Hi Dawid,
>> >>> >
>> >>> > Thanks for the very good explanation! Indeed the main problem with
>> >>> > tests.iters is the static initializers. Maybe put that explanation
>> >>> > into the Wiki! I sometimes also need to remember it, so it should be
>> >>> > documented.
>> >>> >
>> >>> > One (only theoretical) way to solve the whole thing could be:
>> >>> > Load the class(es) in a separate classloader for every repeated
>> >>> > execution,... but of course this will very fast blow up your
>> >>> > permgen (java 6, 7) or anything else we don't know about (java 8).
>> >>> > In fact the separate classloader approach is not different from
>> >>> > Mike's scripts, just that Mike's script creates a new classloader
>> >>> > by forking a new JVM. In fact I don't think the separate
>> >>> > classloader approach would be much faster, because the class
>> >>> > clones will all have separate compilation paths in Hotspot, so
>> >>> > Hotspot cannot share the same assembler code. So except the JVM
>> >>> > startup time, you gain nothing. Just permgen issues :-)
>> >>>
>> >>> The big thing the python beasting script avoids is all the ant
>> >>> overhead to just get to the point where it actually spawns the JVM
>> >>> to run the test.  Really, that's all the beasting script does:
>> >>> directly spawn the JVM on the test runner (after running "ant
>> >>> test-compile" up
>> >>> front) and then parse its output/events.
>> >>>
>> >>> The distributed test runner, which uses rsync/ssh to run tests on N
>> >>> machines, is very different from the beasting script: it runs all
>> >>> Lucene's tests (instead of a single test over and over) across N
>> >>> JVMs on M machines.  It "cheats" by taking the union of all CLASSPATHs ...
>> >>> but this is a huge win because it means all testing is fully
>> >>> concurrent, not just concurrent within one module.  This script can
>> >>> also repeat, which means once all Lucene tests finish, it re-enqueues
>> >>> all of them again.
>> >>>
>> >>> Mike McCandless
>> >>>
>> >>> http://blog.mikemccandless.com
>> >>>


RE: Test iterations

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi,

I will look into that as a Groovy script. The main problem is: you cannot simply use <antcall/> in a loop, because this would also execute the dependencies on each run.

My idea is to do the following:
- maybe subclass antcall Task with Groovy (not sure if this is needed)
- instantiate it with current project
- execute dependent targets
- execute the inner target multiple times: store the project properties first and restore them after each execution. This is needed because Ant properties can only be set *once*. If you don't give a fixed test seed, each run then picks a new one (because the project properties are reset, the seed from the previous execution is gone).
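
To make the store/restore idea above concrete, here is a minimal sketch in plain Java. It does not use the Ant API at all: WriteOnceProps and BeastLoopSketch are hypothetical stand-ins, and the only assumption taken from the thread is that a property such as tests.seed keeps its first value until the properties are reset.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for a project whose properties are write-once.
class WriteOnceProps {
  private Map<String, String> props = new HashMap<>();

  // The first value wins; later attempts to set the same key are ignored.
  void setIfAbsent(String key, String value) { props.putIfAbsent(key, value); }
  String get(String key) { return props.get(key); }

  Map<String, String> snapshot() { return new HashMap<>(props); }
  void restore(Map<String, String> saved) { props = new HashMap<>(saved); }
}

class BeastLoopSketch {
  // Stand-in for the inner "test" target: picks a seed unless one is already set.
  static String runInnerTarget(WriteOnceProps project, Random seedSource) {
    project.setIfAbsent("tests.seed", Long.toHexString(seedSource.nextLong()));
    return project.get("tests.seed");
  }

  // Runs the inner target several times and returns the seed each round used.
  static List<String> beast(WriteOnceProps project, int rounds, boolean restoreBetweenRuns) {
    Random seedSource = new Random(42);
    List<String> seeds = new ArrayList<>();
    Map<String, String> saved = project.snapshot();
    for (int i = 0; i < rounds; i++) {
      seeds.add(runInnerTarget(project, seedSource));
      if (restoreBetweenRuns) {
        project.restore(saved); // forget whatever this round set
      }
    }
    return seeds;
  }

  public static void main(String[] args) {
    System.out.println("without restore: " + beast(new WriteOnceProps(), 3, false));
    System.out.println("with restore:    " + beast(new WriteOnceProps(), 3, true));
  }
}
```

Without the restore step every round reuses the seed chosen in the first round; with it, each round picks a fresh one, while a seed that was fixed before the loop started survives the restore and is used in every round.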

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Ryan Ernst [mailto:ryan@iernst.net]
> Sent: Friday, August 08, 2014 5:08 PM
> To: dev@lucene.apache.org
> Subject: Re: Test iterations
> 
> Thanks for the extremely thorough answer, Dawid!  Entertaining as always. :)
> 
> > Should we provide this "beaster" in common-build?
> 
> I would use it! It sounds like there is a lot of work involved in making
> tests.iters work better with LuceneTestCase.  In the meantime, this sounds
> like a quick solution that might not be as efficient (multiple JVMs), but still
> better than having to come up with a bash script?
> 
> On Fri, Aug 8, 2014 at 7:28 AM, Michael McCandless
> <lu...@mikemccandless.com> wrote:
> > +1, this sounds awesome?
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> >
> > On Fri, Aug 8, 2014 at 10:06 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> >> Hi,
> >>
> >> We could emulate the same thing (the repeating beaster) with pure Ant:
> >>
> >> Just repeat the "test" target, which can be done using ant-contrib's "for"
> task or (much simpler) a groovy script using antcall on the test target.
> >> Should we provide this "beaster" in common-build?
> >>
> >> "ant beast-tests -Dbeast.iter=100 -Dtestcase=..."
> >>
> >> Very easy to implement and makes it easier to use for the python haters -
> and comes embedded...
> >>
> >> Uwe
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: uwe@thetaphi.de
> >>
> >>
> >>> -----Original Message-----
> >>> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> >>> Sent: Friday, August 08, 2014 3:48 PM
> >>> To: Lucene/Solr dev
> >>> Subject: Re: Test iterations
> >>>
> >>> On Fri, Aug 8, 2014 at 9:35 AM, Uwe Schindler <uw...@thetaphi.de>
> wrote:
> >>> > Hi Dawid,
> >>> >
> >>> > Thanks for the very good explanation! Indeed the main problem with
> >>> tests.iters is the static initializers. Maybe put that explanation
> >>> into the Wiki! I sometimes also need to remember it, so it should be
> documented.
> >>> >
> >>> > One (only theoretical) way to solve the whole thing could be:
> >>> > Load the class(es) in a separate classloader for every repeated
> >>> > execution,... but of course this will very fast blow up your
> >>> > permgen (java 6, 7) or anything else we don't know about (java 8).
> >>> > In fact the separate classloader approach is not different from
> >>> > Mike's scripts, just that Mike's script creates a new classloader
> >>> > by forking a new JVM. In fact I don't think the separate
> >>> > classloader approach would be much faster, because the class
> >>> > clones will all have separate compilation paths in Hotspot, so
> >>> > Hotspot cannot share the same assembler code. So except the JVM
> >>> > startup time, you gain nothing. Just permgen issues :-)
> >>>
> >>> The big thing the python beasting scripts avoids is all the ant
> >>> overhead to just get to the point where it actually spawns the JVM
> >>> to run the test.  Really, that's all the beasting script does:
> >>> directly spawn the JVM on the test runner (after running "ant
> >>> test-compile" up
> >>> front) and then parse its output/events.
> >>>
> >>> The distributed test runner, which uses rsync/ssh to run tests on N
> >>> machines, is very different from the beasting script: it runs all
> >>> Lucene's tests (instead of a single test over and over) across N
> >>> JVMs on M machines.  It "cheats" by taking the union of all CLASSPATHs
> ...
> >>> but this is a huge win because it means all testing is fully
> >>> concurrent, not just concurrent within one module.  This script can
> >>> also repeat, which means once all lucene tests finish, re-en-queue all of
> them again.
> >>>
> >>> Mike McCandless
> >>>
> >>> http://blog.mikemccandless.com
> >>>

Re: Test iterations

Posted by Ryan Ernst <ry...@iernst.net>.

Thanks for the extremely thorough answer, Dawid!  Entertaining as always. :)

> Should we provide this "beaster" in common-build?

I would use it! It sounds like there is a lot of work involved in
making tests.iters work better with LuceneTestCase.  In the meantime,
this sounds like a quick solution that might not be as efficient
(multiple JVMs), but still better than having to come up with a bash
script?

On Fri, Aug 8, 2014 at 7:28 AM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> +1, this sounds awesome?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Aug 8, 2014 at 10:06 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> Hi,
>>
>> We could emulate the same thing (the repeating beaster) with pure Ant:
>>
>> Just repeat the "test" target, which can be done using ant-contrib's "for" task or (much simpler) a groovy script using antcall on the test target.
>> Should we provide this "beaster" in common-build?
>>
>> "ant beast-tests -Dbeast.iter=100 -Dtestcase=..."
>>
>> Very easy to implement and makes it easier to use for the python haters - and comes embedded...
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>>
>>
>>> -----Original Message-----
>>> From: Michael McCandless [mailto:lucene@mikemccandless.com]
>>> Sent: Friday, August 08, 2014 3:48 PM
>>> To: Lucene/Solr dev
>>> Subject: Re: Test iterations
>>>
>>> On Fri, Aug 8, 2014 at 9:35 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>>> > Hi Dawid,
>>> >
>>> > Thanks for the very good explanation! Indeed the main problem with
>>> tests.iters is the static initializers. Maybe put that explanation into the Wiki! I
>>> sometimes also need to remember it, so it should be documented.
>>> >
>>> > One (only theoretical) way to solve the whole thing could be:
>>> > Load the class(es) in a separate classloader for every repeated
>>> > execution,... but of course this will very fast blow up your permgen
>>> > (java 6, 7) or anything else we don't know about (java 8). In fact the
>>> > separate classloader approach is not different from Mike's scripts,
>>> > just that Mike's script creates a new classloader by forking a new
>>> > JVM. In fact I don't think the separate classloader approach would be
>>> > much faster, because the class clones will all have separate
>>> > compilation paths in Hotspot, so Hotspot cannot share the same
>>> > assembler code. So except the JVM startup time, you gain nothing. Just
>>> > permgen issues :-)
>>>
>>> The big thing the python beasting scripts avoids is all the ant overhead to just
>>> get to the point where it actually spawns the JVM to run the test.  Really,
>>> that's all the beasting script does: directly spawn the JVM on the test runner
>>> (after running "ant test-compile" up
>>> front) and then parse its output/events.
>>>
>>> The distributed test runner, which uses rsync/ssh to run tests on N machines,
>>> is very different from the beasting script: it runs all Lucene's tests (instead of
>>> a single test over and over) across N JVMs on M machines.  It "cheats" by
>>> taking the union of all CLASSPATHs ...
>>> but this is a huge win because it means all testing is fully concurrent, not just
>>> concurrent within one module.  This script can also repeat, which means once
>>> all lucene tests finish, re-en-queue all of them again.
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>

Re: Test iterations

Posted by Michael McCandless <lu...@mikemccandless.com>.

+1, this sounds awesome?

Mike McCandless

http://blog.mikemccandless.com


On Fri, Aug 8, 2014 at 10:06 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi,
>
> We could emulate the same thing (the repeating beaster) with pure Ant:
>
> Just repeat the "test" target, which can be done using ant-contrib's "for" task or (much simpler) a groovy script using antcall on the test target.
> Should we provide this "beaster" in common-build?
>
> "ant beast-tests -Dbeast.iter=100 -Dtestcase=..."
>
> Very easy to implement and makes it easier to use for the python haters - and comes embedded...
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
>> -----Original Message-----
>> From: Michael McCandless [mailto:lucene@mikemccandless.com]
>> Sent: Friday, August 08, 2014 3:48 PM
>> To: Lucene/Solr dev
>> Subject: Re: Test iterations
>>
>> On Fri, Aug 8, 2014 at 9:35 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> > Hi Dawid,
>> >
>> > Thanks for the very good explanation! Indeed the main problem with
>> tests.iters is the static initializers. Maybe put that explanation into the Wiki! I
>> sometimes also need to remember it, so it should be documented.
>> >
>> > One (only theoretical) way to solve the whole thing could be:
>> > Load the class(es) in a separate classloader for every repeated
>> > execution,... but of course this will very fast blow up your permgen
>> > (java 6, 7) or anything else we don't know about (java 8). In fact the
>> > separate classloader approach is not different from Mike's scripts,
>> > just that Mike's script creates a new classloader by forking a new
>> > JVM. In fact I don't think the separate classloader approach would be
>> > much faster, because the class clones will all have separate
>> > compilation paths in Hotspot, so Hotspot cannot share the same
>> > assembler code. So except the JVM startup time, you gain nothing. Just
>> > permgen issues :-)
>>
>> The big thing the python beasting scripts avoids is all the ant overhead to just
>> get to the point where it actually spawns the JVM to run the test.  Really,
>> that's all the beasting script does: directly spawn the JVM on the test runner
>> (after running "ant test-compile" up
>> front) and then parse its output/events.
>>
>> The distributed test runner, which uses rsync/ssh to run tests on N machines,
>> is very different from the beasting script: it runs all Lucene's tests (instead of
>> a single test over and over) across N JVMs on M machines.  It "cheats" by
>> taking the union of all CLASSPATHs ...
>> but this is a huge win because it means all testing is fully concurrent, not just
>> concurrent within one module.  This script can also repeat, which means once
>> all lucene tests finish, re-en-queue all of them again.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>

RE: Test iterations

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi,

We could emulate the same thing (the repeating beaster) with pure Ant:

Just repeat the "test" target, which can be done using ant-contrib's "for" task or (much simpler) a Groovy script using antcall on the test target.
Should we provide this "beaster" in common-build?

"ant beast-tests -Dbeast.iter=100 -Dtestcase=..."

Very easy to implement and makes it easier to use for the python haters - and comes embedded...

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: Friday, August 08, 2014 3:48 PM
> To: Lucene/Solr dev
> Subject: Re: Test iterations
> 
> On Fri, Aug 8, 2014 at 9:35 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> > Hi Dawid,
> >
> > Thanks for the very good explanation! Indeed the main problem with
> tests.iters is the static initializers. Maybe put that explanation into the Wiki! I
> sometimes also need to remember it, so it should be documented.
> >
> > One (only theoretical) way to solve the whole thing could be:
> > Load the class(es) in a separate classloader for every repeated
> > execution,... but of course this will very fast blow up your permgen
> > (java 6, 7) or anything else we don't know about (java 8). In fact the
> > separate classloader approach is not different from Mike's scripts,
> > just that Mike's script creates a new classloader by forking a new
> > JVM. In fact I don't think the separate classloader approach would be
> > much faster, because the class clones will all have separate
> > compilation paths in Hotspot, so Hotspot cannot share the same
> > assembler code. So except the JVM startup time, you gain nothing. Just
> > permgen issues :-)
> 
> The big thing the python beasting scripts avoids is all the ant overhead to just
> get to the point where it actually spawns the JVM to run the test.  Really,
> that's all the beasting script does: directly spawn the JVM on the test runner
> (after running "ant test-compile" up
> front) and then parse its output/events.
> 
> The distributed test runner, which uses rsync/ssh to run tests on N machines,
> is very different from the beasting script: it runs all Lucene's tests (instead of
> a single test over and over) across N JVMs on M machines.  It "cheats" by
> taking the union of all CLASSPATHs ...
> but this is a huge win because it means all testing is fully concurrent, not just
> concurrent within one module.  This script can also repeat, which means once
> all lucene tests finish, re-en-queue all of them again.
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 

Re: Test iterations

Posted by Michael McCandless <lu...@mikemccandless.com>.

On Fri, Aug 8, 2014 at 9:35 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi Dawid,
>
> Thanks for the very good explanation! Indeed the main problem with tests.iters is the static initializers. Maybe put that explanation into the Wiki! I sometimes also need to remember it, so it should be documented.
>
> One (only theoretical) way to solve the whole thing could be:
> Load the class(es) in a separate classloader for every repeated execution,... but of course this will very fast blow up your permgen (java 6, 7) or anything else we don't know about (java 8). In fact the separate classloader approach is not different from Mike's scripts, just that Mike's script creates a new classloader by forking a new JVM. In fact I don't think the separate classloader approach would be much faster, because the class clones will all have separate compilation paths in Hotspot, so Hotspot cannot share the same assembler code. So except the JVM startup time, you gain nothing. Just permgen issues :-)

The big thing the Python beasting script avoids is all the Ant
overhead to just get to the point where it actually spawns the JVM to
run the test.  Really, that's all the beasting script does: directly
spawn the JVM on the test runner (after running "ant test-compile" up
front) and then parse its output/events.

The distributed test runner, which uses rsync/ssh to run tests on N
machines, is very different from the beasting script: it runs all
Lucene's tests (instead of a single test over and over) across N JVMs
on M machines.  It "cheats" by taking the union of all CLASSPATHs ...
but this is a huge win because it means all testing is fully
concurrent, not just concurrent within one module.  This script can
also repeat, which means that once all Lucene tests finish, it
re-enqueues all of them again.

Mike McCandless

http://blog.mikemccandless.com


RE: Test iterations

Posted by Uwe Schindler <uw...@thetaphi.de>.

Hi Dawid,

Thanks for the very good explanation! Indeed, the main problem with tests.iters is the static initializers. Maybe put that explanation into the Wiki! I sometimes need to be reminded of it myself, so it should be documented.

One (only theoretical) way to solve the whole thing could be:
Load the class(es) in a separate classloader for every repeated execution... but of course this will very quickly blow up your permgen (Java 6, 7) or something else we don't know about (Java 8). In fact, the separate classloader approach is not different from Mike's scripts, just that Mike's script creates a new classloader by forking a new JVM. I also don't think the separate classloader approach would be much faster, because the class clones will all have separate compilation paths in HotSpot, so HotSpot cannot share the same assembler code. So except for the JVM startup time, you gain nothing. Just permgen issues :-)

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Dawid Weiss [mailto:dawid.weiss@gmail.com]
> Sent: Friday, August 08, 2014 3:10 PM
> To: dev@lucene.apache.org
> Subject: Re: Test iterations
> 
> Hi Ryan,
> 
> So. I discussed this a while ago, but here it comes again. Let me first clear a
> few things from what you said.
> 
> > Only in the last month or so did I learn that -Dtests.iters doesn't really
> "work".  What I mean is in regards to randomization.
> 
> This is not true. It works (as I will explain below). Try it, for example (the
> annotation has the same effect as providing
> -Dtests.iters=10):
> 
> @Repeat(iterations = 10)
> @Seed("0")
> public class Test001 extends RandomizedTest {
>   @Test public void test() {
>     System.out.println(randomAsciiOfLength(10));
>   }
> }
> 
> I fixed the initial seed to make it reproducible. This will print:
> 
> nKtjLXhWQw
> awHTHLIGAq
> vEYgnxTkWv
> mSAloRXtIV
> iBhCJuZNzP
> DHAIyqecSS
> zaEoTAWAOa
> CoraUrKuib
> fKxUZnyQTx
> beFtvsUTHc
> 
> > Each iteration currently is *exactly* the same as far as randomization
> >> (each iteration uses the same master seed).
> 
> You can see above that it isn't true. Every iteration is different and uses
> different randomness (and this randomness is "derived" from the (master,
> iteration) pair so it is fully reproducible in each run).
> 
> >> Why not create a different seed for each iteration when -Dtests.iters is
> used?
> 
> Let's talk about JUnit unit tests and how (any) runner should execute them. I
> will demonstrate this on a simple class like this one (pseudo
> code):
> 
> class Foo {
>   @BeforeClass beforeClassHook() {}
> 
>   @Before beforeHook() {}
>   @Test test1() {}
>   @After afterHook() {}
> 
>   @AfterClass afterClassHook() {}
> }
> 
> There are a couple of "stages" to be executed. Simplifying a bit, it looks like
> this.
> 
> 0. Prerequisite
> 
> - class available, possibly loaded and initialized
> 
> 1. Setup:
> 
> - extract test methods
> 
> 2. Execution.
> 
> - run class-before hooks (rules, @BeforeClass)
> - for each test:
>     run before hooks (@Before, rules)
>     run the test itself
>     run after hooks (@After, rules)
> - run class-after hooks (rules, @AfterClass)
> 
> For the class above, the sequence of method calls would be:
> 
> beforeClassHook()
> 
> new() // constructor
> beforeHook()
> test1()
> afterHook()
> 
> afterClassHook()
> 
> If you were to multiply tests execution manually, you would copy-paste the
> test method giving it a different name:
> 
> class Foo {
>   @BeforeClass beforeClassHook() {}
> 
>   @Before beforeHook() {}
>   @Test test1() {}
>   @Test test2() {}
>   @After afterHook() {}
> 
>   @AfterClass afterClassHook() {}
> }
> 
> which would result in a sequence of calls like this one:
> 
> beforeClassHook()
> 
> new() // constructor
> beforeHook()
> test1()
> afterHook()
> 
> new() // constructor (new instance)
> beforeHook()
> test2()
> afterHook()
> 
> afterClassHook()
> 
> So, first of all, note that duplicating tests is *not* equivalent to just looping
> around method body. Each execution should be run on a new instance and
> wrapped with setup and teardown hooks, otherwise it's not really an isolated
> JUnit test anymore (and it would be against JUnit informal execution flow).
> 
> This is, in short, what -Dtests.iters does (and what @Repeat does) -- it
> replicates every test, making sure they have unique names (IDEs get
> confused if they don't) and trying to work around other issues I won't discuss
> here. It does work. The reason you believe it doesn't work is because most
> of the stuff in LuceneTestCase is initialized at *static* class level, which by
> definition is executed only once, regardless of the number of tests in a class.
> Let's modify our initial example a
> bit:
> 
> @Repeat(iterations = 10)
> @Seed("0")
> public class Test002 extends RandomizedTest {
>   static String s;
> 
>   @BeforeClass
>   public static void beforeClass() {
>     s = randomAsciiOfLength(10);
>   }
> 
>   @Test public void test() {
>     System.out.println(s);
>   }
> }
> 
> If you run this, you'll see:
> 
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> SXVNjhPdQD
> 
> This works as expected because beforeClass() is invoked once (even if every
> test has a different randomness available to it). LuceneTestCase does it for
> performance reasons (that static initialization is fairly costly). This ends the
> JUnit part of the story.
> 
> But wait, there is more. If you take a look above at how JUnit runners should
> work they load the class (or in fact are given an initialized
> class) before they can do anything. So if there are static class initializers (static
> { field = foo(); }) then these may get executed well before the runner has
> any chance to initialize its own stuff -- that's why you *have* to use
> @BeforeClass methods if you want to use RandomizedTest's randomness;
> doing
> this:
> 
> @Repeat(iterations = 10)
> @Seed("0")
> public class Test003 extends RandomizedTest {
>   static final String s;
>   static {
>     s = randomAsciiOfLength(10);
>   }
> 
>   @Test public void test() {
>     System.out.println(s);
>   }
> }
> 
> will result in an initialization exception complaining about missing random
> context:
> 
> java.lang.IllegalStateException: No context information for thread:
> Thread[id=11, name=SUITE-Test003-seed#[0], state=RUNNABLE,
> group=TGRP-Test003]. Is this thread running under a class
> com.carrotsearch.randomizedtesting.RandomizedRunner runner context?
> Add @RunWith(class
> com.carrotsearch.randomizedtesting.RandomizedRunner.class) to your test
> class. Make sure your code accesses random contexts within @BeforeClass
> and @AfterClass boundary (for example, static test class initializers are not
> permitted to access random contexts).
> 
> Finally, all the above probably begs the question why can't the runner just
> wrap every iteration one level up the ladder -- re-execute static hooks, etc.
> 
> The answer is that you are only considering the level of a single class and the
> master seed is in fact used for other things, most notably for ordering test
> classes for predictable execution within one forked JVM (this is done in
> JUnit4-Ant task). If you re-executed one class's static hooks with a different
> seed for different iterations how would they report the "real master" seed
> back for re-running the entire suite? Should it be a pair master-
> seed:iteration? Only the derived class seed? Or maybe we should have two
> seeds -- one for ordering classes and one for the actual class execution?
> 
> It gets really messy if you get down to details like this, not to mention missing
> support from IDEs for running multiple tests of the same class (it's already
> pretty complicated to make it work in Eclipse, IntelliJ Idea, etc. when multiple
> identical methods are executed).
> 
> Finally:
> 
> >> different people have their own "beasting" scripts that run the test
> >> essentially N times from a shell to force different seeds in each
> >> iteration.
> 
> This is true. But I don't think Mike, for example, will resign from his scripts
> even if "real" class-level tests.iters would be implemented -- Mike's script
> runs tests across multiple machines via SSH; I won't have the time for such
> distributed extensions in any near future.
> 
> If you want to try to implement reiteration at the 'static context'
> feel free to give it a shot though; I would certainly be interested in how you
> approach the problems I mentioned above. A good place to start would be to
> modify JUnit4-ant (Ant task), around here:
> 
> https://github.com/carrotsearch/randomizedtesting/blob/master/junit4-
> ant/src/main/java/com/carrotsearch/ant/tasks/junit4/JUnit4.java#L887
> 
> if you somehow associated a class with its master seed (and duplicated
> classes as well), then you could fork off multiple class executions with
> different seed (you'd still have to modify the forked subprocess to accept
> class name and master seed; currently only one master seed is passed at the
> start of each JVM.
> 
> Or you could do what I mentioned above -- separate the seed used for class
> ordering from the one used for executing every class (every iteration of a
> class).
> 
> Or maybe it'd be easier to modify the runner itself (then duplication would
> work from IDE level)... but then you'll hit the IDE issues and constraints... I
> really don't know which way is better (or worse).
> That's part of the challenge I guess. :)
> 
> I'll try to look at the code again in the next few days and will give you some
> feedback. I can't log to Jira for some reason, but I'll lookup the issue number
> for this, it's been there for a good while.
> 
> Dawid
> 

Re: Test iterations

Posted by Dawid Weiss <da...@gmail.com>.

Hi Ryan,

So. I discussed this a while ago, but here it comes again. Let me
first clear up a few things you said.

> Only in the last month or so did I learn that -Dtests.iters doesn't really "work".  What I mean is in regards to randomization.

This is not true. It works (as I will explain below). Try it, for
example (the annotation has the same effect as providing
-Dtests.iters=10):

@Repeat(iterations = 10)
@Seed("0")
public class Test001 extends RandomizedTest {
  @Test public void test() {
    System.out.println(randomAsciiOfLength(10));
  }
}

I fixed the initial seed to make it reproducible. This will print:

nKtjLXhWQw
awHTHLIGAq
vEYgnxTkWv
mSAloRXtIV
iBhCJuZNzP
DHAIyqecSS
zaEoTAWAOa
CoraUrKuib
fKxUZnyQTx
beFtvsUTHc

> Each iteration currently is *exactly* the same as far as randomization
> (each iteration uses the same master seed).

You can see above that it isn't true. Every iteration is different and
uses different randomness (and this randomness is "derived" from the
(master, iteration) pair so it is fully reproducible in each run).
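
As an illustration of what "derived from the (master, iteration) pair" can mean, here is a small sketch in plain Java. The mixing function is an assumption (a SplitMix64-style finalizer chosen for the example), not the derivation the RandomizedTesting runner actually uses, and DerivedSeedSketch and its methods are hypothetical names.

```java
import java.util.Random;

class DerivedSeedSketch {
  // Hypothetical mixing step (SplitMix64-style finalizer); the real runner's
  // derivation may differ. Distinct iterations always give distinct seeds.
  static long derive(long masterSeed, int iteration) {
    long z = masterSeed + 0x9E3779B97F4A7C15L * (iteration + 1);
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
    return z ^ (z >>> 31);
  }

  // Draws a lowercase string from the given source of randomness.
  static String randomLetters(Random random, int length) {
    StringBuilder sb = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
      sb.append((char) ('a' + random.nextInt(26)));
    }
    return sb.toString();
  }

  // Every iteration gets its own Random, seeded from (master, iteration).
  static String[] runIterations(long masterSeed, int iterations) {
    String[] out = new String[iterations];
    for (int i = 0; i < iterations; i++) {
      out[i] = randomLetters(new Random(derive(masterSeed, i)), 10);
    }
    return out;
  }

  public static void main(String[] args) {
    for (String s : runIterations(0L, 3)) {
      System.out.println(s);
    }
  }
}
```

Running it twice with the same master seed prints the same lines in the same order, which is the property that makes a single iteration reproducible from the master seed and the iteration number alone.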

> Why not create a different seed for each iteration when -Dtests.iters is used?

Let's talk about JUnit unit tests and how (any) runner should execute
them. I will demonstrate this on a simple class like this one:

import org.junit.*;

public class Foo {
  @BeforeClass public static void beforeClassHook() {}

  @Before public void beforeHook() {}
  @Test public void test1() {}
  @After public void afterHook() {}

  @AfterClass public static void afterClassHook() {}
}

There are a couple of "stages" to be executed. Simplifying a bit, it
looks like this.

0. Prerequisite:

- class available, possibly loaded and initialized

1. Setup:

- extract test methods

2. Execution.

- run class-before hooks (rules, @BeforeClass)
- for each test:
    run before hooks (@Before, rules)
    run the test itself
    run after hooks (@After, rules)
- run class-after hooks (rules, @AfterClass)

For the class above, the sequence of method calls would be:

beforeClassHook()

new() // constructor
beforeHook()
test1()
afterHook()

afterClassHook()

If you were to multiply test executions manually, you would copy-paste
the test method, giving it a different name:

import org.junit.*;

public class Foo {
  @BeforeClass public static void beforeClassHook() {}

  @Before public void beforeHook() {}
  @Test public void test1() {}
  @Test public void test2() {}
  @After public void afterHook() {}

  @AfterClass public static void afterClassHook() {}
}

which would result in a sequence of calls like this one:

beforeClassHook()

new() // constructor
beforeHook()
test1()
afterHook()

new() // constructor (new instance)
beforeHook()
test2()
afterHook()

afterClassHook()

So, first of all, note that duplicating tests is *not* equivalent to
just looping around the method body. Each execution should be run on a
new instance and wrapped with setup and teardown hooks, otherwise it's
not really an isolated JUnit test anymore (and it would be against
JUnit's informal execution flow).
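
The call sequence above can be reproduced without JUnit at all. The following sketch is a hand-written stand-in for a runner (LifecycleSketch and runRepeated are hypothetical names, and real runners also apply rules and reporting that are left out here); it only records the order in which the hooks, the constructor and the test body are invoked when one test is repeated.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {
  static final List<String> calls = new ArrayList<>();

  // Plain-Java stand-in for the test class; every method just logs its name.
  static class Foo {
    static void beforeClassHook() { calls.add("beforeClassHook"); }
    static void afterClassHook() { calls.add("afterClassHook"); }
    Foo() { calls.add("new"); }
    void beforeHook() { calls.add("beforeHook"); }
    void test(int iteration) { calls.add("test#" + iteration); }
    void afterHook() { calls.add("afterHook"); }
  }

  // Class-level hooks run once; each repetition gets a new instance
  // wrapped in its own before/after hooks.
  static List<String> runRepeated(int iterations) {
    calls.clear();
    Foo.beforeClassHook();
    for (int i = 1; i <= iterations; i++) {
      Foo instance = new Foo();
      instance.beforeHook();
      instance.test(i);
      instance.afterHook();
    }
    Foo.afterClassHook();
    return new ArrayList<>(calls);
  }

  public static void main(String[] args) {
    System.out.println(runRepeated(2));
  }
}
```

Note that the class-level hooks sit outside the loop: repeating a test never re-runs them, so anything they set up is shared by all repetitions.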

This is, in short, what -Dtests.iters does (and what @Repeat does) --
it replicates every test, making sure they have unique names (IDEs get
confused if they don't) and trying to work around other issues I won't
discuss here. It does work. The reason you believe it doesn't work is
that most of the stuff in LuceneTestCase is initialized at the *static*
class level, which by definition is executed only once, regardless of
the number of tests in a class. Let's modify our initial example a
bit:

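import com.carrotsearch.randomizedtesting.RandomizedTest;
import com.carrotsearch.randomizedtesting.annotations.Repeat;
import com.carrotsearch.randomizedtesting.annotations.Seed;
import org.junit.BeforeClass;
import org.junit.Test;

@Repeat(iterations = 10)
@Seed("0")
public class Test002 extends RandomizedTest {
  static String s;

  // Runs once per class, so every repetition sees the same value.
  @BeforeClass
  public static void beforeClass() {
    s = randomAsciiOfLength(10);
  }

  @Test public void test() {
    System.out.println(s);
  }
}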

If you run this, you'll see:

SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD

This works as expected because beforeClass() is invoked only once
(even though every test has different randomness available to it).
LuceneTestCase initializes things at the static level for performance
reasons (that static initialization is fairly costly). This ends the
JUnit part of the story.
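
A toy model of the class-level versus test-level randomness, in plain
Java with java.util.Random. This is a sketch under assumptions: the
way per-test seeds are derived here (one nextLong() per repetition) is
made up for illustration and is not the runner's actual scheme, and
StaticVsPerTestSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy model: one class-level draw, then one derived seed per repetition. */
final class StaticVsPerTestSketch {
  /** Draws a 10-letter lowercase word from the given source of randomness. */
  static String randomWord(Random r) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 10; i++) {
      sb.append((char) ('a' + r.nextInt(26)));
    }
    return sb.toString();
  }

  /** Returns one "static value / per-test value" line per repetition. */
  static List<String> run(long masterSeed, int iterations) {
    Random classRandom = new Random(masterSeed);
    String staticValue = randomWord(classRandom);  // like @BeforeClass: drawn once
    List<String> lines = new ArrayList<>();
    for (int i = 0; i < iterations; i++) {
      // Assumed derivation, for illustration only.
      Random testRandom = new Random(classRandom.nextLong());
      lines.add(staticValue + " / " + randomWord(testRandom));
    }
    return lines;
  }

  public static void main(String[] args) {
    run(0L, 3).forEach(System.out::println);
  }
}
```

The left column repeats on every line while the right column changes,
and the whole output is still reproducible from the master seed.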

But wait, there is more. If you take a look above at how JUnit runners
should work, they load the class (or in fact are given an initialized
class) before they can do anything. So if there are static class
initializers (static { field = foo(); }) then these may get executed
well before the runner has any chance to initialize its own stuff.
That's why you *have* to use @BeforeClass methods if you want to use
RandomizedTest's randomness; doing this:

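import com.carrotsearch.randomizedtesting.RandomizedTest;
import com.carrotsearch.randomizedtesting.annotations.Repeat;
import com.carrotsearch.randomizedtesting.annotations.Seed;
import org.junit.Test;

@Repeat(iterations = 10)
@Seed("0")
public class Test003 extends RandomizedTest {
  static final String s;
  // Static initializer: runs during class initialization, outside the
  // @BeforeClass/@AfterClass boundary.
  static {
    s = randomAsciiOfLength(10);
  }

  @Test public void test() {
    System.out.println(s);
  }
}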

will result in an initialization exception complaining about missing
random context:

java.lang.IllegalStateException: No context information for thread:
Thread[id=11, name=SUITE-Test003-seed#[0], state=RUNNABLE,
group=TGRP-Test003]. Is this thread running under a class
com.carrotsearch.randomizedtesting.RandomizedRunner runner context?
Add @RunWith(class
com.carrotsearch.randomizedtesting.RandomizedRunner.class) to your
test class. Make sure your code accesses random contexts within
@BeforeClass and @AfterClass boundary (for example, static test class
initializers are not permitted to access random contexts).
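
The ordering problem can be reproduced without any test framework.
The sketch below is a toy model, not RandomizedRunner code: CONTEXT,
current(), EagerTest and HookTest are hypothetical names, and the
"runner" is just a method that installs a per-thread Random before
calling the hook.

```java
import java.util.Random;

/** Toy model of a per-thread random context; not RandomizedRunner code. */
final class ContextSketch {
  /** Hypothetical context slot that only the "runner" fills in. */
  static final ThreadLocal<Random> CONTEXT = new ThreadLocal<>();

  /** Returns the current context or fails if nothing has installed one yet. */
  static Random current() {
    Random r = CONTEXT.get();
    if (r == null) {
      throw new IllegalStateException(
          "No context information for thread: " + Thread.currentThread().getName());
    }
    return r;
  }

  /** Touches the context in a static initializer, i.e. at class initialization. */
  static final class EagerTest {
    static final int VALUE;
    static {
      VALUE = current().nextInt();
    }
  }

  /** Touches the context only inside a hook the runner calls later. */
  static final class HookTest {
    static int value;
    static void beforeClass() {
      value = current().nextInt();
    }
  }

  /** Initializes EagerTest with no context installed. */
  static String runEager() {
    try {
      return "initialized: " + EagerTest.VALUE;
    } catch (ExceptionInInitializerError | NoClassDefFoundError e) {
      return "class initialization failed";
    }
  }

  /** Installs a context first, then calls the hook, as a runner would. */
  static String runHook(long seed) {
    CONTEXT.set(new Random(seed));
    try {
      HookTest.beforeClass();
      return "ok";
    } finally {
      CONTEXT.remove();
    }
  }

  public static void main(String[] args) {
    System.out.println(runEager());
    System.out.println(runHook(0L));
  }
}
```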

All of the above probably raises the question: why can't the runner
just wrap every iteration one level up the ladder -- re-execute static
hooks, etc.?

The answer is that you are only considering the level of a single
class, while the master seed is in fact used for other things, most
notably for ordering test classes for predictable execution within one
forked JVM (this is done in the JUnit4 Ant task). If you re-executed one
class's static hooks with a different seed for each iteration, how
would the runner report the "real master" seed back for re-running the
entire suite? Should it be a pair master-seed:iteration? Only the derived
class seed? Or maybe we should have two seeds -- one for ordering classes
and one for the actual class execution?
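
Just to illustrate the first option (a master-seed:iteration pair),
here is a minimal sketch. Everything in it is an assumption: the
"HEX:N" format, the IterationSeedSketch name and the mixing function
(a MurmurHash3-style finalizer) are picked only for the example and
are not what the runner does today.

```java
import java.util.Locale;

/** Toy "master:iteration" pair; format and mixing step are assumptions. */
final class IterationSeedSketch {
  /** Derives the class seed for one iteration from the master seed. */
  static long derive(long masterSeed, int iteration) {
    long x = masterSeed ^ (iteration * 0x9E3779B97F4A7C15L);
    // 64-bit finalizer (MurmurHash3 style) to spread nearby inputs apart.
    x ^= x >>> 33;
    x *= 0xFF51AFD7ED558CCDL;
    x ^= x >>> 33;
    x *= 0xC4CEB9FE1A85EC53L;
    x ^= x >>> 33;
    return x;
  }

  /** Formats what a failure report could print for re-running one iteration. */
  static String format(long masterSeed, int iteration) {
    return Long.toHexString(masterSeed).toUpperCase(Locale.ROOT) + ":" + iteration;
  }

  /** Parses a reported pair back into the seed used for that iteration. */
  static long parse(String pair) {
    int colon = pair.indexOf(':');
    long masterSeed = Long.parseUnsignedLong(pair.substring(0, colon), 16);
    int iteration = Integer.parseInt(pair.substring(colon + 1));
    return derive(masterSeed, iteration);
  }

  public static void main(String[] args) {
    long master = 0xDEADBEEFL;
    for (int i = 0; i < 3; i++) {
      System.out.println(format(master, i) + " -> "
          + Long.toHexString(derive(master, i)));
    }
  }
}
```

The master seed alone still determines everything; the iteration
number only selects which derived seed a re-run should use.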

It gets really messy once you get down to details like this, not
to mention the missing IDE support for running multiple tests of the
same class (it's already pretty complicated to make it work in
Eclipse, IntelliJ IDEA, etc. when multiple identical methods are executed).

Finally:

>> different people have their own "beasting" scripts
>> that run the test essentially N times from a shell to force different
>> seeds in each iteration.

This is true. But I don't think Mike, for example, will give up his
scripts even if "real" class-level tests.iters were implemented --
Mike's script runs tests across multiple machines via SSH; I won't
have the time for such distributed extensions in the near future.

If you want to try to implement reiteration at the 'static context'
level, feel free to give it a shot though; I would certainly be
interested in how you approach the problems I mentioned above. A good
place to start would be to modify JUnit4-ant (Ant task), around here:

https://github.com/carrotsearch/randomizedtesting/blob/master/junit4-ant/src/main/java/com/carrotsearch/ant/tasks/junit4/JUnit4.java#L887

If you somehow associated a class with its master seed (and duplicated
classes as well), then you could fork off multiple class executions
with different seeds (you'd still have to modify the forked subprocess
to accept a class name and master seed; currently only one master seed
is passed at the start of each JVM).

Or you could do what I mentioned above -- separate the seed used for
class ordering from the one used for executing every class (every
iteration of a class).
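
A rough sketch of that two-seed idea, in plain Java. It is not based
on the JUnit4 Ant task's code: TwoSeedsSketch, order() and plan() are
hypothetical names, and the plan is just a list of strings showing
which class iteration would get which seed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Toy model: one seed orders the classes, another drives their execution. */
final class TwoSeedsSketch {
  /** Orders class names using only the ordering seed. */
  static List<String> order(List<String> classNames, long orderSeed) {
    List<String> copy = new ArrayList<>(classNames);
    Collections.sort(copy);  // canonical start, so the input order does not matter
    Collections.shuffle(copy, new Random(orderSeed));
    return copy;
  }

  /** Builds "Class#iteration seed=..." lines for every class iteration. */
  static List<String> plan(List<String> classNames, long orderSeed,
                           long executionSeed, int iterations) {
    Random seeds = new Random(executionSeed);
    List<String> plan = new ArrayList<>();
    for (String name : order(classNames, orderSeed)) {
      for (int i = 0; i < iterations; i++) {
        // Each iteration of each class gets its own execution seed.
        plan.add(name + "#" + i + " seed=" + Long.toHexString(seeds.nextLong()));
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    List<String> classes = Arrays.asList("TestB", "TestA", "TestC");
    plan(classes, 42L, 1L, 2).forEach(System.out::println);
  }
}
```

With this split, changing the execution seed leaves the class order
untouched, at the cost of having two values to report on failure.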

Or maybe it'd be easier to modify the runner itself (then duplication
would work from IDE level)... but then you'll hit the IDE issues and
constraints... I really don't know which way is better (or worse).
That's part of the challenge I guess. :)

I'll try to look at the code again in the next few days and will give
you some feedback. I can't log in to Jira for some reason, but I'll
look up the issue number for this; it's been there for a good while.

Dawid

Re: Test iterations

Posted by Dawid Weiss <da...@gmail.com>.

It is a longer story, Ryan. And *not* a trivial change to the runner. I
will reply tomorrow. I am at a pub right now. To you, cheers :)

On Aug 7, 2014 11:36 PM, "Ryan Ernst" <ry...@iernst.net> wrote:

> Only in the last month or so did I learn that -Dtests.iters doesn't
> really "work".  What I mean is in regards to randomization.  Each
> iteration currently is *exactly* the same as far as randomization
> (each iteration uses the same master seed).  And because of this, I
> understand that different people have their own "beasting" scripts
> that run the test essentially N times from a shell to force different
> seeds in each iteration.
>
> Why not create a different seed for each iteration when -Dtests.iters
> is used?  This way the test would still spit out a reproducible run
> line for a specific iteration, but each iteration would have good
> randomization (so trying to hit a rare bug could be done with
> -Dtests.iters).
>
> I'm curious if there is history here as to why test iters is done this
> way, or what peoples opinions are on moving towards the approach I
> suggested above.
>
> Thanks!
> Ryan