Posted to dev@lucene.apache.org by Benson Margulies <bi...@gmail.com> on 2014/02/17 13:18:49 UTC

Exposing random string generation

Down in the bottom of the randomized testing apparatus is some code
for generating random stress data. The only public/protected API for
it is to push it into an analysis chain. Would anyone object to a
patch to allow direct access to methods that just deliver the
randomized text? I'd like some random strings for code below the level
of the analysis components.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Exposing random string generation

Posted by Robert Muir <rc...@gmail.com>.
+1. It would be better if this logic were in _TestUtil as a separate method
with all the others.


On Mon, Feb 17, 2014 at 7:18 AM, Benson Margulies <bi...@gmail.com> wrote:

> Down in the bottom of the randomized testing apparatus is some code
> for generating random stress data. The only public/protected API for
> it is to push it into an analysis chain. Would anyone object to a
> patch to allow direct access to methods that just deliver the
> randomized text? I'd like some random strings for code below the level
> of the analysis components.

Re: Exposing random string generation

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
I copied a lot of these static utilities from _TestUtil to classes
such as this one (they're also available in Lucene tests):

https://github.com/carrotsearch/randomizedtesting/blob/master/randomized-runner/src/main/java/com/carrotsearch/randomizedtesting/generators/RandomStrings.java?source=c

Dawid
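
As an illustration of the kind of static helper discussed in this thread, here is a minimal sketch of a generator that takes a Random and length bounds, in the style of the randomRealisticUnicodeString(Random r, int minLength, int maxLength) signature quoted in this message. It is a stand-in written for this note, not the code in _TestUtil or RandomStrings: the class and method names are hypothetical, and it draws uniformly from all non-surrogate code points rather than from "realistic" Unicode blocks.

```java
import java.util.Random;

// Hypothetical stand-in; NOT the Lucene or randomizedtesting implementation.
class RandomTextSketch {
    /**
     * Returns a string of minLength..maxLength code points (both inclusive),
     * drawn uniformly from all Unicode code points except surrogates.
     */
    static String randomUnicodeString(Random r, int minLength, int maxLength) {
        int length = minLength + r.nextInt(maxLength - minLength + 1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            int cp;
            do {
                cp = r.nextInt(Character.MAX_CODE_POINT + 1);
            } while (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE);
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long seed = 42L;
        String a = randomUnicodeString(new Random(seed), 1, 20);
        String b = randomUnicodeString(new Random(seed), 1, 20);
        // The same seed yields the same text, so a failing input can be replayed.
        System.out.println(a.equals(b));
        System.out.println(a.codePointCount(0, a.length()));
    }
}
```

Taking the Random as a parameter, rather than creating one inside the method, is what lets a caller below the analysis layer reproduce a failure from a known seed.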


On Mon, Feb 17, 2014 at 1:29 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi Benson,
>
> See
> http://lucene.apache.org/core/4_6_0/test-framework/org/apache/lucene/util/_TestUtil.html
>
> There are methods like:
> randomRealisticUnicodeString(Random r, int minLength, int maxLength)
>
> or more fancy:
> randomRegexpishString(Random r, int maxLength)
>
> Those methods are all static, so you can use them from anywhere. The name _TestUtil goes back to older days, but the class is part of test-framework.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de


Re: Exposing random string generation

Posted by Robert Muir <rc...@gmail.com>.
Missed? I don't see a release branch :)


On Mon, Feb 17, 2014 at 8:01 AM, Benson Margulies <bi...@gmail.com> wrote:

> Yes, that's for sure. I've set up a JIRA and PR for the simple
> function move; let's do that and backmerge before making the big
> noise.
> Pity I missed 4.7 with this.
>
> On Mon, Feb 17, 2014 at 7:52 AM, Robert Muir <rc...@gmail.com> wrote:
> > Maybe good to separate the two items. Renaming that class should be very
> > noisy!
> >
> >
> > On Mon, Feb 17, 2014 at 7:48 AM, Benson Margulies <bimargulies@gmail.com> wrote:
> >>
> >> Right, that's my target.
> >>
> >> Might I rename _TestUtil for 5.0 :-?
> >>
> >> On Mon, Feb 17, 2014 at 7:44 AM, Robert Muir <rc...@gmail.com> wrote:
> >> > There are, but the stuff inside BaseTokenStreamTestCase is "better" at
> >> > finding bugs than any of those methods. It should be moved there too.

Re: Exposing random string generation

Posted by Benson Margulies <bi...@gmail.com>.
Yes, that's for sure. I've set up a JIRA and PR for the simple
function move; let's do that and backmerge before making the big
noise.
Pity I missed 4.7 with this.

On Mon, Feb 17, 2014 at 7:52 AM, Robert Muir <rc...@gmail.com> wrote:
> Maybe good to separate the two items. Renaming that class should be very
> noisy!


Re: Exposing random string generation

Posted by Robert Muir <rc...@gmail.com>.
Maybe good to separate the two items. Renaming that class should be very
noisy!


On Mon, Feb 17, 2014 at 7:48 AM, Benson Margulies <bi...@gmail.com> wrote:

> Right, that's my target.
>
> Might I rename _TestUtil for 5.0 :-?

RE: Exposing random string generation

Posted by Benson Margulies <bi...@gmail.com>.
OK. Yes, refactor is our friend here.

On February 17, 2014 8:25:25 AM EST, Uwe Schindler <uw...@thetaphi.de> wrote:
>Hi Benson,
>
>Open an issue to change _TestUtil and _TestHelper's name! The crazy
>name is no longer needed. This is mostly an Eclipse->Refactor->Rename
>task :-)
>I would be happy to change it. And maybe use static imports in the
>future.
>
>Uwe

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

RE: Exposing random string generation

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Benson,

Open an issue to change the names of _TestUtil and _TestHelper! The crazy names are no longer needed. This is mostly an Eclipse->Refactor->Rename task :-)
I would be happy to change it. And maybe use static imports in the future.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Benson Margulies [mailto:bimargulies@gmail.com]
> Sent: Monday, February 17, 2014 2:22 PM
> To: dev@lucene.apache.org
> Subject: Re: Exposing random string generation
> 
> I'm not on a campaign of noisiness. If people prefer to leave _ alone, alone it
> is.


Re: Exposing random string generation

Posted by Benson Margulies <bi...@gmail.com>.
I'm not on a campaign of noisiness. If people prefer to leave _ alone,
alone it is.

On Mon, Feb 17, 2014 at 8:20 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi,
>
> The crazy name for this class comes from the fact that in earlier days this class was part of the main test cases. So the class name had to avoid matching the filename pattern "**/Test*.java **/*Test.java"; otherwise JUnit would have run it as a test case. In Lucene 4 we have a separate module for the "test-framework", and we never run tests inside the "test-framework" module, so there is no issue with file names. Everything in test-framework is just "utility" classes to be extended by tests outside of the module. The classes in "test-framework/src/java" are never run as tests, so file names no longer matter. Renaming the class is just a verbose change. Ideally you would refactor the import statements to something like "import static ..._TestUtil.*" and use its methods as simple static external methods in affected tests.
>
> Uwe


RE: Exposing random string generation

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

The crazy name for this class comes from the fact that in earlier days this class was part of the main test cases. So the class name had to avoid matching the filename pattern "**/Test*.java **/*Test.java"; otherwise JUnit would have run it as a test case. In Lucene 4 we have a separate module for the "test-framework", and we never run tests inside the "test-framework" module, so there is no issue with file names. Everything in test-framework is just "utility" classes to be extended by tests outside of the module. The classes in "test-framework/src/java" are never run as tests, so file names no longer matter. Renaming the class is just a verbose change. Ideally you would refactor the import statements to something like "import static ..._TestUtil.*" and use its methods as simple static external methods in affected tests.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
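
To make the naming point above concrete, the following sketch checks a few file names against the two patterns from the message, "**/Test*.java" and "**/*Test.java". It is only an approximation: it looks at the simple file name and ignores the directory part, and the class and method names are hypothetical, not part of Lucene or of any build tool.

```java
// Hypothetical helper; approximates the two file name patterns by simple name only.
class TestNamePatternSketch {
    /** True if the simple file name would match Test*.java or *Test.java. */
    static boolean looksLikeTest(String fileName) {
        boolean prefixMatch = fileName.startsWith("Test") && fileName.endsWith(".java");
        boolean suffixMatch = fileName.endsWith("Test.java");
        return prefixMatch || suffixMatch;
    }

    public static void main(String[] args) {
        String[] names = {"TestUtil.java", "UtilTest.java", "_TestUtil.java", "_TestHelper.java"};
        for (String name : names) {
            // The leading underscore keeps the utility classes out of both patterns.
            System.out.println(name + " -> " + looksLikeTest(name));
        }
    }
}
```

Under this approximation a class named TestUtil would have been picked up as a test case, while _TestUtil and _TestHelper are skipped, which is the behaviour the underscore prefix was chosen for.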


> -----Original Message-----
> From: Benson Margulies [mailto:bimargulies@gmail.com]
> Sent: Monday, February 17, 2014 1:49 PM
> To: dev@lucene.apache.org
> Subject: Re: Exposing random string generation
> 
> Right, that's my target.
> 
> Might I rename _TestUtil for 5.0 :-?


Re: Exposing random string generation

Posted by Shai Erera <se...@gmail.com>.
I think you can rename _TestUtil for 4.x too - we're not obligated to any
back-compat in test-framework.

Shai


On Mon, Feb 17, 2014 at 2:48 PM, Benson Margulies <bi...@gmail.com> wrote:

> Right, that's my target.
>
> Might I rename _TestUtil for 5.0 :-?
>
> On Mon, Feb 17, 2014 at 7:44 AM, Robert Muir <rc...@gmail.com> wrote:
> > There are, but the stuff inside BaseTokenStreamTestCase is "better" at
> > finding bugs than any of those methods. it should be moved there too.
> >
> >
> > On Mon, Feb 17, 2014 at 7:29 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> >>
> >> Hi Benson,
> >>
> >> See
> >>
> >>
> http://lucene.apache.org/core/4_6_0/test-framework/org/apache/lucene/util/_TestUtil.html
> >>
> >> There are methods like:
> >> randomRealisticUnicodeString(Random r, int minLength, int maxLength)
> >>
> >> or more fancy:
> >> randomRegexpishString(Random r, int maxLength)
> >>
> >> Those methods are all static, so you can use from anywhere. The name
> >> _TestUtil goes back to older days. But it is part of test-framework.
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: uwe@thetaphi.de
> >>
> >> > -----Original Message-----
> >> > From: Benson Margulies [mailto:bimargulies@gmail.com]
> >> > Sent: Monday, February 17, 2014 1:19 PM
> >> > To: dev@lucene.apache.org
> >> > Subject: Exposing random string generation
> >> >
> >> > Down in the bottom of the randomized testing apparatus is some code for
> >> > generating random stress data. The only public/protected API for it is
> >> > to push it into an analysis chain. Would anyone object to a patch to
> >> > allow direct access to methods that just deliver the randomized text?
> >> > I'd like some random strings for code below the level of the analysis
> >> > components.

RE: Exposing random string generation

Posted by Uwe Schindler <uw...@thetaphi.de>.
The same applies to _TestHelper.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Monday, February 17, 2014 2:20 PM
> To: 'dev@lucene.apache.org'
> Subject: RE: Exposing random string generation
> 
> Hi,
> 
> The crazy name for this class comes from the fact that in earlier days this
> class was part of the main test cases. So the class name had to avoid
> matching the filename pattern "**/Test*.java **/*Test.java", otherwise JUnit
> would have run it as a test case. In Lucene 4 we have a separate module for
> the "test-framework", and we never run tests inside the "test-framework"
> module, so there is no issue with file names. Everything in test-framework
> is just "utility" classes to be extended by tests outside of the module. The
> classes in "test-framework/src/java" are never run as tests, so file names
> don't matter anymore. It is just verbose to rename the class. Ideally you
> would refactor the import statements to something like
> "import static ..._TestUtil.*" and use them as simple static external
> methods in affected tests.
> 
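The naming point above can be shown with a small sketch. It is a simplified stand-in for the include patterns "**/Test*.java" and "**/*Test.java", not the actual build or JUnit matching logic: the class TestNamePatternSketch, the method wouldRunAsTest, and the example file names TestFoo.java and FooTest.java are made up for illustration, and only the simple file name is checked.

```java
public class TestNamePatternSketch {
    // Hypothetical helper: approximates "**/Test*.java **/*Test.java" by
    // looking at the simple file name only (directories are ignored).
    public static boolean wouldRunAsTest(String fileName) {
        return fileName.endsWith(".java")
            && (fileName.startsWith("Test") || fileName.endsWith("Test.java"));
    }

    public static void main(String[] args) {
        String[] names = {"TestFoo.java", "FooTest.java", "_TestUtil.java"};
        for (String name : names) {
            // The leading underscore keeps _TestUtil.java out of both patterns.
            System.out.println(name + " -> " + wouldRunAsTest(name));
        }
    }
}
```

Under this approximation the first two names match and _TestUtil.java does not, which is why the underscore prefix kept the class from being picked up as a test.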
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
> > -----Original Message-----
> > From: Benson Margulies [mailto:bimargulies@gmail.com]
> > Sent: Monday, February 17, 2014 1:49 PM
> > To: dev@lucene.apache.org
> > Subject: Re: Exposing random string generation
> >
> > Right, that's my target.
> >
> > Might I rename _TestUtil for 5.0 :-?
> >
> > On Mon, Feb 17, 2014 at 7:44 AM, Robert Muir <rc...@gmail.com> wrote:
> > > There are, but the stuff inside BaseTokenStreamTestCase is "better"
> > > at finding bugs than any of those methods. it should be moved there too.
> > >
> > >
> > > On Mon, Feb 17, 2014 at 7:29 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> > >>
> > >> Hi Benson,
> > >>
> > >> See
> > >>
> > >> http://lucene.apache.org/core/4_6_0/test-framework/org/apache/lucene/util/_TestUtil.html
> > >>
> > >> There are methods like:
> > >> randomRealisticUnicodeString(Random r, int minLength, int maxLength)
> > >>
> > >> or more fancy:
> > >> randomRegexpishString(Random r, int maxLength)
> > >>
> > >> Those methods are all static, so you can use from anywhere. The
> > >> name _TestUtil goes back to older days. But it is part of test-framework.
> > >>
> > >> -----
> > >> Uwe Schindler
> > >> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >> eMail: uwe@thetaphi.de
> > >>
> > >> > -----Original Message-----
> > >> > From: Benson Margulies [mailto:bimargulies@gmail.com]
> > >> > Sent: Monday, February 17, 2014 1:19 PM
> > >> > To: dev@lucene.apache.org
> > >> > Subject: Exposing random string generation
> > >> >
> > >> > Down in the bottom of the randomized testing apparatus is some
> > >> > code for generating random stress data. The only public/protected
> > >> > API for it is to push it into an analysis chain. Would anyone
> > >> > object to a patch to allow direct access to methods that just
> > >> > deliver the randomized text? I'd like some random strings for
> > >> > code below the level of the analysis components.


Re: Exposing random string generation

Posted by Benson Margulies <bi...@gmail.com>.
Right, that's my target.

Might I rename _TestUtil for 5.0 :-?

On Mon, Feb 17, 2014 at 7:44 AM, Robert Muir <rc...@gmail.com> wrote:
> There are, but the stuff inside BaseTokenStreamTestCase is "better" at
> finding bugs than any of those methods. it should be moved there too.
>
>
> On Mon, Feb 17, 2014 at 7:29 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>>
>> Hi Benson,
>>
>> See
>>
>> http://lucene.apache.org/core/4_6_0/test-framework/org/apache/lucene/util/_TestUtil.html
>>
>> There are methods like:
>> randomRealisticUnicodeString(Random r, int minLength, int maxLength)
>>
>> or more fancy:
>> randomRegexpishString(Random r, int maxLength)
>>
>> Those methods are all static, so you can use from anywhere. The name
>> _TestUtil goes back to older days. But it is part of test-framework.
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>>
>> > -----Original Message-----
>> > From: Benson Margulies [mailto:bimargulies@gmail.com]
>> > Sent: Monday, February 17, 2014 1:19 PM
>> > To: dev@lucene.apache.org
>> > Subject: Exposing random string generation
>> >
>> > Down in the bottom of the randomized testing apparatus is some code for
>> > generating random stress data. The only public/protected API for it is
>> > to push it into an analysis chain. Would anyone object to a patch to
>> > allow direct access to methods that just deliver the randomized text?
>> > I'd like some random strings for code below the level of the analysis
>> > components.


Re: Exposing random string generation

Posted by Robert Muir <rc...@gmail.com>.
There are, but the stuff inside BaseTokenStreamTestCase is "better" at
finding bugs than any of those methods. It should be moved there too.


On Mon, Feb 17, 2014 at 7:29 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi Benson,
>
> See
>
> http://lucene.apache.org/core/4_6_0/test-framework/org/apache/lucene/util/_TestUtil.html
>
> There are methods like:
> randomRealisticUnicodeString(Random r, int minLength, int maxLength)
>
> or more fancy:
> randomRegexpishString(Random r, int maxLength)
>
> Those methods are all static, so you can use from anywhere. The name
> _TestUtil goes back to older days. But it is part of test-framework.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
> > -----Original Message-----
> > From: Benson Margulies [mailto:bimargulies@gmail.com]
> > Sent: Monday, February 17, 2014 1:19 PM
> > To: dev@lucene.apache.org
> > Subject: Exposing random string generation
> >
> > Down in the bottom of the randomized testing apparatus is some code for
> > generating random stress data. The only public/protected API for it is
> > to push it into an analysis chain. Would anyone object to a patch to allow
> > direct access to methods that just deliver the randomized text? I'd like
> > some random strings for code below the level of the analysis components.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> > commands, e-mail: dev-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

Re: Exposing random string generation

Posted by Benson Margulies <bi...@gmail.com>.
I'll have a look and see if there is code to move from the test case
to the _TestUtil.


On Mon, Feb 17, 2014 at 7:29 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi Benson,
>
> See
> http://lucene.apache.org/core/4_6_0/test-framework/org/apache/lucene/util/_TestUtil.html
>
> There are methods like:
> randomRealisticUnicodeString(Random r, int minLength, int maxLength)
>
> or more fancy:
> randomRegexpishString(Random r, int maxLength)
>
> Those methods are all static, so you can use from anywhere. The name _TestUtil goes back to older days. But it is part of test-framework.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>> -----Original Message-----
>> From: Benson Margulies [mailto:bimargulies@gmail.com]
>> Sent: Monday, February 17, 2014 1:19 PM
>> To: dev@lucene.apache.org
>> Subject: Exposing random string generation
>>
>> Down in the bottom of the randomized testing apparatus is some code for
>> generating random stress data. The only public/protected API for it is to push
>> it into an analysis chain. Would anyone object to a patch to allow direct access
>> to methods that just deliver the randomized text? I'd like some random
>> strings for code below the level of the analysis components.


RE: Exposing random string generation

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Benson,

See
http://lucene.apache.org/core/4_6_0/test-framework/org/apache/lucene/util/_TestUtil.html

There are methods like:
randomRealisticUnicodeString(Random r, int minLength, int maxLength)

or more fancy:
randomRegexpishString(Random r, int maxLength)

Those methods are all static, so you can use them from anywhere. The name _TestUtil goes back to older days, but it is part of test-framework.
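
For readers who have not used these helpers, the call pattern is roughly the one sketched below. This is a stand-in written against plain java.util.Random, not the _TestUtil code: the class RandomTextSketch and its method randomUnicodeString are hypothetical names, and the real randomRealisticUnicodeString chooses its characters differently. The sketch only mirrors the (Random, minLength, maxLength) shape of the signature quoted above.

```java
import java.util.Random;

public class RandomTextSketch {
    // Hypothetical helper, NOT Lucene's implementation: returns a string of
    // minLength..maxLength (inclusive) random non-surrogate code points.
    public static String randomUnicodeString(Random r, int minLength, int maxLength) {
        int length = minLength + r.nextInt(maxLength - minLength + 1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            int cp;
            do {
                cp = r.nextInt(Character.MAX_CODE_POINT + 1);
                // Skip lone surrogates so the result is always well-formed UTF-16.
            } while (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE);
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A fixed seed makes the "random" input reproducible across runs.
        Random r = new Random(42L);
        String s = randomUnicodeString(r, 5, 20);
        System.out.println(s.codePointCount(0, s.length()) + " code points");
    }
}
```

Because the method is static and takes the Random as a parameter, code below the analysis layer can call it directly and replay a failure by reusing the same seed.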

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Benson Margulies [mailto:bimargulies@gmail.com]
> Sent: Monday, February 17, 2014 1:19 PM
> To: dev@lucene.apache.org
> Subject: Exposing random string generation
> 
> Down in the bottom of the randomized testing apparatus is some code for
> generating random stress data. The only public/protected API for it is to push
> it into an analysis chain. Would anyone object to a patch to allow direct access
> to methods that just deliver the randomized text? I'd like some random
> strings for code below the level of the analysis components.


Re: Exposing random string generation

Posted by Christian Moen <cm...@atilika.com>.
Sounds useful. +1.

Christian

On Feb 17, 2014, at 9:18 PM, Benson Margulies <bi...@gmail.com> wrote:

> Down in the bottom of the randomized testing apparatus is some code
> for generating random stress data. The only public/protected API for
> it is to push it into an analysis chain. Would anyone object to a
> patch to allow direct access to methods that just deliver the
> randomized text? I'd like some random strings for code below the level
> of the analysis components.