Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2010/07/26 08:57:15 UTC

TestUTF32ToUTF8.testRandomRegexes fails

Hi

I was running tests on trunk (after merging the changes from LUCENE-2537)
and received this error message:

expected:<true> but was:<false>

junit.framework.AssertionFailedError: expected:<true> but was:<false>
    at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
    at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
    at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)

NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866

I'm sure it's related to my changes. Has anyone else seen this before?

Shai
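
For readers following along: the seed in the NOTE line matters because
java.util.Random is deterministic for a given seed, so plugging the same seed
back in should replay the same sequence of random choices. Below is a minimal
sketch of that idea. SeedReplay and randomString are hypothetical names used
for illustration only; they are not part of Lucene's test framework.

```java
import java.util.Random;

final class SeedReplay {
    /** Builds a pseudo-random lowercase ASCII string of the given length from a seed. */
    static String randomString(long seed, int length) {
        // A fresh Random with the same seed always yields the same sequence.
        Random random = new Random(seed);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(26)));
        }
        return sb.toString();
    }
}
```

Calling randomString twice with the seed reported above should give the same
string both times, which is what makes a reported seed useful for reproducing
a failure.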

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Tue, Jul 27, 2010 at 7:47 PM, Robert Muir <rc...@gmail.com> wrote:
>
>
> On Tue, Jul 27, 2010 at 7:37 PM, Mark Miller <ma...@gmail.com> wrote:
>>
>> On 7/26/10 8:26 AM, Michael McCandless wrote:
>> > On a more general note...
>> >
>> > Any time any of you out there hit an "odd" test failure, please please
>> > please do just what Shai did: take it to the dev list!
>> >
>> > Think of Lucene's unit tests like SETI :)  We are desperately seeking
>> > bugs, and you and your machine may just be lucky enough to find one...
>> > go forth and buy expensive new power hungry computers just so you can
>> > run the random tests over and over, seeking the bugs!
>> >
>> > But be sure to include that random seed when you do hit a failure...
>> >
>> > Mike
>> >
>>
>> Okay - I'll hold off on buying the new power hungry computer, but I have
>> added lucene/solr tests to a hudson install I have on my linux machine.
>> It doesn't do much usually, so it can run tests on a 5-12 minute
>> interval depending on what other tests I'm running.
>>
>> Hopefully Robert is paying one beer at Lucene Revolution per random
>> find...
>
> yes, when we kidnap mikemccand we should also raid his kegorator...

Oh man I may need a bigger kegerator... mine only holds a 1/6th keg!!
But then I do have this on tap now:

    http://www.southerntierbrewing.com/for%20download%20page/downloads_unearthly.html

which is like 2X the alcohol of normal beer.  So it sort of cancels out....

Don't you all go finding bugs at once now!

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
On Tue, Jul 27, 2010 at 7:37 PM, Mark Miller <ma...@gmail.com> wrote:

> On 7/26/10 8:26 AM, Michael McCandless wrote:
> > On a more general note...
> >
> > Any time any of you out there hit an "odd" test failure, please please
> > please do just what Shai did: take it to the dev list!
> >
> > Think of Lucene's unit tests like SETI :)  We are desperately seeking
> > bugs, and you and your machine may just be lucky enough to find one...
> > go forth and buy expensive new power hungry computers just so you can
> > run the random tests over and over, seeking the bugs!
> >
> > But be sure to include that random seed when you do hit a failure...
> >
> > Mike
> >
>
> Okay - I'll hold off on buying the new power hungry computer, but I have
> added lucene/solr tests to a hudson install I have on my linux machine.
> It doesn't do much usually, so it can run tests on a 5-12 minute
> interval depending on what other tests I'm running.
>
> Hopefully Robert is paying one beer at Lucene Revolution per random find...
>

Yes, when we kidnap mikemccand we should also raid his kegerator...


>
> Mark
> - lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Mark Miller <ma...@gmail.com>.
On 7/26/10 8:26 AM, Michael McCandless wrote:
> On a more general note...
> 
> Any time any of you out there hit an "odd" test failure, please please
> please do just what Shai did: take it to the dev list!
> 
> Think of Lucene's unit tests like SETI :)  We are desperately seeking
> bugs, and you and your machine may just be lucky enough to find one...
> go forth and buy expensive new power hungry computers just so you can
> run the random tests over and over, seeking the bugs!
> 
> But be sure to include that random seed when you do hit a failure...
> 
> Mike
> 

Okay - I'll hold off on buying the new power hungry computer, but I have
added lucene/solr tests to a hudson install I have on my linux machine.
It doesn't do much usually, so it can run tests on a 5-12 minute
interval depending on what other tests I'm running.

Hopefully Robert is paying one beer at Lucene Revolution per random find...

Mark
- lucidimagination.com


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Mon, Jul 26, 2010 at 10:57 AM, Shai Erera <se...@gmail.com> wrote:
> Sorry for the delayed response.
>
> I ran it a couple more times, from Eclipse and Ant, and each time it fails
> (amazing !), w/ different seeds. More seeds that fail:
> NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
> NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
> NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147
>
> I use IBM JVM, tried w/ both 1.5 and 1.6 ...

Jeez this sounds nasty....

> Mike, can we use LUCENE-2565 to track this, or would you prefer that I open
> a separate one?

Can you open a new one?  That issue is just about the test running
forever, trying to find a good random character :)

Mike


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
I don't think it is a bug... we sorta took the route that if you send
invalid unicode at anything in lucene the results are "undefined" :)

Because really, the results of converting this invalid unicode are even
undefined to the JVM itself, as we see IBM and Sun have different
behavior...
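
To illustrate the point about invalid Unicode, here is a minimal sketch. It
assumes Java 7+ (for StandardCharsets); SurrogateCheck, hasUnpairedSurrogate
and encodeStrict are hypothetical names, not Lucene APIs. It checks for an
unpaired surrogate up front, and encodes with CodingErrorAction.REPORT so that
malformed input raises an exception rather than being silently replaced or
dropped, which is the point where the JVMs in this thread disagreed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class SurrogateCheck {
    /** Returns true if s contains a high or low surrogate without its partner. */
    static boolean hasUnpairedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // valid pair: skip the low surrogate as well
                } else {
                    return true; // high surrogate with no low surrogate after it
                }
            } else if (Character.isLowSurrogate(c)) {
                return true; // low surrogate with no high surrogate before it
            }
        }
        return false;
    }

    /** Encodes to UTF-8, throwing on malformed input instead of substituting. */
    static byte[] encodeStrict(String s) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buffer = encoder.encode(CharBuffer.wrap(s));
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return out;
    }
}
```

With the string from this thread, "\u006C\uD9FF\u0045", hasUnpairedSurrogate
should return true and encodeStrict should throw, so a test could reject such
input explicitly instead of depending on what String.getBytes happens to do on
a given JVM.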

On Tue, Jul 27, 2010 at 4:53 AM, Shai Erera <se...@gmail.com> wrote:

> As reported on the issue, the patch solves the problem.
>
> However, I was wondering whether that doesn't expose a bug in
> CharacterRunAutomaton -- it handles characters that the JVM ignores when
> dealing w/ the string (at least when converting them to bytes). Is that ok?
> Shouldn't we check somewhere that that character should be handled at all?
>
> Shai
>
>
> On Tue, Jul 27, 2010 at 12:41 AM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> Shai can you try the patch on LUCENE-2568?  Thanks.
>>
>> Mike
>>
>> On Mon, Jul 26, 2010 at 4:25 PM, Michael McCandless
>> <lu...@mikemccandless.com> wrote:
>> > OK I think likely this is a bug in RAS.  And we are just seeing the
>> > difference in how Oracle's & IBM's JREs handle an unpaired
>> > surrogate...
>> >
>> > Lemme work out a patch...
>> >
>> > Mike
>> >
>> > On Mon, Jul 26, 2010 at 4:13 PM, Michael McCandless
>> > <lu...@mikemccandless.com> wrote:
>> >> Yeah that char is a high surrogate which is unpaired, which is no good
>> >> -- it's invalid.  Cool, though, that Google puts us first when you
>> >> search on this character :)
>> >>
>> >> Can you figure out how that bad string was created?  That "if
>> >> (random.nextBoolean())" either creates the string randomly (which
>> >> should never return unpaired surrogate), or, calls
>> >> RandomAcceptedString.getRandomAcceptedString... maybe the bug is in
>> >> RAS.
>> >>
>> >> Mike
>> >>
>> >> On Mon, Jul 26, 2010 at 3:41 PM, Shai Erera <se...@gmail.com> wrote:
>> >>> From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm
>> >>>
>> >>> Looks like that character is not a valid Unicode character, and perhaps the
>> >>> IBM's JVM behaves correctly? Robert - you're the Unicode expert :).
>> >>>
>> >>> Shai
>> >>>
>> >>> On Mon, Jul 26, 2010 at 10:40 PM, Shai Erera <se...@gmail.com> wrote:
>> >>>>
>> >>>> I don't know what was the thing w/ the strings generated before, but now I
>> >>>> ran the test again w/ the same seed and it generates the same strings. So at
>> >>>> least it seems there are no problems w/ the Random class :).
>> >>>>
>> >>>> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
>> >>>> ideas why? What does the test check anyway?
>> >>>>
>> >>>> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
>> >>>> failure comes from
>> >>>>
>> >>>> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>> >>>>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>> >>>>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>> >>>>
>> >>>> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
>> >>>> "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") are
>> >>>> [108, 69]. It just ignores the middle character. Perhaps that's why the test
>> >>>> fails?
>> >>>>
>> >>>> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>> >>>>
>> >>>> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
>> >>>> passes.
>> >>>>
>> >>>> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
>> >>>> result :). I'll dig some more into this character, and why the IBM and SUN
>> >>>> JVMs return different byte[] representation for the same sequence of
>> >>>> characters. If you already spot the problem, please let me know.
>> >>>>
>> >>>> BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
>> >>>> which goes and checks a system property. Perhaps we can extract it to a
>> >>>> variable, or include a static constant in LuceneTestCase(J4) or something?
>> >>>>
>> >>>> Shai
>> >>>>
>> >>>> On Mon, Jul 26, 2010 at 9:22 PM, Robert Muir <rc...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> maybe there is a bug in ibm's random generator :)
>> >>>>>
>> >>>>> On Mon, Jul 26, 2010 at 11:50 AM, Michael McCandless
>> >>>>> <lu...@mikemccandless.com> wrote:
>> >>>>>>
>> >>>>>> That's VERY spooky that w/ a fixed seed you see different random
>> >>>>>> regexps being made.
>> >>>>>>
>> >>>>>> Mike
>> >>>>>>
>> >>>>>> On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <se...@gmail.com>
>> wrote:
>> >>>>>> > Ok I've dug deeper into the test. I set the random seed to
>> >>>>>> > -9029631602016965389L in setUp(), and discovered that on the 4th
>> >>>>>> > iteration
>> >>>>>> > it breaks. For some reason though, AutomatonTestUtil.randomRegex
>> >>>>>> > generates
>> >>>>>> > different strings every time I run the test, even though it uses
>> the
>> >>>>>> > same
>> >>>>>> > Random object w/ the same seed ...
>> >>>>>> >
>> >>>>>> > Anyway, one of the regex that failed was this "l.E" (w/o the
>> quotes)
>> >>>>>> > and I
>> >>>>>> > think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope
>> this
>> >>>>>> > helps.
>> >>>>>> >
>> >>>>>> > Shai
>> >>>>>> >
>> >>>>>> > On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rc...@gmail.com>
>> wrote:
>> >>>>>> >>
>> >>>>>> >> sounds nasty... its good you are running the tests with this
>> >>>>>> >> different
>> >>>>>> >> jvm...
>> >>>>>> >>
>> >>>>>> >> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com>
>> >>>>>> >> wrote:
>> >>>>>> >>>
>> >>>>>> >>> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried
>> several
>> >>>>>> >>> times
>> >>>>>> >>> and it succeeds every time. However, when I revert back to
>> IBM's, it
>> >>>>>> >>> fail
>> >>>>>> >>> immediately.
>> >>>>>> >>>
>> >>>>>> >>> I can help w/ the debug, if you give me a hint where to look
>> :).
>> >>>>>> >>>
>> >>>>>> >>> Shai
>> >>>>>> >>>
>> >>>>>> >>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com>
>> >>>>>> >>> wrote:
>> >>>>>> >>>>
>> >>>>>> >>>> Sorry for the delayed response.
>> >>>>>> >>>>
>> >>>>>> >>>> I ran it a couple more times, from Eclipse and Ant, and each
>> time
>> >>>>>> >>>> it
>> >>>>>> >>>> fails (amazing !), w/ different seeds. More seeds that fail:
>> >>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>> >>>>>> >>>> -4244174191361080127
>> >>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>> >>>>>> >>>> -7059086272401721644
>> >>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>> >>>>>> >>>> -1314734215611104147
>> >>>>>> >>>>
>> >>>>>> >>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>> >>>>>> >>>>
>> >>>>>> >>>> Mike, can we use LUCENE-2565 to track this, or would you
>> prefer
>> >>>>>> >>>> that I
>> >>>>>> >>>> open a separate one?
>> >>>>>> >>>>
>> >>>>>> >>>> Shai
>> >>>>>> >>>>
>> >>>>>> >>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless
>> >>>>>> >>>> <lu...@mikemccandless.com> wrote:
>> >>>>>> >>>>>
>> >>>>>> >>>>> On a more general note...
>> >>>>>> >>>>>
>> >>>>>> >>>>> Any time any of you out there hit an "odd" test failure,
>> please
>> >>>>>> >>>>> please
>> >>>>>> >>>>> please do just what Shai did: take it to the dev list!
>> >>>>>> >>>>>
>> >>>>>> >>>>> Think of Lucene's unit tests like SETI :)  We are desperately
>> >>>>>> >>>>> seeking
>> >>>>>> >>>>> bugs, and you and your machine may just be lucky enough to
>> find
>> >>>>>> >>>>> one...
>> >>>>>> >>>>> go forth and buy expensive new power hungry computers just so
>> you
>> >>>>>> >>>>> can
>> >>>>>> >>>>> run the random tests over and over, seeking the bugs!
>> >>>>>> >>>>>
>> >>>>>> >>>>> But be sure to include that random seed when you do hit a
>> >>>>>> >>>>> failure...
>> >>>>>> >>>>>
>> >>>>>> >>>>> Mike
>> >>>>>> >>>>>
>> >>>>>> >>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <
>> rcmuir@gmail.com>
>> >>>>>> >>>>> wrote:
>> >>>>>> >>>>> > I agree, Shai can you open a bug? I cannot reproduce, did
>> you
>> >>>>>> >>>>> > use an
>> >>>>>> >>>>> > IBM JVM
>> >>>>>> >>>>> > or another environment that might help us figure it out?
>> >>>>>> >>>>> >
>> >>>>>> >>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>> >>>>>> >>>>> > <lu...@mikemccandless.com> wrote:
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >> Hmmm this means a bug is lurking.  This is the power of
>> random
>> >>>>>> >>>>> >> testing
>> >>>>>> >>>>> >> (that every time we all run tests, we're testing different
>> >>>>>> >>>>> >> "paths"
>> >>>>>> >>>>> >> through the code)....
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes
>> >>>>>> >>>>> >> would
>> >>>>>> >>>>> >> cause
>> >>>>>> >>>>> >> this!
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >> But, unfortunately, when I plug that seed in I don't see
>> it
>> >>>>>> >>>>> >> fail,
>> >>>>>> >>>>> >> which is odd.  I'll run a stress test to see if I can
>> tickle
>> >>>>>> >>>>> >> the
>> >>>>>> >>>>> >> bug... can you open a Jira issue so we don't lose track?
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >> Mike
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <
>> serera@gmail.com>
>> >>>>>> >>>>> >> wrote:
>> >>>>>> >>>>> >> > Hi
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> > I was running tests on trunk (after merging the changes
>> from
>> >>>>>> >>>>> >> > LUCENE-2537)
>> >>>>>> >>>>> >> > and received this error message:
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> > expected:<true> but was:<false>
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> > junit.framework.AssertionFailedError: expected: but was:
>> >>>>>> >>>>> >> > at
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>> >>>>>> >>>>> >> > at
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>> >>>>>> >>>>> >> > at
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> >
>> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>> >>>>>> >>>>> >> > 3510820306304573866
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> > I'm sure it's related to my changes. Has anyone else
>> seen
>> >>>>>> >>>>> >> > this
>> >>>>>> >>>>> >> > before?
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >> > Shai
>> >>>>>> >>>>> >> >
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >>
>> ---------------------------------------------------------------------
>> >>>>>> >>>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >>>>>> >>>>> >> For additional commands, e-mail:
>> dev-help@lucene.apache.org
>> >>>>>> >>>>> >>
>> >>>>>> >>>>> >
>> >>>>>> >>>>> >
>> >>>>>> >>>>> >
>> >>>>>> >>>>> > --
>> >>>>>> >>>>> > Robert Muir
>> >>>>>> >>>>> > rcmuir@gmail.com
>> >>>>>> >>>>> >
>> >>>>>> >>>>>
>> >>>>>> >>>>>
>> >>>>>> >>>>>
>> ---------------------------------------------------------------------
>> >>>>>> >>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >>>>>> >>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>>>>> >>>>>
>> >>>>>> >>>>
>> >>>>>> >>>
>> >>>>>> >>
>> >>>>>> >>
>> >>>>>> >>
>> >>>>>> >> --
>> >>>>>> >> Robert Muir
>> >>>>>> >> rcmuir@gmail.com
>> >>>>>> >
>> >>>>>> >
>> >>>>>>
>> >>>>>>
>> ---------------------------------------------------------------------
>> >>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Robert Muir
>> >>>>> rcmuir@gmail.com
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Shai Erera <se...@gmail.com>.
As reported on the issue, the patch solves the problem.

However, I wonder whether this exposes a bug in CharacterRunAutomaton: it
handles characters that the JVM ignores when dealing w/ the string (at least
when converting it to bytes). Is that OK? Shouldn't we check somewhere whether
such a character should be handled at all?

Shai

On Tue, Jul 27, 2010 at 12:41 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Shai can you try the patch on LUCENE-2568?  Thanks.
>
> Mike

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
Shai can you try the patch on LUCENE-2568?  Thanks.

Mike

On Mon, Jul 26, 2010 at 4:25 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> OK I think likely this is a bug in RAS.  And we are just seeing the
> difference in how Oracle's & IBM's JREs handle an unpaired
> surrogate...
>
> Lemme work out a patch...
>
> Mike
>
> On Mon, Jul 26, 2010 at 4:13 PM, Michael McCandless
> <lu...@mikemccandless.com> wrote:
>> Yeah that char is a high surrogate which is unpaired, which is no good
>> -- it's invalid.  Cool, though, that Google puts us first when you
>> search on this character :)
>>
>> Can you figure out how that bad string was created?  That "if
>> (random.nextBoolean())" either creates the string randomly (which
>> should never return unpaired surrogate), or, calls
>> RandomAcceptedString.getRandomAcceptedString... maybe the bug is in
>> RAS.
>>
>> Mike
>>
>> On Mon, Jul 26, 2010 at 3:41 PM, Shai Erera <se...@gmail.com> wrote:
>>> From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm
>>>
>>> Looks like that character is not a valid Unicode character, and perhaps the
>>> IBM's JVM behaves correctly? Robert - you're the Unicode expert :).
>>>
>>> Shai
>>>
>>> On Mon, Jul 26, 2010 at 10:40 PM, Shai Erera <se...@gmail.com> wrote:
>>>>
>>>> I don't know what was the thing w/ the strings generated before, but now I
>>>> ran the test again w/ the same seed and it generates the same strings. So at
>>>> least it seems there are no problems w/ the Random class :).
>>>>
>>>> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
>>>> ideas why? What does the test check anyway?
>>>>
>>>> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
>>>> failure comes from
>>>>
>>>> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>>>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>>>>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>>>>
>>>> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
>>>> "\u006C\uD9FF\u0045". The bytes returned from string.getBytes("UTF-8") are
>>>> [108, 69]. It just ignores the middle character. Perhaps that's why the test
>>>> fails?
>>>>
>>>> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>>>>
>>>> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
>>>> passes.
>>>>
>>>> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
>>>> result :). I'll dig some more into this character, and why the IBM and SUN
>>>> JVMs return different byte[] representation for the same sequence of
>>>> characters. If you already spot the problem, please let me know.
>>>>
>>>> BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
>>>> which goes and checks a system property. Perhaps we can extract it to a
>>>> variable, or include a static constant in LuceneTestCase(J4) or something?
>>>>
>>>> Shai
>>>>
>>>> On Mon, Jul 26, 2010 at 9:22 PM, Robert Muir <rc...@gmail.com> wrote:
>>>>>
>>>>> maybe there is a bug in ibm's random generator :)
>>>>>
>>>>> On Mon, Jul 26, 2010 at 11:50 AM, Michael McCandless
>>>>> <lu...@mikemccandless.com> wrote:
>>>>>>
>>>>>> That's VERY spooky that w/ a fixed seed you see different random
>>>>>> regexps being made.
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <se...@gmail.com> wrote:
>>>>>> > Ok I've dug deeper into the test. I set the random seed to
>>>>>> > -9029631602016965389L in setUp(), and discovered that on the 4th
>>>>>> > iteration it breaks. For some reason though,
>>>>>> > AutomatonTestUtil.randomRegex generates different strings every time
>>>>>> > I run the test, even though it uses the same Random object w/ the
>>>>>> > same seed ...
>>>>>> >
>>>>>> > Anyway, one of the regexes that failed was "l.E" (w/o the quotes); I
>>>>>> > think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this
>>>>>> > helps.
>>>>>> >
>>>>>> > Shai
>>>>>> >
>>>>>> > On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rc...@gmail.com> wrote:
>>>>>> >>
>>>>>> >> sounds nasty... it's good you are running the tests with this
>>>>>> >> different JVM...
>>>>>> >>
>>>>>> >> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> Tried to run it w/ SUN JRE6 and it succeeds! I've tried several
>>>>>> >>> times and it succeeds every time. However, when I revert back to
>>>>>> >>> IBM's, it fails immediately.
>>>>>> >>>
>>>>>> >>> I can help w/ the debug, if you give me a hint where to look :).
>>>>>> >>>
>>>>>> >>> Shai
>>>>>> >>>
>>>>>> >>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>>
>>>>>> >>>> Sorry for the delayed response.
>>>>>> >>>>
>>>>>> >>>> I ran it a couple more times, from Eclipse and Ant, and each time
>>>>>> >>>> it fails (amazing!), w/ different seeds. More seeds that fail:
>>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
>>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
>>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147
>>>>>> >>>>
>>>>>> >>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>>>>> >>>>
>>>>>> >>>> Mike, can we use LUCENE-2565 to track this, or would you prefer
>>>>>> >>>> that I open a separate one?
>>>>>> >>>>
>>>>>> >>>> Shai
>>>>>> >>>>
>>>>>> >>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless
>>>>>> >>>> <lu...@mikemccandless.com> wrote:
>>>>>> >>>>>
>>>>>> >>>>> On a more general note...
>>>>>> >>>>>
>>>>>> >>>>> Any time any of you out there hit an "odd" test failure, please
>>>>>> >>>>> please please do just what Shai did: take it to the dev list!
>>>>>> >>>>>
>>>>>> >>>>> Think of Lucene's unit tests like SETI :)  We are desperately
>>>>>> >>>>> seeking bugs, and you and your machine may just be lucky enough
>>>>>> >>>>> to find one... go forth and buy expensive new power hungry
>>>>>> >>>>> computers just so you can run the random tests over and over,
>>>>>> >>>>> seeking the bugs!
>>>>>> >>>>>
>>>>>> >>>>> But be sure to include that random seed when you do hit a
>>>>>> >>>>> failure...
>>>>>> >>>>>
>>>>>> >>>>> Mike
>>>>>> >>>>>
>>>>>> >>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com>
>>>>>> >>>>> wrote:
>>>>>> >>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you
>>>>>> >>>>> > use an IBM JVM or another environment that might help us
>>>>>> >>>>> > figure it out?
>>>>>> >>>>> >
>>>>>> >>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>>>>> >>>>> > <lu...@mikemccandless.com> wrote:
>>>>>> >>>>> >>
>>>>>> >>>>> >> Hmmm this means a bug is lurking.  This is the power of random
>>>>>> >>>>> >> testing (that every time we all run tests, we're testing
>>>>>> >>>>> >> different "paths" through the code)....
>>>>>> >>>>> >>
>>>>>> >>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes
>>>>>> >>>>> >> would cause this!
>>>>>> >>>>> >>
>>>>>> >>>>> >> But, unfortunately, when I plug that seed in I don't see it
>>>>>> >>>>> >> fail, which is odd.  I'll run a stress test to see if I can
>>>>>> >>>>> >> tickle the bug... can you open a Jira issue so we don't lose
>>>>>> >>>>> >> track?
>>>>>> >>>>> >>
>>>>>> >>>>> >> Mike
>>>>>> >>>>> >>
>>>>>> >>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com>
>>>>>> >>>>> >> wrote:
>>>>>> >>>>> >> > Hi
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > I was running tests on trunk (after merging the changes from
>>>>>> >>>>> >> > LUCENE-2537) and received this error message:
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > expected:<true> but was:<false>
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>>>>> >>>>> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>>>>> >>>>> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>>>>> >>>>> >> > at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>>>>> >>>>> >> > before?
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > Shai
>>>>>> >>>>> >> >
>>>>>> >>>>> >>
>>>>>> >>>>> >>
>>>>>> >>>>> >>
>>>>>> >>>>> >> ---------------------------------------------------------------------
>>>>>> >>>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>>> >>>>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>> >>>>> >>
>>>>>> >>>>> >
>>>>>> >>>>> >
>>>>>> >>>>> >
>>>>>> >>>>> > --
>>>>>> >>>>> > Robert Muir
>>>>>> >>>>> > rcmuir@gmail.com
>>>>>> >>>>> >
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> ---------------------------------------------------------------------
>>>>>> >>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>>> >>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Robert Muir
>>>>>> >> rcmuir@gmail.com
>>>>>> >
>>>>>> >
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Robert Muir
>>>>> rcmuir@gmail.com
>>>>
>>>
>>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
OK, I think this is likely a bug in RAS (RandomAcceptedString).  And we
are just seeing the difference in how Oracle's & IBM's JREs handle an
unpaired surrogate...

Lemme work out a patch...

Mike


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
Yeah that char is a high surrogate which is unpaired, which is no good
-- it's invalid.  Cool, though, that Google puts us first when you
search on this character :)

Can you figure out how that bad string was created?  That "if
(random.nextBoolean())" either creates the string randomly (which
should never return an unpaired surrogate), or calls
RandomAcceptedString.getRandomAcceptedString... maybe the bug is in
RAS.
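
For anyone who wants to check a generated string for this problem, here is a
minimal sketch of an unpaired-surrogate scan. The class and method names
(UnpairedSurrogateCheck, firstUnpairedSurrogate) are made up for illustration
and are not part of the test or of any patch; only the java.lang.Character
calls are real JDK API.

```java
class UnpairedSurrogateCheck {
    /**
     * Hypothetical helper: returns the index of the first unpaired
     * surrogate in s, or -1 if every surrogate is part of a valid
     * high+low pair (i.e. s is well-formed UTF-16).
     */
    static int firstUnpairedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed directly by a low one.
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // skip the low half of the valid pair
                } else {
                    return i;
                }
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate.
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The string from this thread: 'l', unpaired U+D9FF, 'E'.
        System.out.println(firstUnpairedSurrogate("\u006C\uD9FF\u0045"));
        // A well-formed string containing one supplementary character.
        System.out.println(firstUnpairedSurrogate("l\uD83D\uDE00E"));
    }
}
```

Run on the failing string, a scan like this should point at index 1, the
\uD9FF in the middle.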

Mike


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
I think Sun's String.getBytes probably uses CodingErrorAction.REPLACE (it
inserts 0x3f, the question mark char) and IBM's probably uses
CodingErrorAction.IGNORE (it drops the char).

I don't know who is right. Both suck in my opinion; I like
CodingErrorAction.REPORT (throw an exception).
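
To make the three policies concrete, here is a minimal sketch that encodes
the string from this thread with an explicit CharsetEncoder rather than
String.getBytes("UTF-8"). The class name SurrogateEncodingDemo and the encode
helper are made up for illustration; only the java.nio.charset calls are real
JDK API. As far as I can tell, the Javadoc for String.getBytes(String) leaves
the behavior unspecified when the string cannot be encoded, which would allow
both JVMs' results.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SurrogateEncodingDemo {
    // Hypothetical helper: encode s as UTF-8 under an explicit error policy.
    static byte[] encode(String s, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(action)
            .onUnmappableCharacter(action);
        ByteBuffer buf = encoder.encode(CharBuffer.wrap(s));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // 'l', unpaired high surrogate U+D9FF, 'E'
        String s = "\u006C\uD9FF\u0045";

        // REPLACE substitutes the encoder's replacement byte ('?', 63).
        System.out.println(Arrays.toString(encode(s, CodingErrorAction.REPLACE)));

        // IGNORE silently drops the malformed char.
        System.out.println(Arrays.toString(encode(s, CodingErrorAction.IGNORE)));

        // REPORT refuses to encode and throws instead.
        try {
            encode(s, CodingErrorAction.REPORT);
        } catch (CharacterCodingException e) {
            System.out.println("REPORT threw " + e.getClass().getSimpleName());
        }
    }
}
```

With REPLACE this should give [108, 63, 69] and with IGNORE [108, 69], which
match the Sun and IBM results Shai reported.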

On Mon, Jul 26, 2010 at 3:41 PM, Shai Erera <se...@gmail.com> wrote:

> From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm
>
> Looks like that character is not a valid Unicode character, and perhaps the
> IBM's JVM behaves correctly? Robert - you're the Unicode expert :).
>
> Shai
>
>
> On Mon, Jul 26, 2010 at 10:40 PM, Shai Erera <se...@gmail.com> wrote:
>
>> I don't know what was the thing w/ the strings generated before, but now I
>> ran the test again w/ the same seed and it generates the same strings. So at
>> least it seems there are no problems w/ the Random class :).
>>
>> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
>> ideas why? What does the test check anyway?
>>
>> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
>> failure comes from
>>
>> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>     at
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>>     at
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>>
>> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
>> "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") are
>> [108, 69]. It just ignores the middle character. Perhaps that's why the test
>> fails?
>>
>> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>>
>> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
>> passes.
>>
>> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
>> result :). I'll dig some more into this character, and why the IBM and SUN
>> JVMs return different byte[] representation for the same sequence of
>> characters. If you already spot the problem, please let me know.
>>
>> BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
>> which goes and checks a system property. Perhaps we can extract it to a
>> variable, or include a static constant in LuceneTestCase(J4) or something?
>>
>> Shai


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Shai Erera <se...@gmail.com>.
From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm

Looks like that character is not a valid Unicode character, and perhaps
IBM's JVM behaves correctly? Robert - you're the Unicode expert :).
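
For reference, a minimal Java sketch of why \uD9FF looks suspicious: the value lies in the high-surrogate range (U+D800..U+DBFF), so on its own it does not encode a character; it only forms one when a low surrogate follows it. The class name SurrogateCheck, the helper isUnpairedSurrogate, and the choice of \uDC00 as the partner are illustrative assumptions, not anything from the Lucene test.

```java
public class SurrogateCheck {
    // True if the char at index i is a surrogate with no matching partner next to it.
    static boolean isUnpairedSurrogate(CharSequence s, int i) {
        char c = s.charAt(i);
        if (Character.isHighSurrogate(c)) {
            return i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1));
        }
        if (Character.isLowSurrogate(c)) {
            return i == 0 || !Character.isHighSurrogate(s.charAt(i - 1));
        }
        return false;
    }

    public static void main(String[] args) {
        String lone = "l\uD9FFE";          // the string from the failing test
        String paired = "l\uD9FF\uDC00E";  // same high surrogate, now followed by a low surrogate
        System.out.println(isUnpairedSurrogate(lone, 1));    // lone high surrogate
        System.out.println(isUnpairedSurrogate(paired, 1));  // properly paired
        // A valid pair maps to a single supplementary code point.
        System.out.println(Integer.toHexString(Character.toCodePoint('\uD9FF', '\uDC00')));
    }
}
```

So the string in question is not well-formed UTF-16, which would explain why two encoders can legitimately disagree about what to emit for it.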

Shai

On Mon, Jul 26, 2010 at 10:40 PM, Shai Erera <se...@gmail.com> wrote:

> I don't know what was the thing w/ the strings generated before, but now I
> ran the test again w/ the same seed and it generates the same strings. So at
> least it seems there are no problems w/ the Random class :).
>
> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
> ideas why? What does the test check anyway?
>
> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
> failure comes from
>
> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>
> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
> "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") are
> [108, 69]. It just ignores the middle character. Perhaps that's why the test
> fails?
>
> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>
> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
> passes.
>
> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
> result :). I'll dig some more into this character, and why the IBM and SUN
> JVMs return different byte[] representation for the same sequence of
> characters. If you already spot the problem, please let me know.
>
> BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
> which goes and checks a system property. Perhaps we can extract it to a
> variable, or include a static constant in LuceneTestCase(J4) or something?
>
> Shai

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
first of all, thanks for taking the time to do all of this debugging!

my guess is this might be related to
https://issues.apache.org/jira/browse/LUCENE-2565

does it fail if you apply Mike's patch?

On Mon, Jul 26, 2010 at 3:40 PM, Shai Erera <se...@gmail.com> wrote:

> I don't know what was the thing w/ the strings generated before, but now I
> ran the test again w/ the same seed and it generates the same strings. So at
> least it seems there are no problems w/ the Random class :).
>
> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
> ideas why? What does the test check anyway?
>
> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
> failure comes from
>
> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>
> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
> "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") are
> [108, 69]. It just ignores the middle character. Perhaps that's why the test
> fails?
>
> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>
> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
> passes.
>
> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
> result :). I'll dig some more into this character, and why the IBM and SUN
> JVMs return different byte[] representation for the same sequence of
> characters. If you already spot the problem, please let me know.
>
> BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
> which goes and checks a system property. Perhaps we can extract it to a
> variable, or include a static constant in LuceneTestCase(J4) or something?
>
> Shai


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Shai Erera <se...@gmail.com>.
I don't know what was the thing w/ the strings generated before, but now I
ran the test again w/ the same seed and it generates the same strings. So at
least it seems there are no problems w/ the Random class :).
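
A small sketch of the property being relied on here: the algorithm behind java.util.Random is spelled out in its Javadoc, so the same seed and the same sequence of calls should give the same values on any conforming JVM. The class name SeedRepro and the helper draw are made up for illustration; only the seed value comes from this thread.

```java
import java.util.Arrays;
import java.util.Random;

public class SeedRepro {
    // Draws n values from a java.util.Random created with the given seed.
    static int[] draw(long seed, int n) {
        Random r = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = r.nextInt(0x10000); // one value in the char range per draw
        }
        return out;
    }

    public static void main(String[] args) {
        long seed = -9029631602016965389L; // seed quoted earlier in the thread
        // Two independent generators with the same seed produce the same sequence.
        System.out.println(Arrays.equals(draw(seed, 8), draw(seed, 8)));
    }
}
```

If two runs with a fixed seed still differ, the usual suspects are a different number of calls on the shared Random before the point of interest, or a seed that is not actually applied to the instance in use.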

However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
ideas why? What does the test check anyway?

I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
failure comes from

junit.framework.AssertionFailedError: expected:<true> but was:<false>
    at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
    at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)

I've set regexp to "l.E", and also 'string' inside assertAutomaton to
"\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") are
[108, 69]. It just ignores the middle character. Perhaps that's why the test
fails?

When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].

If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
passes.

Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
result :). I'll dig some more into this character, and why the IBM and SUN
JVMs return different byte[] representation for the same sequence of
characters. If you already spot the problem, please let me know.
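
One way to take the JVM difference out of the picture is to encode with an explicit CharsetEncoder that reports malformed input instead of quietly substituting or dropping it. The sketch below is only an illustration under that assumption (the class LoneSurrogateEncoding and the helper encodeStrict are hypothetical names); as far as I know, what the lenient String.getBytes("UTF-8") does with an unencodable string was left unspecified in the Javadoc of that era, which would allow both observed results.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;

public class LoneSurrogateEncoding {
    // Strict encode: returns null when the input is not well-formed UTF-16.
    static byte[] encodeStrict(String s) {
        CharsetEncoder enc = Charset.forName("UTF-8").newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer bb = enc.encode(CharBuffer.wrap(s));
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        } catch (CharacterCodingException e) {
            return null; // e.g. an unpaired surrogate
        }
    }

    public static void main(String[] args) throws Exception {
        String s = "\u006C\uD9FF\u0045";
        // Lenient path: the bytes emitted for the lone surrogate depend on the JVM's encoder.
        System.out.println(Arrays.toString(s.getBytes("UTF-8")));
        // Strict path: the malformed string is rejected, a well-formed one is encoded.
        System.out.println(encodeStrict(s) == null);
        System.out.println(Arrays.toString(encodeStrict("lE")));
    }
}
```

A test that compares automaton output against String.getBytes would then either skip strings the strict encoder rejects or avoid generating unpaired surrogates in the first place; which of the two fits TestUTF32ToUTF8 is a judgment call for whoever owns the test.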

BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
which goes and checks a system property. Perhaps we can extract it to a
variable, or include a static constant in LuceneTestCase(J4) or something?
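
A rough sketch of the "static constant" idea, assuming the multiplier comes from an integer system property: read it once at class initialization and reuse the value inside loops. The class name RandomMultiplierHolder, the helper iterations, and the property name "random.multiplier" are assumptions for illustration and may not match what _TestUtil actually reads.

```java
public class RandomMultiplierHolder {
    // Read the (assumed) system property once, when the class is initialized;
    // falls back to 1 if the property is missing or not a valid integer.
    static final int RANDOM_MULTIPLIER = Integer.getInteger("random.multiplier", 1);

    // Scales a base iteration count without touching system properties again.
    static int iterations(int base) {
        return base * RANDOM_MULTIPLIER;
    }

    public static void main(String[] args) {
        int num = iterations(100); // computed once, outside the loop
        for (int i = 0; i < num; i++) {
            // test body would go here
        }
        System.out.println(num);
    }
}
```

The trade-off is that a value cached this way no longer reflects a property changed while the JVM is running, which is probably fine for a test multiplier set on the command line.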

Shai

On Mon, Jul 26, 2010 at 9:22 PM, Robert Muir <rc...@gmail.com> wrote:

> maybe there is a bug in ibm's random generator :)

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
maybe there is a bug in ibm's random generator :)

On Mon, Jul 26, 2010 at 11:50 AM, Michael McCandless <lucene@mikemccandless.com> wrote:

> That's VERY spooky that w/ a fixed seed you see different random
> regexps being made.
>
> Mike
>
> On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <se...@gmail.com> wrote:
> > Ok I've dug deeper into the test. I set the random seed to
> > -9029631602016965389L in setUp(), and discovered that on the 4th iteration
> > it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
> > different strings every time I run the test, even though it uses the same
> > Random object w/ the same seed ...
> >
> > Anyway, one of the regex that failed was this "l.E" (w/o the quotes) and I
> > think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this helps.
> >
> > Shai
> >
> > On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rc...@gmail.com> wrote:
> >>
> >> sounds nasty... it's good you are running the tests with this different
> >> jvm...
> >>
> >> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com> wrote:
> >>>
> >>> Tried to run it w/ SUN JRE6 and it succeeds! I've tried several times
> >>> and it succeeds every time. However, when I revert back to IBM's, it
> >>> fails immediately.
> >>>
> >>> I can help w/ the debug, if you give me a hint where to look :).
> >>>
> >>> Shai
> >>>
> >>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com> wrote:
> >>>>
> >>>> Sorry for the delayed response.
> >>>>
> >>>> I ran it a couple more times, from Eclipse and Ant, and each time it
> >>>> fails (amazing !), w/ different seeds. More seeds that fail:
> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147
> >>>>
> >>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
> >>>>
> >>>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
> >>>> open a separate one?
> >>>>
> >>>> Shai
> >>>>
> >>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless
> >>>> <lu...@mikemccandless.com> wrote:
> >>>>>
> >>>>> On a more general note...
> >>>>>
> >>>>> Any time any of you out there hit an "odd" test failure, please please
> >>>>> please do just what Shai did: take it to the dev list!
> >>>>>
> >>>>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
> >>>>> bugs, and you and your machine may just be lucky enough to find one...
> >>>>> go forth and buy expensive new power hungry computers just so you can
> >>>>> run the random tests over and over, seeking the bugs!
> >>>>>
> >>>>> But be sure to include that random seed when you do hit a failure...
> >>>>>
> >>>>> Mike
> >>>>>
> >>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com>
> wrote:
> >>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use
> an
> >>>>> > IBM JVM
> >>>>> > or another environment that might help us figure it out?
> >>>>> >
> >>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
> >>>>> > <lu...@mikemccandless.com> wrote:
> >>>>> >>
> >>>>> >> Hmmm this means a bug is lurking.  This is the power of random
> >>>>> >> testing
> >>>>> >> (that every time we all run tests, we're testing different "paths"
> >>>>> >> through the code)....
> >>>>> >>
> >>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would
> >>>>> >> cause
> >>>>> >> this!
> >>>>> >>
> >>>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
> >>>>> >> which is odd.  I'll run a stress test to see if I can tickle the
> >>>>> >> bug... can you open a Jira issue so we don't lose track?
> >>>>> >>
> >>>>> >> Mike
> >>>>> >>
> >>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com>
> >>>>> >> wrote:
> >>>>> >> > Hi
> >>>>> >> >
> >>>>> >> > I was running tests on trunk (after merging the changes from
> >>>>> >> > LUCENE-2537)
> >>>>> >> > and received this error message:
> >>>>> >> >
> >>>>> >> > expected:<true> but was:<false>
> >>>>> >> >
> >>>>> >> > junit.framework.AssertionFailedError: expected: but was:
> >>>>> >> > at
> >>>>> >> >
> >>>>> >> >
> >>>>> >> >
> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
> >>>>> >> > at
> >>>>> >> >
> >>>>> >> >
> >>>>> >> >
> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
> >>>>> >> > at
> >>>>> >> >
> >>>>> >> >
> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
> >>>>> >> >
> >>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
> >>>>> >> > 3510820306304573866
> >>>>> >> >
> >>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
> >>>>> >> > before?
> >>>>> >> >
> >>>>> >> > Shai
> >>>>> >> >
> >>>>> >>
> >>>>> >>
> >>>>> >>
> ---------------------------------------------------------------------
> >>>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>>> >> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>> >>
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > --
> >>>>> > Robert Muir
> >>>>> > rcmuir@gmail.com
> >>>>> >
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >>>>> For additional commands, e-mail: dev-help@lucene.apache.org
> >>>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> Robert Muir
> >> rcmuir@gmail.com
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
That's VERY spooky that w/ a fixed seed you see different random
regexps being made.

Mike

On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <se...@gmail.com> wrote:
> Ok I've dug deeper into the test. I set the random seed to
> -9029631602016965389L in setUp(), and discovered that on the 4th iteration
> it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
> different strings every time I run the test, even though it uses the same
> Random object w/ the same seed ...
>
> Anyway, one of the regex that failed was this "l.E" (w/o the quotes) and I
> think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this helps.
>
> Shai
>
> On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rc...@gmail.com> wrote:
>>
>> sounds nasty... its good you are running the tests with this different
>> jvm...
>>
>> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com> wrote:
>>>
>>> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times
>>> and it succeeds every time. However, when I revert back to IBM's, it fail
>>> immediately.
>>>
>>> I can help w/ the debug, if you give me a hint where to look :).
>>>
>>> Shai
>>>
>>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com> wrote:
>>>>
>>>> Sorry for the delayed response.
>>>>
>>>> I ran it a couple more times, from Eclipse and Ant, and each time it
>>>> fails (amazing !), w/ different seeds. More seeds that fail:
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -4244174191361080127
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -7059086272401721644
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -1314734215611104147
>>>>
>>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>>>
>>>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
>>>> open a separate one?
>>>>
>>>> Shai
>>>>
>>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless
>>>> <lu...@mikemccandless.com> wrote:
>>>>>
>>>>> On a more general note...
>>>>>
>>>>> Any time any of you out there hit an "odd" test failure, please please
>>>>> please do just what Shai did: take it to the dev list!
>>>>>
>>>>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
>>>>> bugs, and you and your machine may just be lucky enough to find one...
>>>>> go forth and buy expensive new power hungry computers just so you can
>>>>> run the random tests over and over, seeking the bugs!
>>>>>
>>>>> But be sure to include that random seed when you do hit a failure...
>>>>>
>>>>> Mike
>>>>>
>>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com> wrote:
>>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an
>>>>> > IBM JVM
>>>>> > or another environment that might help us figure it out?
>>>>> >
>>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>>>> > <lu...@mikemccandless.com> wrote:
>>>>> >>
>>>>> >> Hmmm this means a bug is lurking.  This is the power of random
>>>>> >> testing
>>>>> >> (that every time we all run tests, we're testing different "paths"
>>>>> >> through the code)....
>>>>> >>
>>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would
>>>>> >> cause
>>>>> >> this!
>>>>> >>
>>>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>>>>> >> which is odd.  I'll run a stress test to see if I can tickle the
>>>>> >> bug... can you open a Jira issue so we don't lose track?
>>>>> >>
>>>>> >> Mike
>>>>> >>
>>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com>
>>>>> >> wrote:
>>>>> >> > Hi
>>>>> >> >
>>>>> >> > I was running tests on trunk (after merging the changes from
>>>>> >> > LUCENE-2537)
>>>>> >> > and received this error message:
>>>>> >> >
>>>>> >> > expected:<true> but was:<false>
>>>>> >> >
>>>>> >> > junit.framework.AssertionFailedError: expected: but was:
>>>>> >> > at
>>>>> >> >
>>>>> >> >
>>>>> >> > org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>>>> >> > at
>>>>> >> >
>>>>> >> >
>>>>> >> > org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>>>> >> > at
>>>>> >> >
>>>>> >> > org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>>>> >> >
>>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>>>>> >> > 3510820306304573866
>>>>> >> >
>>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>>>> >> > before?
>>>>> >> >
>>>>> >> > Shai
>>>>> >> >
>>>>> >>
>>>>> >>
>>>>> >> ---------------------------------------------------------------------
>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Robert Muir
>>>>> > rcmuir@gmail.com
>>>>> >
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>
>


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
sorry, i screwed up the name of the test, i meant TestRegexpRandom2

On Mon, Jul 26, 2010 at 11:46 AM, Robert Muir <rc...@gmail.com> wrote:

> hmm maybe the bug is in AutomatonTestUtil.randomRegex?
>
> can you do me a favor and run -Dtestcase=TestRandomRegex2
> This testcase also uses this same randomRegex method.
>
> you can also "crank" it like our other random tests, for instance with
> -Drandom.multiplier=3
>
> On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <se...@gmail.com> wrote:
>
>> Ok I've dug deeper into the test. I set the random seed to
>> -9029631602016965389L in setUp(), and discovered that on the 4th iteration
>> it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
>> different strings every time I run the test, even though it uses the same
>> Random object w/ the same seed ...
>>
>> Anyway, one of the regex that failed was this "l.E" (w/o the quotes) and I
>> think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this helps.
>>
>> Shai
>>
>> On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rc...@gmail.com> wrote:
>>
>>> sounds nasty... its good you are running the tests with this different
>>> jvm...
>>>
>>>
>>> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com> wrote:
>>>
>>>> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times
>>>> and it succeeds every time. However, when I revert back to IBM's, it fail
>>>> immediately.
>>>>
>>>> I can help w/ the debug, if you give me a hint where to look :).
>>>>
>>>> Shai
>>>>
>>>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com> wrote:
>>>>
>>>>> Sorry for the delayed response.
>>>>>
>>>>> I ran it a couple more times, from Eclipse and Ant, and each time it
>>>>> fails (amazing !), w/ different seeds. More seeds that fail:
>>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>>> -4244174191361080127
>>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>>> -7059086272401721644
>>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>>> -1314734215611104147
>>>>>
>>>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>>>>
>>>>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
>>>>> open a separate one?
>>>>>
>>>>> Shai
>>>>>
>>>>>
>>>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless <
>>>>> lucene@mikemccandless.com> wrote:
>>>>>
>>>>>> On a more general note...
>>>>>>
>>>>>> Any time any of you out there hit an "odd" test failure, please please
>>>>>> please do just what Shai did: take it to the dev list!
>>>>>>
>>>>>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
>>>>>> bugs, and you and your machine may just be lucky enough to find one...
>>>>>> go forth and buy expensive new power hungry computers just so you can
>>>>>> run the random tests over and over, seeking the bugs!
>>>>>>
>>>>>> But be sure to include that random seed when you do hit a failure...
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com>
>>>>>> wrote:
>>>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an
>>>>>> IBM JVM
>>>>>> > or another environment that might help us figure it out?
>>>>>> >
>>>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>>>>> > <lu...@mikemccandless.com> wrote:
>>>>>> >>
>>>>>> >> Hmmm this means a bug is lurking.  This is the power of random
>>>>>> testing
>>>>>> >> (that every time we all run tests, we're testing different "paths"
>>>>>> >> through the code)....
>>>>>> >>
>>>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would
>>>>>> cause
>>>>>> >> this!
>>>>>> >>
>>>>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>>>>>> >> which is odd.  I'll run a stress test to see if I can tickle the
>>>>>> >> bug... can you open a Jira issue so we don't lose track?
>>>>>> >>
>>>>>> >> Mike
>>>>>> >>
>>>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com>
>>>>>> wrote:
>>>>>> >> > Hi
>>>>>> >> >
>>>>>> >> > I was running tests on trunk (after merging the changes from
>>>>>> >> > LUCENE-2537)
>>>>>> >> > and received this error message:
>>>>>> >> >
>>>>>> >> > expected:<true> but was:<false>
>>>>>> >> >
>>>>>> >> > junit.framework.AssertionFailedError: expected: but was:
>>>>>> >> > at
>>>>>> >> >
>>>>>> >> >
>>>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>>>>> >> > at
>>>>>> >> >
>>>>>> >> >
>>>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>>>>> >> > at
>>>>>> >> >
>>>>>> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>>>>> >> >
>>>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>>>>>> >> > 3510820306304573866
>>>>>> >> >
>>>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>>>>> before?
>>>>>> >> >
>>>>>> >> > Shai
>>>>>> >> >
>>>>>> >>
>>>>>> >>
>>>>>> ---------------------------------------------------------------------
>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Robert Muir
>>>>>> > rcmuir@gmail.com
>>>>>> >
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Robert Muir
>>> rcmuir@gmail.com
>>>
>>
>>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>



-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
hmm maybe the bug is in AutomatonTestUtil.randomRegex?

can you do me a favor and run -Dtestcase=TestRandomRegex2
This testcase also uses this same randomRegex method.

you can also "crank" it like our other random tests, for instance with
-Drandom.multiplier=3
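
As a rough illustration of what "cranking" means, a test can read a multiplier from a system property and scale its iteration count by it. This is only a sketch: the class name, the helper, and the base count of 100 are hypothetical, and it is not a copy of how LuceneTestCase reads random.multiplier.

```java
public class MultiplierDemo {
    // Hypothetical helper: scales a base iteration count by the
    // -Drandom.multiplier system property, defaulting to 1 when unset.
    static int iterations(int base) {
        int multiplier = Integer.getInteger("random.multiplier", 1);
        return base * multiplier;
    }

    public static void main(String[] args) {
        // With -Drandom.multiplier=3 on the command line this prints 300;
        // with the property unset it prints 100.
        System.out.println(iterations(100));
    }
}
```

More iterations per run means more random regexes per run, so a rare failure has a better chance of showing up.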

On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <se...@gmail.com> wrote:

> Ok I've dug deeper into the test. I set the random seed to
> -9029631602016965389L in setUp(), and discovered that on the 4th iteration
> it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
> different strings every time I run the test, even though it uses the same
> Random object w/ the same seed ...
>
> Anyway, one of the regex that failed was this "l.E" (w/o the quotes) and I
> think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this helps.
>
> Shai
>
> On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rc...@gmail.com> wrote:
>
>> sounds nasty... its good you are running the tests with this different
>> jvm...
>>
>>
>> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com> wrote:
>>
>>> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times
>>> and it succeeds every time. However, when I revert back to IBM's, it fail
>>> immediately.
>>>
>>> I can help w/ the debug, if you give me a hint where to look :).
>>>
>>> Shai
>>>
>>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com> wrote:
>>>
>>>> Sorry for the delayed response.
>>>>
>>>> I ran it a couple more times, from Eclipse and Ant, and each time it
>>>> fails (amazing !), w/ different seeds. More seeds that fail:
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -4244174191361080127
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -7059086272401721644
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -1314734215611104147
>>>>
>>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>>>
>>>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
>>>> open a separate one?
>>>>
>>>> Shai
>>>>
>>>>
>>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless <
>>>> lucene@mikemccandless.com> wrote:
>>>>
>>>>> On a more general note...
>>>>>
>>>>> Any time any of you out there hit an "odd" test failure, please please
>>>>> please do just what Shai did: take it to the dev list!
>>>>>
>>>>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
>>>>> bugs, and you and your machine may just be lucky enough to find one...
>>>>> go forth and buy expensive new power hungry computers just so you can
>>>>> run the random tests over and over, seeking the bugs!
>>>>>
>>>>> But be sure to include that random seed when you do hit a failure...
>>>>>
>>>>> Mike
>>>>>
>>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com> wrote:
>>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an
>>>>> IBM JVM
>>>>> > or another environment that might help us figure it out?
>>>>> >
>>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>>>> > <lu...@mikemccandless.com> wrote:
>>>>> >>
>>>>> >> Hmmm this means a bug is lurking.  This is the power of random
>>>>> testing
>>>>> >> (that every time we all run tests, we're testing different "paths"
>>>>> >> through the code)....
>>>>> >>
>>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would
>>>>> cause
>>>>> >> this!
>>>>> >>
>>>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>>>>> >> which is odd.  I'll run a stress test to see if I can tickle the
>>>>> >> bug... can you open a Jira issue so we don't lose track?
>>>>> >>
>>>>> >> Mike
>>>>> >>
>>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com>
>>>>> wrote:
>>>>> >> > Hi
>>>>> >> >
>>>>> >> > I was running tests on trunk (after merging the changes from
>>>>> >> > LUCENE-2537)
>>>>> >> > and received this error message:
>>>>> >> >
>>>>> >> > expected:<true> but was:<false>
>>>>> >> >
>>>>> >> > junit.framework.AssertionFailedError: expected: but was:
>>>>> >> > at
>>>>> >> >
>>>>> >> >
>>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>>>> >> > at
>>>>> >> >
>>>>> >> >
>>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>>>> >> > at
>>>>> >> >
>>>>> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>>>> >> >
>>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>>>>> >> > 3510820306304573866
>>>>> >> >
>>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>>>> before?
>>>>> >> >
>>>>> >> > Shai
>>>>> >> >
>>>>> >>
>>>>> >>
>>>>> ---------------------------------------------------------------------
>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Robert Muir
>>>>> > rcmuir@gmail.com
>>>>> >
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Shai Erera <se...@gmail.com>.
Ok I've dug deeper into the test. I set the random seed to
-9029631602016965389L in setUp(), and discovered that on the 4th iteration
it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
different strings every time I run the test, even though it uses the same
Random object w/ the same seed ...
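
For reference, java.util.Random is specified so that two instances built from the same seed return the same sequence for the same sequence of calls, on any conforming JVM. A minimal sketch of that expectation follows; the draw() helper is made up for illustration and has nothing to do with AutomatonTestUtil.randomRegex itself.

```java
import java.util.Random;

public class SeededRandomDemo {
    // Draws `count` pseudo-random lowercase letters from a Random built from `seed`.
    static String draw(long seed, int count) {
        Random random = new Random(seed);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append((char) ('a' + random.nextInt(26)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long seed = -9029631602016965389L; // the seed mentioned above
        String first = draw(seed, 10);
        String second = draw(seed, 10);
        // Same seed, same calls: the two strings are expected to be identical.
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

If the generated strings still differ between runs, the difference would have to come from something other than the Random itself, for example a different sequence of calls being made on it.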

Anyway, one of the regexes that failed was "l.E" (w/o the quotes); I
think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this helps.
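
To make the pattern concrete, here is a small sketch using java.util.regex as a stand-in; it is not Lucene's RegExp class, and the class and helper names are invented for illustration. The dot accepts any single code point (java.util.regex excludes line terminators by default), and such a code point takes 1 to 4 bytes in UTF-8, which is the range a UTF-32 to UTF-8 conversion has to cover.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class DotRegexDemo {
    static final Pattern PATTERN = Pattern.compile("l.E");

    // True when the whole string is 'l', then exactly one code point, then 'E'.
    static boolean matches(String s) {
        return PATTERN.matcher(s).matches();
    }

    // Number of bytes the given code point occupies when encoded as UTF-8.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        int[] middles = {'x', 0x00E9, 0x20AC, 0x1F600};
        for (int cp : middles) {
            String candidate = "l" + new String(Character.toChars(cp)) + "E";
            System.out.println(Integer.toHexString(cp)
                + " matches=" + matches(candidate)
                + " utf8Bytes=" + utf8Length(cp));
        }
    }
}
```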

Shai

On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rc...@gmail.com> wrote:

> sounds nasty... its good you are running the tests with this different
> jvm...
>
>
> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com> wrote:
>
>> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times and
>> it succeeds every time. However, when I revert back to IBM's, it fail
>> immediately.
>>
>> I can help w/ the debug, if you give me a hint where to look :).
>>
>> Shai
>>
>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com> wrote:
>>
>>> Sorry for the delayed response.
>>>
>>> I ran it a couple more times, from Eclipse and Ant, and each time it
>>> fails (amazing !), w/ different seeds. More seeds that fail:
>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>> -4244174191361080127
>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>> -7059086272401721644
>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>> -1314734215611104147
>>>
>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>>
>>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
>>> open a separate one?
>>>
>>> Shai
>>>
>>>
>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless <
>>> lucene@mikemccandless.com> wrote:
>>>
>>>> On a more general note...
>>>>
>>>> Any time any of you out there hit an "odd" test failure, please please
>>>> please do just what Shai did: take it to the dev list!
>>>>
>>>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
>>>> bugs, and you and your machine may just be lucky enough to find one...
>>>> go forth and buy expensive new power hungry computers just so you can
>>>> run the random tests over and over, seeking the bugs!
>>>>
>>>> But be sure to include that random seed when you do hit a failure...
>>>>
>>>> Mike
>>>>
>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com> wrote:
>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an
>>>> IBM JVM
>>>> > or another environment that might help us figure it out?
>>>> >
>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>>> > <lu...@mikemccandless.com> wrote:
>>>> >>
>>>> >> Hmmm this means a bug is lurking.  This is the power of random
>>>> testing
>>>> >> (that every time we all run tests, we're testing different "paths"
>>>> >> through the code)....
>>>> >>
>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would
>>>> cause
>>>> >> this!
>>>> >>
>>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>>>> >> which is odd.  I'll run a stress test to see if I can tickle the
>>>> >> bug... can you open a Jira issue so we don't lose track?
>>>> >>
>>>> >> Mike
>>>> >>
>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com>
>>>> wrote:
>>>> >> > Hi
>>>> >> >
>>>> >> > I was running tests on trunk (after merging the changes from
>>>> >> > LUCENE-2537)
>>>> >> > and received this error message:
>>>> >> >
>>>> >> > expected:<true> but was:<false>
>>>> >> >
>>>> >> > junit.framework.AssertionFailedError: expected: but was:
>>>> >> > at
>>>> >> >
>>>> >> >
>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>>> >> > at
>>>> >> >
>>>> >> >
>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>>> >> > at
>>>> >> >
>>>> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>>> >> >
>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> >> > 3510820306304573866
>>>> >> >
>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>>> before?
>>>> >> >
>>>> >> > Shai
>>>> >> >
>>>> >>
>>>> >> ---------------------------------------------------------------------
>>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Robert Muir
>>>> > rcmuir@gmail.com
>>>> >
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>>
>>>>
>>>
>>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
sounds nasty... it's good you are running the tests with this different
JVM...

On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <se...@gmail.com> wrote:

> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times and
> it succeeds every time. However, when I revert back to IBM's, it fail
> immediately.
>
> I can help w/ the debug, if you give me a hint where to look :).
>
> Shai
>
> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com> wrote:
>
>> Sorry for the delayed response.
>>
>> I ran it a couple more times, from Eclipse and Ant, and each time it fails
>> (amazing !), w/ different seeds. More seeds that fail:
>> NOTE: random seed of testcase 'testRandomRegexes' was:
>> -4244174191361080127
>> NOTE: random seed of testcase 'testRandomRegexes' was:
>> -7059086272401721644
>> NOTE: random seed of testcase 'testRandomRegexes' was:
>> -1314734215611104147
>>
>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>
>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
>> open a separate one?
>>
>> Shai
>>
>>
>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless <
>> lucene@mikemccandless.com> wrote:
>>
>>> On a more general note...
>>>
>>> Any time any of you out there hit an "odd" test failure, please please
>>> please do just what Shai did: take it to the dev list!
>>>
>>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
>>> bugs, and you and your machine may just be lucky enough to find one...
>>> go forth and buy expensive new power hungry computers just so you can
>>> run the random tests over and over, seeking the bugs!
>>>
>>> But be sure to include that random seed when you do hit a failure...
>>>
>>> Mike
>>>
>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com> wrote:
>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an
>>> IBM JVM
>>> > or another environment that might help us figure it out?
>>> >
>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>> > <lu...@mikemccandless.com> wrote:
>>> >>
>>> >> Hmmm this means a bug is lurking.  This is the power of random testing
>>> >> (that every time we all run tests, we're testing different "paths"
>>> >> through the code)....
>>> >>
>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would cause
>>> >> this!
>>> >>
>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>>> >> which is odd.  I'll run a stress test to see if I can tickle the
>>> >> bug... can you open a Jira issue so we don't lose track?
>>> >>
>>> >> Mike
>>> >>
>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com> wrote:
>>> >> > Hi
>>> >> >
>>> >> > I was running tests on trunk (after merging the changes from
>>> >> > LUCENE-2537)
>>> >> > and received this error message:
>>> >> >
>>> >> > expected:<true> but was:<false>
>>> >> >
>>> >> > junit.framework.AssertionFailedError: expected: but was:
>>> >> > at
>>> >> >
>>> >> >
>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>> >> > at
>>> >> >
>>> >> >
>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>> >> > at
>>> >> >
>>> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>> >> >
>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>>> >> > 3510820306304573866
>>> >> >
>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>> before?
>>> >> >
>>> >> > Shai
>>> >> >
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Robert Muir
>>> > rcmuir@gmail.com
>>> >
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>
>>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Shai Erera <se...@gmail.com>.
Tried to run it w/ SUN JRE6 and it succeeds! I've tried several times and
it succeeds every time. However, when I revert back to IBM's, it fails
immediately.

I can help w/ the debug, if you give me a hint where to look :).

Shai

On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <se...@gmail.com> wrote:

> Sorry for the delayed response.
>
> I ran it a couple more times, from Eclipse and Ant, and each time it fails
> (amazing !), w/ different seeds. More seeds that fail:
> NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
> NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
> NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147
>
> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>
> Mike, can we use LUCENE-2565 to track this, or would you prefer that I open
> a separate one?
>
> Shai
>
>
> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> On a more general note...
>>
>> Any time any of you out there hit an "odd" test failure, please please
>> please do just what Shai did: take it to the dev list!
>>
>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
>> bugs, and you and your machine may just be lucky enough to find one...
>> go forth and buy expensive new power hungry computers just so you can
>> run the random tests over and over, seeking the bugs!
>>
>> But be sure to include that random seed when you do hit a failure...
>>
>> Mike
>>
>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com> wrote:
>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an IBM
>> JVM
>> > or another environment that might help us figure it out?
>> >
>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>> > <lu...@mikemccandless.com> wrote:
>> >>
>> >> Hmmm this means a bug is lurking.  This is the power of random testing
>> >> (that every time we all run tests, we're testing different "paths"
>> >> through the code)....
>> >>
>> >> It seems exceptionally unlikely that LUCENE-2537's changes would cause
>> >> this!
>> >>
>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>> >> which is odd.  I'll run a stress test to see if I can tickle the
>> >> bug... can you open a Jira issue so we don't lose track?
>> >>
>> >> Mike
>> >>
>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com> wrote:
>> >> > Hi
>> >> >
>> >> > I was running tests on trunk (after merging the changes from
>> >> > LUCENE-2537)
>> >> > and received this error message:
>> >> >
>> >> > expected:<true> but was:<false>
>> >> >
>> >> > junit.framework.AssertionFailedError: expected: but was:
>> >> > at
>> >> >
>> >> >
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>> >> > at
>> >> >
>> >> >
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>> >> > at
>> >> >
>> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>> >> >
>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>> >> > 3510820306304573866
>> >> >
>> >> > I'm sure it's related to my changes. Has anyone else seen this
>> before?
>> >> >
>> >> > Shai
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>
>> >
>> >
>> >
>> > --
>> > Robert Muir
>> > rcmuir@gmail.com
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Shai Erera <se...@gmail.com>.
Sorry for the delayed response.

I ran it a couple more times, from both Eclipse and Ant, and it fails every
time (amazing!), each time with a different seed. More seeds that fail:
NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147

I use an IBM JVM, and tried with both 1.5 and 1.6 ...

Mike, can we use LUCENE-2565 to track this, or would you prefer that I open
a separate one?

Shai

On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On a more general note...
>
> Any time any of you out there hit an "odd" test failure, please please
> please do just what Shai did: take it to the dev list!
>
> Think of Lucene's unit tests like SETI :)  We are desperately seeking
> bugs, and you and your machine may just be lucky enough to find one...
> go forth and buy expensive new power hungry computers just so you can
> run the random tests over and over, seeking the bugs!
>
> But be sure to include that random seed when you do hit a failure...
>
> Mike
>
> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com> wrote:
> > I agree, Shai can you open a bug? I cannot reproduce, did you use an IBM JVM
> > or another environment that might help us figure it out?
> >
> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
> > <lu...@mikemccandless.com> wrote:
> >>
> >> Hmmm this means a bug is lurking.  This is the power of random testing
> >> (that every time we all run tests, we're testing different "paths"
> >> through the code)....
> >>
> >> It seems exceptionally unlikely that LUCENE-2537's changes would cause
> >> this!
> >>
> >> But, unfortunately, when I plug that seed in I don't see it fail,
> >> which is odd.  I'll run a stress test to see if I can tickle the
> >> bug... can you open a Jira issue so we don't lose track?
> >>
> >> Mike
> >>
> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com> wrote:
> >> > Hi
> >> >
> >> > I was running tests on trunk (after merging the changes from
> >> > LUCENE-2537)
> >> > and received this error message:
> >> >
> >> > expected:<true> but was:<false>
> >> >
> >> > junit.framework.AssertionFailedError: expected:<true> but was:<false>
> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
> >> > at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
> >> >
> >> > NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866
> >> >
> >> > I'm sure it's related to my changes. Has anyone else seen this before?
> >> >
> >> > Shai
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >
> >
> >
> > --
> > Robert Muir
> > rcmuir@gmail.com
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
On a more general note...

Any time any of you out there hit an "odd" test failure, please please
please do just what Shai did: take it to the dev list!

Think of Lucene's unit tests like SETI :)  We are desperately seeking
bugs, and you and your machine may just be lucky enough to find one...
go forth and buy expensive new power hungry computers just so you can
run the random tests over and over, seeking the bugs!

But be sure to include that random seed when you do hit a failure...

Mike

On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rc...@gmail.com> wrote:
> I agree, Shai can you open a bug? I cannot reproduce, did you use an IBM JVM
> or another environment that might help us figure it out?
>
> On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
> <lu...@mikemccandless.com> wrote:
>>
>> Hmmm this means a bug is lurking.  This is the power of random testing
>> (that every time we all run tests, we're testing different "paths"
>> through the code)....
>>
>> It seems exceptionally unlikely that LUCENE-2537's changes would cause
>> this!
>>
>> But, unfortunately, when I plug that seed in I don't see it fail,
>> which is odd.  I'll run a stress test to see if I can tickle the
>> bug... can you open a Jira issue so we don't lose track?
>>
>> Mike
>>
>> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com> wrote:
>> > Hi
>> >
>> > I was running tests on trunk (after merging the changes from
>> > LUCENE-2537)
>> > and received this error message:
>> >
>> > expected:<true> but was:<false>
>> >
>> > junit.framework.AssertionFailedError: expected:<true> but was:<false>
>> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>> > at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>> >
>> > NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866
>> >
>> > I'm sure it's related to my changes. Has anyone else seen this before?
>> >
>> > Shai
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>


Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Robert Muir <rc...@gmail.com>.
I agree. Shai, can you open a bug? I cannot reproduce it; did you use an IBM
JVM or another environment that might help us figure it out?

On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Hmmm this means a bug is lurking.  This is the power of random testing
> (that every time we all run tests, we're testing different "paths"
> through the code)....
>
> It seems exceptionally unlikely that LUCENE-2537's changes would cause
> this!
>
> But, unfortunately, when I plug that seed in I don't see it fail,
> which is odd.  I'll run a stress test to see if I can tickle the
> bug... can you open a Jira issue so we don't lose track?
>
> Mike
>
> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com> wrote:
> > Hi
> >
> > I was running tests on trunk (after merging the changes from LUCENE-2537)
> > and received this error message:
> >
> > expected:<true> but was:<false>
> >
> > junit.framework.AssertionFailedError: expected:<true> but was:<false>
> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
> > at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
> >
> > NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866
> >
> > I'm sure it's related to my changes. Has anyone else seen this before?
> >
> > Shai
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>


-- 
Robert Muir
rcmuir@gmail.com

Re: TestUTF32ToUTF8.testRandomRegexes fails

Posted by Michael McCandless <lu...@mikemccandless.com>.
Hmmm this means a bug is lurking.  This is the power of random testing
(that every time we all run tests, we're testing different "paths"
through the code)....

It seems exceptionally unlikely that LUCENE-2537's changes would cause this!

But, unfortunately, when I plug that seed in I don't see it fail,
which is odd.  I'll run a stress test to see if I can tickle the
bug... can you open a Jira issue so we don't lose track?
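
For anyone unfamiliar with why the seed matters, here is a minimal sketch of
the general idea in plain Java. It is not Lucene's actual test infrastructure:
the class name SeededRandomCheck, the toy property in check(), and the helper
names are all made up for illustration. The point is only that every random
choice is drawn from one generator built from a known seed, so printing the
seed on failure lets someone else replay the same inputs. Note that a fixed
seed pins down the generated inputs, not the JVM or platform, so a failure can
still be environment-specific.

```java
import java.util.Random;

// Hypothetical illustration of seed-reproducible random testing.
// None of these names come from Lucene.
class SeededRandomCheck {

    // Toy property under test: reversing a string twice gives it back.
    static boolean check(String s) {
        String twice = new StringBuilder(s).reverse().reverse().toString();
        return twice.equals(s);
    }

    // Every random decision goes through the supplied generator,
    // so the same seed always yields the same string.
    static String randomString(Random random) {
        int length = random.nextInt(20);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(26)));
        }
        return sb.toString();
    }

    // Runs the check on `iterations` random inputs derived from `seed`.
    // On failure, reports the seed so the run can be replayed.
    static boolean runWithSeed(long seed, int iterations) {
        Random random = new Random(seed);
        for (int i = 0; i < iterations; i++) {
            String input = randomString(random);
            if (!check(input)) {
                System.out.println("NOTE: random seed was: " + seed
                        + " (failing input: '" + input + "')");
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Use the seed from the command line to replay a reported failure;
        // otherwise pick a fresh one so each run explores different inputs.
        long seed = args.length > 0
                ? Long.parseLong(args[0])
                : new Random().nextLong();
        boolean ok = runWithSeed(seed, 1000);
        System.out.println((ok ? "passed" : "FAILED") + " with seed " + seed);
    }
}
```

Running it with no arguments picks a new seed each time; passing a reported
seed as the first argument replays that exact sequence of inputs.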

Mike

On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <se...@gmail.com> wrote:
> Hi
>
> I was running tests on trunk (after merging the changes from LUCENE-2537)
> and received this error message:
>
> expected:<true> but was:<false>
>
> junit.framework.AssertionFailedError: expected:<true> but was:<false>
> at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
> at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
> at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>
> NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866
>
> I'm sure it's related to my changes. Has anyone else seen this before?
>
> Shai
>
