Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2010/12/18 12:06:00 UTC

[jira] Created: (LUCENE-2819) LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?

LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?
-----------------------------------------------------------------------------------

                 Key: LUCENE-2819
                 URL: https://issues.apache.org/jira/browse/LUCENE-2819
             Project: Lucene - Java
          Issue Type: Bug
          Components: Tests
            Reporter: Michael McCandless
             Fix For: 3.1, 4.0


Eg see these failures:

    https://hudson.apache.org/hudson/job/Lucene-3.x/214/

Multiple test methods failed in TestIndexWriterOnDiskFull, but I think only 1 test had a real failure, and somehow our "thread hit exc" tracking incorrectly blames the other 3 cases?

I'm not sure about this but it seems like something like that is going on...

So, one problem is that LuceneTestCase.tearDown fails on any thread excs, but if CMS (ConcurrentMergeScheduler) had also hit a failure, tearDown then fails to clear CMS's thread failures.  I think we should just remove CMS's thread failure tracking?  (It's static, so it can definitely bleed across tests.)  I.e., just rely on LuceneTestCase's tracking.
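
A minimal sketch of why static failure tracking can bleed across tests, as described above. The class names StaticFailureTracker and PerTestFailureTracker are hypothetical stand-ins for illustration; this is not the actual CMS or LuceneTestCase code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a component that records thread failures in a
// static field. Anything recorded here outlives the test that caused it
// unless every test remembers to call clear().
class StaticFailureTracker {
    private static final List<Throwable> FAILURES = new ArrayList<Throwable>();

    static synchronized void record(Throwable t) { FAILURES.add(t); }

    static synchronized boolean anyFailures() { return !FAILURES.isEmpty(); }

    static synchronized void clear() { FAILURES.clear(); }
}

// Hypothetical per-test tracker: each test creates its own instance, so a
// failure recorded by one test cannot be seen by the next one.
class PerTestFailureTracker {
    private final List<Throwable> failures = new ArrayList<Throwable>();

    synchronized void record(Throwable t) { failures.add(t); }

    synchronized boolean anyFailures() { return !failures.isEmpty(); }
}
```

If one test records a failure in the static tracker and nothing clears it, the next test that checks anyFailures() is blamed as well. A fresh per-test instance starts empty.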

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2819) LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2819:
--------------------------------

    Attachment: LUCENE-2819.patch

Here's an updated patch; I think it's much better.
The core tests are passing, but I still need to do contrib/solr.

Some problems I found: I had to 'actually close' the ExecutorServices, because ParallelMultiSearcher doesn't wait for the shutdown to actually happen in its close().

Also the TimeLimitingCollector creates a new thread...statically! This just seems really evil.

I don't think tests should be creating threads and not cleaning up after themselves!

You might also ask why even bother killing the threads if we will fail anyway?
True, we will already fail the test in this case, but this is just to try to
prevent the failures from being attributed to other test cases (the original problem here).
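
The point above about close() not waiting for the shutdown can be sketched as follows. This is an illustration under assumptions, not the code from the patch: ExecutorCloser and closeAndWait are hypothetical names. Only the java.util.concurrent calls (shutdown, awaitTermination, shutdownNow) are real APIs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: shut an executor down and wait until its worker
// threads have really finished, so they cannot outlive the current test.
class ExecutorCloser {
    // Returns true if the executor terminated within the allowed time.
    static boolean closeAndWait(ExecutorService executor, long timeoutMillis) {
        executor.shutdown(); // stops accepting new tasks; returns immediately
        try {
            if (executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            executor.shutdownNow(); // interrupt tasks that are still running
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

Calling shutdown() alone only requests termination. awaitTermination is what blocks until the worker threads are gone.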





[jira] Updated: (LUCENE-2819) LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2819:
--------------------------------

    Attachment: LUCENE-2819.patch

OK, final patch.

We can't quite fail() yet (it just warns for now), but we should fix it to fail.

For the solr tests we only check this in afterClass, because many solr tests legitimately
start up threads in beforeClass and shut them down in afterClass.

This means we can't prevent 'collateral damage' between test methods in these solr tests, but we can for lucene.

Still, for the solr tests we can prevent collateral damage across test classes, and find resource leaks.
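
One way such a leftover-thread check could work is to snapshot the live threads before a test (or test class) and compare afterwards. The sketch below assumes that approach. ThreadLeakCheck and its methods are hypothetical names, not the LuceneTestCase implementation. A real check would probably also need to ignore threads the JVM starts on its own.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for spotting threads that a test started but did not stop.
class ThreadLeakCheck {
    // Snapshot of the threads currently alive in this JVM.
    static Set<Thread> liveThreads() {
        Set<Thread> live = new HashSet<Thread>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive()) {
                live.add(t);
            }
        }
        return live;
    }

    // Threads that are alive now but were not part of the 'before' snapshot.
    static Set<Thread> leakedSince(Set<Thread> before) {
        Set<Thread> leaked = liveThreads();
        leaked.removeAll(before);
        return leaked;
    }
}
```

Taking the snapshot in a per-method setUp and comparing in tearDown gives per-method attribution. Taking it in beforeClass and comparing in afterClass only catches leaks across test classes.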





[jira] Updated: (LUCENE-2819) LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2819:
--------------------------------

    Attachment: LUCENE-2819.patch

I worked on Mike's patch a bit... here's an updated version.

I think LuceneTestCase is ok, but there are tests that need fixing.

For example, TestParallelMultiSearcher doesn't close() its searcher, so its executor never gets shut down.
Because of this the test now fails.




[jira] Commented: (LUCENE-2819) LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972776#action_12972776 ] 

Robert Muir commented on LUCENE-2819:
-------------------------------------

I think this is the problem: let's say the main thread spawns 3 other threads (A, B, C).
When A throws an exception, our uncaught exception handler causes the test to fail.

There is nothing wrong with this... the problem in your example is, I think, that B and C are still running and then fail later (even if it's just a few ms).
So these get 'misattributed' to the next test method... we can't do anything about that either without doing insane amounts of buffering.

So we need to improve the thread handling in general for the tests.
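
The A/B/C scenario above can be sketched like this. FailureCollector and runAndJoin are hypothetical names used only for illustration. Thread.UncaughtExceptionHandler and Thread.join are the real JDK APIs. The idea shown is that if a test joins every thread it spawned before it finishes, late failures from B and C are recorded while that test is still the current one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collector for exceptions that escape from spawned threads.
class FailureCollector implements Thread.UncaughtExceptionHandler {
    private final List<Throwable> failures = new ArrayList<Throwable>();

    public synchronized void uncaughtException(Thread t, Throwable e) {
        failures.add(e);
    }

    synchronized int count() { return failures.size(); }

    // Starts one thread per task and waits for all of them to finish, so
    // every failure is recorded before the caller moves on.
    static int runAndJoin(FailureCollector collector, Runnable... tasks) {
        List<Thread> threads = new ArrayList<Thread>();
        for (Runnable task : tasks) {
            Thread t = new Thread(task);
            t.setUncaughtExceptionHandler(collector);
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            try {
                t.join(); // the handler runs on the dying thread before join returns
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        }
        return collector.count();
    }
}
```

Without the join loop, a test could return right after the first failure while the other threads are still running, and their exceptions would surface during whatever test runs next.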





[jira] Resolved: (LUCENE-2819) LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-2819.
---------------------------------

    Resolution: Fixed

Committed and merged to 3.x.

In 3.x I kept the test code in CMS (even though unused), as I don't trust the 3.0 backwards LuceneTestCase
enough to handle the uncaught exceptions...

I marked it @deprecated for us to remove in 3.2; I think that's easiest.

We should try to resolve some of the rogue thread issues so we can make this stuff actually fail instead of warn.




[jira] Updated: (LUCENE-2819) LuceneTestCase's check for uncaught exceptions in threads causes collateral damage?

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-2819:
---------------------------------------

    Attachment: LUCENE-2819.patch

Attaching current patch; includes lots of noise and does not work yet!!  (I still see collateral damage).

