Posted to issues@lucene.apache.org by "Robert Muir (Jira)" <ji...@apache.org> on 2021/03/01 03:02:00 UTC

[jira] [Created] (LUCENE-9817) pathological test fixes

Robert Muir created LUCENE-9817:
-----------------------------------

             Summary: pathological test fixes
                 Key: LUCENE-9817
                 URL: https://issues.apache.org/jira/browse/LUCENE-9817
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Robert Muir
         Attachments: LUCENE-9817.patch

There are now 13,000+ tests in Lucene, and unless you have dozens of cores a full run is slow (around 7 minutes here, with everything tuned as fast as I can get it, running on tmpfs).

It is tricky to keep the situation sustainable: there are so many tests that usually take just a few seconds each, but they all add up. To put it in perspective, if all 13,000 tests took only 1s each, that would be about 3.6 hours of CPU time.

From my inspection, there are a few cases of inefficiency:
* tests with bad random parameters: they might normally be semi-well-behaved, but "rarely" take 30 seconds. That's maybe a 1% chance, but keep in mind that 1% of 13,000 equates to 130 wild-west tests every run.
* tests spinning up too many threads and indexing too many docs unnecessarily: there might literally be thousands of these, so that's a hard problem to fix... and developers love to use lots of threads and docs in tests.
* tests just being inefficient: stuff like creating indexes in setup/teardown when they have many methods that may not even use them (hey, why did testEqualsHashcode take 30 seconds, what is it doing?)
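
To make the numbers above concrete, here is a rough back-of-the-envelope sketch. The class and method names are made up for illustration, and the 1% rate and 30-second cost are just the ballpark figures mentioned above, not measurements.

```java
// Hypothetical helper for illustration only; not part of Lucene.
public class TestBudget {

    // Total CPU seconds if every test takes the same amount of time.
    static long totalCpuSeconds(long tests, long secondsPerTest) {
        return tests * secondsPerTest;
    }

    // Expected number of tests that hit a rare slow path on a given run.
    static double expectedSlowTests(long tests, double probability) {
        return tests * probability;
    }

    public static void main(String[] args) {
        long tests = 13_000;                           // approximate test count
        long total = totalCpuSeconds(tests, 1);        // 1s per test
        double slow = expectedSlowTests(tests, 0.01);  // assumed 1% "rarely" rate

        System.out.printf("cpu time at 1s each: %.2f hours%n", total / 3600.0);
        System.out.printf("expected slow tests at 1%%: %.0f%n", slow);
        // Assumed 30s per slow test, expressed in minutes of CPU time.
        System.out.printf("extra cpu time at 30s each: %.0f minutes%n", slow * 30 / 60.0);
    }
}
```

Under those assumptions, the rare slow cases alone account for roughly an hour of CPU time per run, which is why the first bullet is worth attacking even though each individual test looks fine most of the time.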

I only worked on the first case here; if I fixed anything involving the other two, it was just because I noticed it while I was there. I temporarily overrode methods like LuceneTestCase.rarely(), atLeast(), and so on to present more pathological/worst-case conditions and tried to address them all.
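
The idea of forcing worst-case conditions can be sketched roughly like this. This is a self-contained stand-in, not the LuceneTestCase code and not what the patch does: the class name, the worstCase flag, the 1% rate, and the upper bound in atLeast() are all assumptions for illustration.

```java
import java.util.Random;

// Hypothetical stand-in for randomized test helpers; not the LuceneTestCase implementation.
public class PathologicalHelpers {
    private final Random random;
    private final boolean worstCase;

    public PathologicalHelpers(long seed, boolean worstCase) {
        this.random = new Random(seed);
        this.worstCase = worstCase;
    }

    // Normally true about 1% of the time (assumed rate); always true in worst-case mode.
    public boolean rarely() {
        return worstCase || random.nextInt(100) == 0;
    }

    // Returns a value >= min; in worst-case mode it always returns the upper bound.
    public int atLeast(int min) {
        if (min < 0) {
            throw new IllegalArgumentException("min must be >= 0");
        }
        int max = min + min / 2; // assumed upper bound, for illustration only
        if (worstCase) {
            return max;
        }
        return min + random.nextInt(max - min + 1);
    }

    public static void main(String[] args) {
        PathologicalHelpers helpers = new PathologicalHelpers(42L, true);
        int docs = helpers.atLeast(100);
        if (helpers.rarely()) {
            docs *= 10; // the "rare" expensive branch is now taken every time
        }
        System.out.println("docs indexed under worst case: " + docs);
    }
}
```

With the flag on, every test that hides an expensive branch behind a rare random choice takes that branch on every run, so the slow ones show up immediately in the timings instead of once in a hundred runs.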

So here's a patch that gives ~80 seconds of CPU time in tests back. YMMV, maybe it helps you more if you are actually using hard disks and stuff!

Fixing the other issues here will require some more creativity/work; I will follow up.


