Posted to issues@lucene.apache.org by "Robert Muir (Jira)" <ji...@apache.org> on 2021/03/01 03:02:00 UTC

[jira] [Updated] (LUCENE-9817) pathological test fixes

     [ https://issues.apache.org/jira/browse/LUCENE-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-9817:
--------------------------------
    Attachment: LUCENE-9817.patch

> pathological test fixes
> -----------------------
>
>                 Key: LUCENE-9817
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9817
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9817.patch
>
>
> There are now 13,000+ tests in Lucene, and unless you have dozens of cores the full run is slow (around 7 minutes here, with everything tuned as fast as I can get it, running on tmpfs).
> It is tricky to keep the situation sustainable: most tests take just a few seconds, but they all add up. To put it in perspective, if all 13,000 tests took only 1s each, that would be about 3.6 hours of CPU time.
> From my inspection, there are a few cases of inefficiency:
> * tests with bad random parameters: they might normally be semi-well-behaved, but "rarely" take 30 seconds. That's maybe a 1% chance, but keep in mind that 1% equates to 130 wild-west tests every run.
> * tests spinning up too many threads and indexing too many docs unnecessarily: there might literally be thousands of these, so that's a hard problem to fix... and developers love to use lots of threads and docs in tests.
> * tests just being inefficient: stuff like creating indexes in setup/teardown when they have many methods that may not even use them (hey, why did testEqualsHashcode take 30 seconds, what is it doing?)
> I only worked on the first case here; if I fixed anything involving the other two, it was just because I noticed it while I was there. I temporarily overrode methods like LuceneTestCase.rarely(), atLeast(), and so on to present more pathological/worst-case conditions and tried to address them all.
> So here's a patch to give ~80 seconds of CPU time in tests back. YMMV, maybe it helps you more if you are actually using hard disks and stuff!
> Fixing the other issues here will require some more creativity/work; I will follow up.
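
The back-of-the-envelope numbers in the description (13,000 tests at 1s each, and a 1% chance of a slow run per test) can be checked with a small sketch. The class and method names below are hypothetical and exist only for this illustration; they are not part of Lucene or of the attached patch.

```java
import java.util.Locale;

/** Hypothetical helper for the arithmetic in the issue description; not part of Lucene. */
public class TestBudget {
    /** Total CPU time in hours if every test takes the same number of seconds. */
    public static double cpuHours(int testCount, double secondsPerTest) {
        return testCount * secondsPerTest / 3600.0;
    }

    /** Expected number of tests that hit their slow path in a single run. */
    public static double expectedSlowTests(int testCount, double slowProbability) {
        return testCount * slowProbability;
    }

    public static void main(String[] args) {
        // 13,000 tests at 1 second each: 13000 / 3600, about 3.61 hours.
        System.out.printf(Locale.ROOT, "%.2f cpu-hours%n", cpuHours(13000, 1.0));
        // A 1% chance per test of taking the slow path: about 130 tests per run.
        System.out.printf(Locale.ROOT, "%.0f slow tests per run%n",
                expectedSlowTests(13000, 0.01));
    }
}
```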

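The worst-case technique mentioned in the description (forcing helpers such as rarely() and atLeast() to always take their expensive path) might look roughly like the sketch below. It is a self-contained stand-in, not Lucene's code: the RandomKnobs class, the 1% rate, and the [min, 2*min] range are all assumptions made for illustration, and the real LuceneTestCase helpers are implemented differently.

```java
import java.util.Random;

/** Hypothetical stand-in for randomized test helpers; not Lucene's actual implementation. */
public class RandomKnobs {
    private final Random random;
    private final boolean pathological; // when true, always pick the worst case

    public RandomKnobs(long seed, boolean pathological) {
        this.random = new Random(seed);
        this.pathological = pathological;
    }

    /** True about 1% of the time (assumed rate); always true in pathological mode. */
    public boolean rarely() {
        return pathological || random.nextInt(100) == 0;
    }

    /** A value in [min, 2*min] (assumed range); always 2*min in pathological mode. */
    public int atLeast(int min) {
        if (min < 0) {
            throw new IllegalArgumentException("min must be non-negative");
        }
        return pathological ? 2 * min : min + random.nextInt(min + 1);
    }

    /** Example of a test parameter whose rare branch is expensive. */
    public static int chooseDocCount(RandomKnobs knobs) {
        return knobs.rarely() ? knobs.atLeast(10_000) : knobs.atLeast(100);
    }

    public static void main(String[] args) {
        RandomKnobs normal = new RandomKnobs(42L, false);
        RandomKnobs worst = new RandomKnobs(42L, true);
        System.out.println("normal run doc count: " + chooseDocCount(normal));
        // In pathological mode the rare branch is always taken, so this is 20000.
        System.out.println("worst-case doc count: " + chooseDocCount(worst));
    }
}
```

Running a suite with the pathological flag set makes every "rare" expensive branch show up on every run, which is one way to find tests whose worst case is far slower than their typical case.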


