Posted to user@poi.apache.org by Andreas Reichel <an...@manticore-projects.com> on 2021/11/01 07:36:55 UTC
Running tests in parallel?
Greetings.
Pardon me for asking: Am I right that most of the Use Cases tests are
executed serially (only)?
Executing the tests seems to take a surprisingly long time (8+ minutes?!)
with almost no load on my CPU cores.
If my observation is right: Is that intentional, and why would we not
run the tests in parallel?
With my limited understanding, most of the tests create a workbook,
sheets and cells, do something and then assert. I think this could be
done in parallel, unless multiple tests access the same workbook?
(I am new to apache projects and I am no developer, so please go soft
on me.)
Cheers
Andreas
Re: Running tests in parallel?
Posted by Andreas Beeker <ki...@apache.org>.
Hi Andreas,
the two files build/ooxml-lite-report.clazz & .xsb are generated via the OOXMLLiteAgent.
The agent is used on the modules with xmlbeans dependencies (poi-ooxml, poi-excelant and poi-integration) and incrementally gathers the used XmlBeans classes and xsbs.
I haven't tried build caching up to now explicitly, but I assume it would skip generating poi-ooxml-full if the schemas haven't changed.
AFAIK poi-ooxml-lite is not used in the main Gradle build, and we still need to migrate the poi-integration/distsourcebuild (build.xml) Ant and Jenkins job to actually test the lite jar. Maybe we can exclude the poi-ooxml-lite tasks in the "check" phase and only activate them in the "jenkins" phase.
Best wishes,
Andi
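The suggestion to run the poi-ooxml-lite tasks only for Jenkins could be sketched roughly as follows. This is an illustration under assumptions, not the project's build script: the `jenkins` project property is a hypothetical switch (passed as `-Pjenkins`), and placing the snippet in the poi-ooxml-lite build file is likewise assumed.

```groovy
// Sketch for the poi-ooxml-lite build file (location is an assumption).
// ASSUMPTION: CI builds are started with a hypothetical -Pjenkins flag.
tasks.configureEach { task ->
    // Skip every task of this subproject unless the flag is present,
    // so a plain developer build never needs the ooxml-lite report files.
    task.onlyIf { project.hasProperty('jenkins') }
}
```

With this in place, `gradle check` would skip the lite tasks, while `gradle check -Pjenkins` would still run them.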
Re: Running tests in parallel?
Posted by Andreas Reichel <an...@manticore-projects.com>.
Dear All,
I submitted PR #275, which reduced the build time by an estimated 27%. It's
low-hanging fruit, just using parallel building.
However, I was not able to address the big elephant in the room: Gradle
Build Caching.
When activating it, the subproject `POI-OOXML-LITE` will fail on the
task `generateModuleInfo` when a file is not found:
`Property '$1' specifies file '/home/are/Documents/src/poi/build/ooxml-lite-report.clazz' which doesn't exist.`
Still, I would love to drive that further, because when I excluded
POI-OOXML-LITE from the project, I was able to activate Gradle Build
Caching, and a `gradle clean jar` (without changes) rebuilt in 14
seconds instead of the optimized 6:20 minutes.
(I understand, of course, that this works only when nothing has changed,
but this is a typical development scenario: you change only a few lines
and then want to run your tests as fast as possible before going on to the
next change. Without caching, a clean jar build ALWAYS takes 6:20, even when
nothing has changed!)
So in my limited understanding, POI-OOXML-LITE:generateModuleInfo seems
to be the only showstopper.
My question is: can anyone tell me which step/task creates the file
`File clazzFile = file("${OOXML_LITE_REPORT}.clazz")`?
I would love to try my luck at forcing this step/task to run before
POI-OOXML-LITE:generateModuleInfo kicks in.
Build Caching looks too sweet to pass up.
Thanks in advance for any advice, and cheers
Andreas
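Forcing the producing step to run first might look like the sketch below. The task paths are assumptions: the thread only says the report is generated by the OOXMLLiteAgent on poi-ooxml, poi-excelant and poi-integration, so the sketch guesses that their `test` tasks are the producers. `OOXML_LITE_REPORT` and `generateModuleInfo` are taken from the messages above.

```groovy
// Sketch for the poi-ooxml-lite build file; not the project's actual script.
def clazzFile = file("${OOXML_LITE_REPORT}.clazz")

tasks.named('generateModuleInfo') {
    // ASSUMPTION: these test tasks run with the OOXMLLiteAgent attached
    // and write the report files as a side effect.
    dependsOn ':poi-ooxml:test', ':poi-excelant:test', ':poi-integration:test'

    // Declare the report as an input so Gradle tracks changes to it.
    inputs.file(clazzFile)
}
```

If the build cache restores those test tasks without executing them, the report would presumably still be missing, so the producing tasks might also have to declare the report as one of their outputs.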
Re: Running tests in parallel?
Posted by Andreas Reichel <an...@manticore-projects.com>.
Dear All.
On Mon, 2021-11-01 at 14:36 +0700, Andreas Reichel wrote:
> Am I right that most of the Use Cases tests are executed serially
> (only)?
Looks like I was semi-right:
// set heap size for the test JVM(s)
minHeapSize = "128m"
maxHeapSize = "1512m"
// Specifying the locale via system properties did not work, so we set them this way
jvmArgs << [
    '-Djava.io.tmpdir=build',
    '-DPOI.testdata.path=../test-data',
    '-Djava.awt.headless=true',
    '-Djava.locale.providers=JRE,CLDR',
    '-Duser.language=en',
    '-Duser.country=US',
    '-Djavax.xml.stream.XMLInputFactory=com.sun.xml.internal.stream.XMLInputFactoryImpl',
    "-Dversion.id=${project.version}",
    '-ea',
    '-Djunit.jupiter.execution.parallel.config.strategy=fixed',
    '-Djunit.jupiter.execution.parallel.config.fixed.parallelism=2'
    // -Xjit:verbose={compileStart|compileEnd},vlog=build/jit.log${no.jit.sherlock} ... if ${isIBMVM}
]
Questions please:
1) Why do we not set maxHeapSize dynamically based on the (free)
memory of the OS, e.g. use 50% of that memory?
2) Why do we not use all CPU cores, but just 2? What advantage do
the following lines have:
'-Djunit.jupiter.execution.parallel.config.strategy=fixed',
'-Djunit.jupiter.execution.parallel.config.fixed.parallelism=2'
Best regards
Andreas
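For comparison, a machine-dependent variant of the two settings asked about above could be sketched like this. It is only an illustration: the property name `testMaxHeap` is hypothetical, and the thread does not say why a fixed parallelism of 2 was chosen. The JUnit parameters are the documented `dynamic` strategy, which computes the parallelism as available processors multiplied by the factor.

```groovy
// Sketch only, not POI's actual configuration.
test {
    // Question 1: let each developer override the heap limit, e.g.
    //   gradle test -PtestMaxHeap=4g
    // ('testMaxHeap' is a hypothetical property name) and fall back to
    // the current value otherwise.
    maxHeapSize = project.findProperty('testMaxHeap') ?: '1512m'

    // Question 2: derive JUnit's parallelism from the machine instead of
    // hard-coding 2. Passed as JVM arguments, as in the snippet above.
    jvmArgs '-Djunit.jupiter.execution.parallel.config.strategy=dynamic',
            '-Djunit.jupiter.execution.parallel.config.dynamic.factor=1'
}
```

A higher parallelism only helps if the tests are independent of each other, for example if they do not share workbooks or temporary files.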