Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/11/10 05:58:20 UTC

[GitHub] [pulsar] lhotari commented on pull request #8485: [CI] Refactor Github Workflows to reduce CI build times

lhotari commented on pull request #8485:
URL: https://github.com/apache/pulsar/pull/8485#issuecomment-724474645


   I'm currently experimenting with a solution where the build is split into multiple phases:
   
   1. license check, build Pulsar artifacts
   2. run unit tests
   3. build docker images
   4. run integration tests
   
   Each phase is a "job" in a GitHub Actions workflow. The unit test and integration test jobs run parallel sub-jobs by using the matrix feature of GitHub Actions workflows.
   
   The challenge is the large size of Pulsar artifacts. Currently the ~/.m2/repository/org/apache/pulsar files installed with "mvn install" are about 2.5 GB in size.
   Breakdown of directory sizes in MB:
   https://gist.github.com/lhotari/3da3b220edd5684e54a005f358f3d045
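   
   For anyone who wants to reproduce a similar per-directory breakdown locally, here is a minimal Python sketch. It is not the script used for the gist above; the function names `directory_sizes_mb` and `print_breakdown` are made up for illustration, and it assumes the artifacts were installed under the default `~/.m2/repository/org/apache/pulsar` location.

```python
import os
from pathlib import Path


def directory_sizes_mb(root):
    """Return {child directory name: total size in MB} for each direct child of root."""
    sizes = {}
    for child in Path(root).iterdir():
        if not child.is_dir():
            continue  # only directories are reported; stray files are skipped
        total_bytes = 0
        for dirpath, _dirnames, filenames in os.walk(child):
            for name in filenames:
                file_path = os.path.join(dirpath, name)
                if not os.path.islink(file_path):
                    total_bytes += os.path.getsize(file_path)
        sizes[child.name] = total_bytes / (1024 * 1024)
    return sizes


def print_breakdown(root):
    """Print the directories under root, largest first."""
    sizes = directory_sizes_mb(root)
    for name, size_mb in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
        print(f"{size_mb:10.1f} MB  {name}")


if __name__ == "__main__":
    # Assumption: default local Maven repository location.
    repo = Path.home() / ".m2" / "repository" / "org" / "apache" / "pulsar"
    if repo.is_dir():
        print_breakdown(repo)
```

   Running it after "mvn install" should list the largest module directories first, which makes it easier to see which modules dominate the total.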
   
   The large size of the artifacts seems to be caused by shaded and bundled dependencies.
   The bundled dependencies seem to come from the pulsar-io modules built with [nifi-nar-maven-plugin](https://github.com/apache/nifi-maven). This results in excessive IO during builds.
   
   The solution seems to be to create yet another Maven profile for building just the essentials needed to run unit tests. Unit tests should be able to run without building the shaded jars, the distribution, or the nar modules with their embedded dependencies.
   
   Perhaps there's also a way to share the dependencies across the Pulsar IO nar modules. It seems like a waste to duplicate most of the same dependencies in each nar file.
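   
   To get a rough idea of how much duplication there is, the following Python sketch lists the jars that are bundled in more than one nar file. It is only an illustration: it assumes that a nar file is a zip archive that keeps its bundled jars under `META-INF/bundled-dependencies/`, and the helper names `bundled_jars` and `shared_dependencies` are hypothetical.

```python
import zipfile
from collections import defaultdict
from pathlib import Path

# Assumption: bundled jars live under this directory inside each nar archive.
BUNDLED_PREFIX = "META-INF/bundled-dependencies/"


def bundled_jars(nar_path):
    """Return {jar file name: uncompressed size in bytes} for one nar archive."""
    with zipfile.ZipFile(nar_path) as archive:
        return {
            info.filename[len(BUNDLED_PREFIX):]: info.file_size
            for info in archive.infolist()
            if info.filename.startswith(BUNDLED_PREFIX)
            and info.filename.endswith(".jar")
        }


def shared_dependencies(nar_paths):
    """Map each jar bundled by two or more nars to the sorted nar file names."""
    owners = defaultdict(set)
    for nar_path in nar_paths:
        for jar_name in bundled_jars(nar_path):
            owners[jar_name].add(Path(nar_path).name)
    return {jar: sorted(nars) for jar, nars in owners.items() if len(nars) > 1}
```

   Calling `shared_dependencies` with the nar files found in the build output gives a list of candidates for sharing. Jars are matched by file name only, so different builds of a dependency that happen to share a name would be counted as the same jar.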


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org