Posted to dev@flink.apache.org by Chesnay Schepler <ch...@apache.org> on 2017/07/03 14:42:27 UTC

[DISCUSS] Long-term plan for build times

Hello,

I want to start a discussion on how we plan to deal with ever-increasing 
build times in the long term.

While the profile reorganization proposed in FLINK-7047 brings us back
below 50-minute build times, it is only a matter of time until we hit
the timeout again. One profile is already getting close to it
(flink-tests + java7/scala10 @ 45-47m), and we can't split it up
further on a module level.

The following is a list of suggestions/efforts brought forth by various 
people and myself that could help:


      Compile time reductions:


 1. *Rework shading model* (in-progress)
      * With the addition of flink-shaded we can reduce build times by
         1. not having to build shaded modules for every build
         2. requiring less shading, as we work against the shaded
            namespaces directly
      * Actual impact not measured.
 2. *Drop flink-storm*
      * There have been discussions about parking flink-storm anyway,
        and we would save 1 to 2 minutes.
 3. *Repository split*
      * A "flink-libraries" repo wouldn't require building core modules
        or connectors; by design it would download all of these and
        only compile the flink-libraries modules.


      Test time reductions:


 1. *Don't run expensive tests* (duh)
      * A significant portion of the build time is spent on a relatively
        small number of tests (excerpt below):
          o flink-runtime
              + ChannelViewsTest (1m)
              + FileChannelStreamsITCase (40s)
              + ExternalSortITCase (2.5m)
              + ExternalSortLargeRecordsITCase (2m)
          o flink-tests
              + JobManagerHACheckpointRecoveryITCase (2.5m)
              + TaskManagerProcessFailureBatchRecoveryITCase (1m)
              + JobManagerHAProcessFailureBatchRecoveryITCase (1.5m)
              + UdfStreamOperatorCheckpointingITCase (2m)
              + IncrementalRocksDbBackendEventTimeWindowCheckpointingITCase
                (3m)
      * We can either rework/tweak these tests or move them into
        nightly/cron builds
 2. *Repository split*
      * In the context of build times the main benefit of a repo split
        is a reduction in compile times. A "flink-libraries" repo
        wouldn't require building core modules or connectors, nor a
        complicated maven setup to work around these; by design it
        would download all of these and only compile the
        flink-libraries modules.
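
To put a rough number on the excerpt of expensive tests above, the listed
durations can be tallied per module. This is only a sketch: the durations
are the approximate figures from the list, converted to seconds, and the
names EXPENSIVE_TESTS, seconds_per_module and total_seconds are made up
for this illustration. It also assumes the tests run sequentially, so the
sums are an upper bound on what moving them to nightly/cron builds could
save.

```python
# Approximate durations in seconds, copied from the excerpt above.
# (1m = 60, 40s = 40, 2.5m = 150, ...)
EXPENSIVE_TESTS = {
    "flink-runtime": {
        "ChannelViewsTest": 60,
        "FileChannelStreamsITCase": 40,
        "ExternalSortITCase": 150,
        "ExternalSortLargeRecordsITCase": 120,
    },
    "flink-tests": {
        "JobManagerHACheckpointRecoveryITCase": 150,
        "TaskManagerProcessFailureBatchRecoveryITCase": 60,
        "JobManagerHAProcessFailureBatchRecoveryITCase": 90,
        "UdfStreamOperatorCheckpointingITCase": 120,
        "IncrementalRocksDbBackendEventTimeWindowCheckpointingITCase": 180,
    },
}


def seconds_per_module(tests):
    """Sum the listed test durations for each module."""
    return {module: sum(durations.values()) for module, durations in tests.items()}


def total_seconds(tests):
    """Sum the listed test durations across all modules."""
    return sum(seconds_per_module(tests).values())


if __name__ == "__main__":
    for module, seconds in seconds_per_module(EXPENSIVE_TESTS).items():
        print(f"{module}: {seconds / 60:.1f} min")
    print(f"total: {total_seconds(EXPENSIVE_TESTS) / 60:.1f} min")
```

By this tally the listed flink-runtime tests account for roughly 6 minutes
and the listed flink-tests tests for roughly 10 minutes. If all five
flink-tests entries run in the profile that currently sits at 45-47m (an
assumption, not something measured here), moving them out would bring
that profile down to roughly 35-37m.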

Let me know what you think.


Regards,

Chesnay Schepler


Re: [DISCUSS] Long-term plan for build times

Posted by Ted Yu <yu...@gmail.com>.
Have we considered using tools such as nailgun?

https://www.lightbend.com/blog/zinc-and-incremental-compilation

Cheers
