Posted to dev@flink.apache.org by Márton Balassi <ba...@gmail.com> on 2014/10/05 14:15:31 UTC

Re: Clean up dependencies in streaming connectors

Gabor excluded the unnecessary recursive dependencies:

https://github.com/mbalassi/incubator-flink/commit/9caece6bef610cbebaeb538f5a358ce363c055b7#diff-127d25c59a9bb45f12aab41520d65d42R103

Scala, for example, was eliminated. We could not get rid of ZooKeeper, by the way.
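
As a rough sketch of the kind of exclusion discussed in this thread (not the actual diff from the commit above), a Maven exclusion in the connector's pom.xml might look like the following. The Kafka coordinates, the version, and the specific Scala artifacts listed are assumptions chosen for illustration. ZooKeeper is deliberately left in.

```xml
<!-- Illustrative only: coordinates and version are assumed, not taken from the commit -->
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.10</artifactId>
  <version>0.8.0</version>
  <exclusions>
    <!-- Drop transitive Scala tooling that the connector is assumed not to need -->
    <exclusion>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-compiler</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-reflect</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree` before and after such a change shows whether the excluded artifacts are really gone from the transitive dependencies.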

On Tue, Sep 30, 2014 at 11:16 AM, Stephan Ewen <se...@apache.org> wrote:

> Have a look at this PR and maybe build on top of it:
> https://github.com/apache/incubator-flink/pull/133
>
> On Mon, Sep 29, 2014 at 10:45 PM, Márton Balassi <balassi.marton@gmail.com> wrote:
>
> > Good catch. Give me some time to deal with my fresh jet lag and we will
> > figure it out with Gyula. :)
> > On Sep 29, 2014 12:50 PM, "Stephan Ewen" <se...@apache.org> wrote:
> >
> > > > Shipping the connectors with the job jars would thin out the
> > > > dependencies, but make it more cumbersome to assemble a job jar.
> > >
> > > > On Mon, Sep 29, 2014 at 6:47 PM, Gyula Fora <gy...@gmail.com> wrote:
> > >
> > > > > Thanks, I will look into this and try to figure it out, as you can
> > > > > see I am not a maven pro :)
> > > >
> > > > On 29 Sep 2014, at 18:44, Stephan Ewen <se...@apache.org> wrote:
> > > >
> > > > > You may be able to solve this with careful exclusions.
> > > > >
> > > > > > It seems kafka is monolithic, having no separation between
> > > > > > connector and engine. If you know for example that zookeeper is
> > > > > > not required by the connector (you have to be sure), you can
> > > > > > exclude it as the dependency. We have done this for Hadoop1,
> > > > > > where we only use the HDFS client functionality.
> > > > >
> > > > > > On Mon, Sep 29, 2014 at 6:40 PM, Gyula Fóra <gy...@gmail.com> wrote:
> > > > >
> > > > > >> Yes, you are right, kafka and flume are the heavy ones.
> > > > > >>
> > > > > >> We always have the choice to take them out of the package and
> > > > > >> maybe have a separate repo for all the different connectors and
> > > > > >> only keep the 1-2 most important ones. I don't think there's much
> > > > > >> else to do because we don't use the packages you mentioned, but
> > > > > >> they get pulled in by the kafka and flume dependencies.
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > > >> On Mon, Sep 29, 2014 at 6:24 PM, Stephan Ewen <se...@apache.org> wrote:
> > > > >>
> > > > > >>> The streaming connectors currently pull a massive amount of
> > > > > >>> dependencies.
> > > > > >>>
> > > > > >>> For example, we transitively get the scala
> > > > > >>> compiler/reflection/etc and ZooKeeper.
> > > > > >>>
> > > > > >>> A lot of stuff comes with flume and kafka. Are those required to
> > > > > >>> make the connectors work? Otherwise, it might be good to exclude
> > > > > >>> them, to prevent conflicts for users that actually depend on
> > > > > >>> those components.
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>