Posted to dev@storm.apache.org by Sriharsha Chintalapani <sc...@hortonworks.com> on 2015/07/30 16:35:05 UTC

[DISCUSSION] packaging storm connectors

Hi All,
Currently, we publish the Storm connector jars (storm-kafka, storm-hive, storm-hbase) to Maven repositories without any of their dependencies included.
The expectation is that users will add their own versions of the HDFS and Kafka dependencies alongside storm-hdfs or storm-kafka and package everything with the topology as an uber jar.
IMO this is the most painful step in building and deploying a topology, as observed in https://issues.apache.org/jira/browse/STORM-967. I think we need to standardize on either the assembly plugin or the shade plugin.
Also, why don’t we publish connectors with all their dependencies included, so that a user only needs to add a single dependency such as storm-hdfs to their pom.xml and it brings in everything else it needs?
Any ideas on improving this?

Thanks,
Harsha


Re: [DISCUSSION] packaging storm connectors

Posted by "P. Taylor Goetz" <pt...@gmail.com>.
If we standardize on an approach, I think it should be the shade plugin because it properly handles META-INF content (required for HDFS, HBase, etc.). The assembly plugin doesn’t.
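
For illustration, here is a minimal sketch of a shade-based setup in a topology's pom.xml. It is not taken from the Storm build. The plugin version is an assumption, and a real topology may need further transformers or filters. The point is the ServicesResourceTransformer, which merges META-INF/services files from all bundled jars (Hadoop, for example, registers its FileSystem implementations through META-INF/services/org.apache.hadoop.fs.FileSystem).

```xml
<!-- Sketch only: goes under <build><plugins> in the topology's pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Assumed version; pin whichever release your build uses. -->
  <version>2.4.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Concatenates META-INF/services/* entries from all jars
               instead of keeping only one copy of each file. -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With something like this in place, `mvn package` would produce the uber jar to submit with the topology.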

If we specify a Hadoop version and allow all the dependencies to be pulled in, we would just be shifting the complexity since users would then have to exclude the bundled dependencies in addition to specifying the version they want.
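
To make that concrete, here is a rough sketch of what a user's pom.xml might need if a connector did bundle a fixed Hadoop version. This is hypothetical: the artifacts excluded and the Hadoop version shown (2.6.0) are assumptions chosen for illustration, not what storm-hdfs actually declares.

```xml
<!-- Hypothetical: storm-hdfs pulls in its own Hadoop, so the user must exclude it... -->
<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-hdfs</artifactId>
  <version>0.10.0-beta1</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
    </exclusion>
  </exclusions>
</dependency>

<!-- ...and then still declare the Hadoop version that matches the cluster. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.6.0</version> <!-- assumed cluster version -->
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <version>2.6.0</version> <!-- assumed cluster version -->
</dependency>
```

The user ends up writing both the exclusions and the explicit versions, which is more work than declaring the Hadoop dependencies alone.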

I tested using the Kafka spout and HDFS bolt from an unmodified flux-examples jar (from 0.10.0-beta1) and it worked without a hitch.

I understand the frustration, but I’m not sure much can be done about it. We can’t control the dependencies of 3rd party projects, and especially with Hadoop, there are many dependencies that can change from version to version and lead to conflicts.

-Taylor
