Posted to dev@metron.apache.org by Michael Miklavcic <mi...@gmail.com> on 2016/09/19 14:08:33 UTC

[DISCUSS] Storm topology sideloading jars

As part of https://issues.apache.org/jira/browse/METRON-356 it is now
possible to add HBase and Hadoop configuration to the Storm topology
classpath. It is also desirable to extend this functionality to sideloading
jars for Storm topologies. That way, users can add dependencies without
having to recompile or repackage the existing jars. One suggestion is to
store custom jars in HDFS and add them to the topology.classpath. I want to
open this discussion to the community.

Best,

Mike

Re: [DISCUSS] Storm topology sideloading jars

Posted by Casey Stella <ce...@gmail.com>.
For those curious about that code, it appears to be here
<https://github.com/apache/accumulo/blob/master/start/src/main/java/org/apache/accumulo/start/classloader/vfs/AccumuloVFSClassLoader.java>.
It appears that the VFSClassLoader from Apache Commons VFS will support
HDFS on its own. As always, we are going to have to be careful when mucking
about with classloaders, but that approach looks very promising.  I like it.




On Mon, Sep 19, 2016 at 10:23 AM, David Lyle <dl...@gmail.com> wrote:

> I don't believe Storm does, we would have to modify our bolts to add an
> additional classloader to the current classloader chain. Shouldn't be too
> much work, but a ton of reward.
>
> Accumulo had a similar requirement- distributing and synchronizing jars to
> distributed components. They introduced a vfs classloader to their
> classloader chain.
>
> https://blogs.apache.org/accumulo/entry/the_accumulo_classloader
>
> -D...
>
>
> On Mon, Sep 19, 2016 at 10:17 AM, Casey Stella <ce...@gmail.com> wrote:
>
> > I really would like this functionality.  Just a brief aside on what this
> > buys us.  As it stands now, we have a few main extension points:
> >
> >    - Custom Java parsers are found via fully qualified classname
> >    - Custom Stellar functions are found via annotation of the class and
> >    being dropped on the classpath
> >    - Custom Enrichment adapters (e.g. the geo enrichment) are found via
> >    incorporating changes into the enrichment topology flux file
> >
> > Right now the only way to add such things is to compile your code along
> > with ours and have them placed in our uber jars that we submit to Storm.
> > What would be better is for us to expose the interfaces and have
> > developers use them to create their custom functionality, build just
> > their extension points and the dependencies and drop the jar file in a
> > directory to be picked up the next time the topology starts.
> >
> > I like the HDFS idea because it means that we do not have to ensure a 3rd
> > party jar directory is sync'd across the storm supervisors.  My question
> > is whether Storm supports pulling external dependencies via HDFS.  Does
> > anyone know?
> >
> > Thanks for bringing this up, Mike.
> >
> > Best,
> >
> > Casey
> >
> > On Mon, Sep 19, 2016 at 10:08 AM, Michael Miklavcic <
> > michael.miklavcic@gmail.com> wrote:
> >
> > > As part of https://issues.apache.org/jira/browse/METRON-356 it is now
> > > possible to add hbase and hadoop conf to the Storm topology classpath.
> > > It is also desirable to expand this functionality to sideloading jars
> > > for Storm topologies. That way, users can add additional dependencies
> > > without having to recompile/repackage existing jars. One suggestion is
> > > to leverage HDFS to store custom jars and add them to the
> > > topology.classpath. I want to open this discussion to the community.
> > >
> > > Best,
> > >
> > > Mike
> > >
> >
>

Re: [DISCUSS] Storm topology sideloading jars

Posted by David Lyle <dl...@gmail.com>.
I don't believe Storm does; we would have to modify our bolts to add an
additional classloader to the current classloader chain. It shouldn't be
too much work, and the reward would be substantial.

Accumulo had a similar requirement: distributing and synchronizing jars
across distributed components. They introduced a VFS classloader into their
classloader chain.

https://blogs.apache.org/accumulo/entry/the_accumulo_classloader
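
To make "adding a classloader to the chain" concrete, here is a minimal
sketch using only the JDK. It is not the Accumulo VFS classloader and it
does not read from HDFS: it assumes the sideloaded jars have already been
copied to a local directory, and it swaps in a plain URLClassLoader whose
parent is the existing classloader. The class and method names
(SideloadClassLoaders, jarUrls, chainTo) are hypothetical and are not part
of Storm or Metron.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper; illustrates the idea only, not an existing API. */
final class SideloadClassLoaders {
    private SideloadClassLoaders() {}

    /** Collects every *.jar directly under dir, sorted for a stable search order. */
    static List<URL> jarUrls(Path dir) {
        List<URL> urls = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                urls.add(jar.toUri().toURL());
            }
        } catch (IOException e) {
            // MalformedURLException is an IOException, so it is covered here too.
            throw new UncheckedIOException(e);
        }
        urls.sort(Comparator.comparing(URL::toString));
        return urls;
    }

    /**
     * Builds a classloader that delegates to parent first and falls back to
     * the jars under dir for classes the parent cannot find.
     */
    static URLClassLoader chainTo(ClassLoader parent, Path dir) {
        return new URLClassLoader(jarUrls(dir).toArray(new URL[0]), parent);
    }
}
```

A bolt could call chainTo(Thread.currentThread().getContextClassLoader(),
someLocalDir) during setup and resolve extension classes through the
result. Because delegation is parent-first, a sideloaded jar cannot
override classes already in the topology jar, which is one of the
classloader subtleties that would need care.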

-D...


On Mon, Sep 19, 2016 at 10:17 AM, Casey Stella <ce...@gmail.com> wrote:

> I really would like this functionality.  Just a brief aside on what this
> buys us.  As it stands now, we have a few main extension points:
>
>    - Custom Java parsers are found via fully qualified classname
>    - Custom Stellar functions are found via annotation of the class and
>    being dropped on the classpath
>    - Custom Enrichment adapters (e.g. the geo enrichment) are found via
>    incorporating changes into the enrichment topology flux file
>
> Right now the only way to add such things is to compile your code along
> with ours and have them placed in our uber jars that we submit to Storm.
> What would be better is for us to expose the interfaces and have developers
> use them to create their custom functionality, build just their extension
> points and the dependencies and drop the jar file in a directory to be
> picked up the next time the topology starts.
>
> I like the HDFS idea because it means that we do not have to ensure a 3rd
> party jar directory is sync'd across the storm supervisors.  My question is
> whether Storm supports pulling external dependencies via HDFS.  Does anyone
> know?
>
> Thanks for bringing this up, Mike.
>
> Best,
>
> Casey
>
> On Mon, Sep 19, 2016 at 10:08 AM, Michael Miklavcic <
> michael.miklavcic@gmail.com> wrote:
>
> > As part of https://issues.apache.org/jira/browse/METRON-356 it is now
> > possible to add hbase and hadoop conf to the Storm topology classpath. It
> > is also desirable to expand this functionality to sideloading jars for
> > Storm topologies. That way, users can add additional dependencies without
> > having to recompile/repackage existing jars. One suggestion is to
> > leverage HDFS to store custom jars and add them to the
> > topology.classpath. I want to open this discussion to the community.
> >
> > Best,
> >
> > Mike
> >
>

Re: [DISCUSS] Storm topology sideloading jars

Posted by Casey Stella <ce...@gmail.com>.
I really would like this functionality.  Just a brief aside on what this
buys us.  As it stands now, we have a few main extension points:

   - Custom Java parsers are found via fully qualified classname
   - Custom Stellar functions are found via annotation of the class and
   being dropped on the classpath
   - Custom Enrichment adapters (e.g. the geo enrichment) are found via
   incorporating changes into the enrichment topology flux file

Right now the only way to add such things is to compile your code along
with ours and have it placed in the uber jars that we submit to Storm.
What would be better is for us to expose the interfaces so developers can
use them to create their custom functionality, build just their extension
points and dependencies, and drop the jar file in a directory to be
picked up the next time the topology starts.
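
As a rough illustration of the first two extension points, the sketch below
resolves a class by fully qualified name through a given classloader and
reads a class-level annotation at runtime. Everything here is a stand-in:
DemoParser, DemoFunction, UpperCaseParser and ExtensionLoader are
hypothetical names, not our actual parser interface or Stellar annotation,
and the classloader passed in is assumed to be one that can see the
sideloaded jar.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical parser contract that an extension jar would implement. */
interface DemoParser {
    String parse(String rawMessage);
}

/** Hypothetical marker for function classes; kept at runtime for reflection. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface DemoFunction {
    String name();
}

/** Example extension: a parser that is also tagged as a named function. */
@DemoFunction(name = "TO_UPPER")
class UpperCaseParser implements DemoParser {
    @Override
    public String parse(String rawMessage) {
        return rawMessage.toUpperCase(Locale.ROOT);
    }
}

final class ExtensionLoader {
    private ExtensionLoader() {}

    /** Loads className through loader and instantiates it via its no-arg constructor. */
    static <T> T instantiate(String className, Class<T> type, ClassLoader loader) {
        try {
            Class<?> raw = Class.forName(className, true, loader);
            return raw.asSubclass(type).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                "Cannot load " + className + " as " + type.getName(), e);
        }
    }

    /** Returns the declared function name if candidate carries the marker annotation. */
    static Optional<String> functionName(Class<?> candidate) {
        DemoFunction tag = candidate.getAnnotation(DemoFunction.class);
        return tag == null ? Optional.empty() : Optional.of(tag.name());
    }
}
```

With something like this, the parser class name could come from
configuration and the classloader from whatever sideloading mechanism we
settle on; discovering annotated classes automatically would additionally
need a classpath scan, which this sketch leaves out.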

I like the HDFS idea because it means that we do not have to ensure a 3rd
party jar directory is sync'd across the Storm supervisors.  My question is
whether Storm supports pulling external dependencies via HDFS.  Does anyone
know?

Thanks for bringing this up, Mike.

Best,

Casey

On Mon, Sep 19, 2016 at 10:08 AM, Michael Miklavcic <
michael.miklavcic@gmail.com> wrote:

> As part of https://issues.apache.org/jira/browse/METRON-356 it is now
> possible to add hbase and hadoop conf to the Storm topology classpath. It
> is also desirable to expand this functionality to sideloading jars for
> Storm topologies. That way, users can add additional dependencies without
> having to recompile/repackage existing jars. One suggestion is to leverage
> HDFS to store custom jars and add them to the topology.classpath. I want to
> open this discussion to the community.
>
> Best,
>
> Mike
>