Posted to user@accumulo.apache.org by Bo...@l-3com.com on 2012/10/17 18:55:13 UTC

storm and accumulo

Has anyone done any integration between Storm and Accumulo?  Primarily
I'm interested in establishing a topology from message queues to
BatchWriter(s) while performing real-time analytics on the contents.

Bob Thorman
Engineering Fellow
L-3 Communications, ComCept
1700 Science Place
Rockwall, TX 75032
(972) 772-7501 work
Bob.Thorman@ncct.af.smil.mil
rdthorm@nsa.ic.gov


RE: storm and accumulo

Posted by "Ott, Charles H." <CH...@saic.com>.
I am not personally working on the Storm side of my project, but there
are existing implementations of Storm topologies that use Niagara Files
(NiFi?) as a stream to feed bolts that write data to Accumulo.  If my
understanding is correct, you can create several 'bolt' instances in
your topology across multiple servers to scale to your needs, whether
the focus is ingestion or analysis.
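
To make the scaling point above concrete, here is a minimal sketch that spreads a stream of tuples over several bolt instances. It does not use the Storm API: RecordingBolt and RoundRobinRouter are hypothetical stand-ins, and round-robin is chosen only to keep the output deterministic. In a real topology the instance count would come from the parallelism hint passed to TopologyBuilder.setBolt, and the distribution from a stream grouping such as shuffle grouping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one bolt instance; it only records what it receives.
class RecordingBolt {
    final int id;
    final List<String> received = new ArrayList<>();

    RecordingBolt(int id) {
        this.id = id;
    }

    void execute(String tuple) {
        received.add(tuple);
    }
}

// Hypothetical router that hands tuples to bolt instances in turn.
// Round-robin is an assumption made for determinism, not Storm's behavior.
class RoundRobinRouter {
    private final List<RecordingBolt> bolts = new ArrayList<>();
    private int next = 0;

    RoundRobinRouter(int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        for (int i = 0; i < parallelism; i++) {
            bolts.add(new RecordingBolt(i));
        }
    }

    void emit(String tuple) {
        bolts.get(next).execute(tuple);
        next = (next + 1) % bolts.size();
    }

    List<RecordingBolt> bolts() {
        return bolts;
    }
}

class ParallelismDemo {
    public static void main(String[] args) {
        RoundRobinRouter router = new RoundRobinRouter(3);
        for (int i = 0; i < 9; i++) {
            router.emit("msg-" + i);
        }
        for (RecordingBolt bolt : router.bolts()) {
            System.out.println("bolt " + bolt.id + " received " + bolt.received);
        }
    }
}
```

Raising the constructor argument is the sketch's analogue of raising the parallelism of a bolt: each instance sees a smaller share of the stream.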





Re: storm and accumulo

Posted by Juan Moreno <jw...@gmail.com>.
Sure, you can connect to any message queue, such as Kestrel or Kafka, and
emit its messages into your topology. At the end you can have multiple
instances of a bolt that writes to Accumulo. Just make sure to open the
connection in the prepare() method so that it happens when the instances are
started on the cluster. Since bolt instances can't really share anything, it
is easiest to give each instance its own BatchWriter.
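
The lifecycle described above can be sketched roughly as follows. This is not the Storm or Accumulo API: InMemoryWriter is a hypothetical stand-in for a BatchWriter, and SketchWriterBolt only mimics the prepare()/execute()/cleanup() shape of a bolt with simplified signatures. The point it tries to show is that the constructor holds only plain configuration, while the writer is created in prepare(), once per instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an Accumulo BatchWriter; it collects strings
// in memory so the sketch runs without a cluster.
class InMemoryWriter {
    final List<String> written = new ArrayList<>();
    boolean closed = false;

    void add(String mutation) {
        if (closed) {
            throw new IllegalStateException("writer is closed");
        }
        written.add(mutation);
    }

    void close() {
        closed = true;
    }
}

// Hypothetical bolt-shaped class. Only plain configuration is set in the
// constructor; the writer is created later, in prepare().
class SketchWriterBolt {
    private final String tableName;
    private InMemoryWriter writer; // stays null until prepare() runs

    SketchWriterBolt(String tableName) {
        this.tableName = tableName;
    }

    // Runs once per instance at startup; each instance gets its own writer.
    void prepare() {
        writer = new InMemoryWriter();
    }

    void execute(String tuple) {
        if (writer == null) {
            throw new IllegalStateException("prepare() was not called");
        }
        writer.add(tableName + ":" + tuple);
    }

    void cleanup() {
        if (writer != null) {
            writer.close();
        }
    }

    InMemoryWriter writer() {
        return writer;
    }
}

class WriterBoltDemo {
    public static void main(String[] args) {
        SketchWriterBolt first = new SketchWriterBolt("events");
        SketchWriterBolt second = new SketchWriterBolt("events");
        first.prepare();
        second.prepare();
        first.execute("msg-1");
        second.execute("msg-2");
        System.out.println("first wrote " + first.writer().written);
        System.out.println("second wrote " + second.writer().written);
        first.cleanup();
        second.cleanup();
    }
}
```

In a real bolt the same split matters because the bolt object is serialized when the topology is submitted, so connection-like state is usually better created on the worker than carried in the constructor.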

On Wed, Oct 17, 2012 at 1:59 PM, <Bo...@l-3com.com> wrote:

> I’m trying to connect message queues to a topology that will terminate in
> a BatchWriter.  Along the way I want to put some bolts in place that detect
> certain events/sequences and make the proper asynchronous notifications.
> I’m considering Storm for its ability to scale horizontally because I
> expect the number of message queues and message volume to grow
> considerably.  That being said, I’m wondering if I should build some
> scalability into the BatchWriters (i.e. the number of instances of
> BatchWriter) or whether it will scale by itself as long as I configure it
> for the upper end of the volume, like the diagram below (if it makes it
> through).
>

RE: storm and accumulo

Posted by Bo...@l-3com.com.
I'm trying to connect message queues to a topology that will terminate
in a BatchWriter.  Along the way I want to put some bolts in place that
detect certain events/sequences and make the proper asynchronous
notifications.  I'm considering Storm for its ability to scale
horizontally because I expect the number of message queues and message
volume to grow considerably.  That being said, I'm wondering if I should
build some scalability into the BatchWriters (i.e. the number of
instances of BatchWriter) or whether it will scale by itself as long as I
configure it for the upper end of the volume, like the diagram below
(if it makes it through).

From: Juan Moreno [mailto:jwellington.moreno@gmail.com] 
Sent: Wednesday, October 17, 2012 12:16
To: user@accumulo.apache.org
Subject: Re: storm and accumulo

 

Hello Bob,

We use Accumulo and Storm together in my project and have had good
success with it. The only challenge, of course, is when you care about
order and exactly-once semantics. That said, Storm's transactional
topologies aren't something easily adapted for Accumulo.

Also, if you care about guaranteeing data processing, be sure to do some
manual acking.

Accumulo is fairly record/tuple oriented, which works well with Storm.
What are you trying to do with Storm and Accumulo exactly?



Re: storm and accumulo

Posted by Juan Moreno <jw...@gmail.com>.
Hello Bob,

We use Accumulo and Storm together in my project and have had good success
with it. The only challenge, of course, is when you care about order and
exactly-once semantics. That said, Storm's transactional topologies aren't
something easily adapted for Accumulo.
Also, if you care about guaranteeing data processing, be sure to do some
manual acking.

Accumulo is fairly record/tuple oriented, which works well with Storm. What
are you trying to do with Storm and Accumulo exactly?
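
On the manual acking point above, a minimal sketch of the idea: acknowledge a tuple only after the write has succeeded, and fail it otherwise so the source can replay it. Nothing here is the real Storm or Accumulo API. RecordingCollector stands in for the collector's ack/fail calls, FallibleWriter for a writer that can throw, and the "bad" prefix is just a way to simulate a failed write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a collector's ack/fail calls; it records outcomes.
class RecordingCollector {
    final List<String> acked = new ArrayList<>();
    final List<String> failed = new ArrayList<>();

    void ack(String tuple) {
        acked.add(tuple);
    }

    void fail(String tuple) {
        failed.add(tuple);
    }
}

// Hypothetical stand-in for a writer whose write may be rejected.
interface FallibleWriter {
    void write(String tuple) throws Exception;
}

class AckingBolt {
    private final FallibleWriter writer;
    private final RecordingCollector collector;

    AckingBolt(FallibleWriter writer, RecordingCollector collector) {
        this.writer = writer;
        this.collector = collector;
    }

    void execute(String tuple) {
        try {
            writer.write(tuple);
            collector.ack(tuple);  // ack only after the write succeeded
        } catch (Exception e) {
            collector.fail(tuple); // signal that the tuple should be replayed
        }
    }
}

class AckingDemo {
    public static void main(String[] args) {
        RecordingCollector collector = new RecordingCollector();
        // Simulated failure: any tuple starting with "bad" is rejected.
        FallibleWriter flaky = tuple -> {
            if (tuple.startsWith("bad")) {
                throw new Exception("simulated write failure");
            }
        };
        AckingBolt bolt = new AckingBolt(flaky, collector);
        bolt.execute("ok-1");
        bolt.execute("bad-2");
        bolt.execute("ok-3");
        System.out.println("acked=" + collector.acked + " failed=" + collector.failed);
    }
}
```

Two caveats if this is carried over to a real BatchWriter. Mutations are buffered, so a tuple is only safely written once a flush has succeeded, and acking would need to wait for that. Replayed tuples give at-least-once delivery, which is why the exactly-once concern mentioned above still applies.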
