Posted to user@storm.apache.org by Stephen Powis <sp...@salesforce.com> on 2015/05/14 15:58:39 UTC

Hibernate + Storm

Hello everyone!

I'm currently toying around with a prototype built on top of Storm and have
run into some rough going while trying to work with Hibernate and Storm.
I was hoping to get input on whether this is just a case of "I'm doing it
wrong", or maybe pick up some useful tips.

In my prototype, I need to fan out a single tuple to several bolts that
retrieve data from our database in parallel, and whose outputs then get
merged back into a single stream.  These data retrieval bolts all find
various Hibernate entities and pass them along to the merge bolt.  We've
written a Kryo serializer that converts the Hibernate entities into POJOs,
which get sent to the merge bolt in tuples.  Once all the tuples reach the
merge bolt, it collects them into a single tuple and passes it downstream
to a bolt which does processing using the entities.

So it looks something like this.

                 /--- (retrieve bolt a) ---\
                 |--- (retrieve bolt b) ---|
--- (split bolt) +--- (retrieve bolt c) ---+ (merge bolt) --- (processing bolt)
                 \--- (retrieve bolt d) ---/
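
To make the merge step concrete, here is a minimal sketch of the bookkeeping
a merge bolt like the one above might do.  It is plain Java with no Storm
dependency, and everything in it is an assumption for illustration: the
MergeBuffer class, the idea that each tuple carries a request id, and the
fixed count of four retrieve bolts are not taken from the actual prototype.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Storm: buffers partial results per request
// until every retrieve bolt has reported, then releases them as one batch.
class MergeBuffer<K, V> {
    private final int expectedParts;                        // e.g. 4 retrieve bolts
    private final Map<K, List<V>> pending = new HashMap<>();

    MergeBuffer(int expectedParts) {
        if (expectedParts < 1) {
            throw new IllegalArgumentException("expectedParts must be >= 1");
        }
        this.expectedParts = expectedParts;
    }

    // Records one partial result. Returns the complete batch (in arrival
    // order) once all parts for this request id have arrived, otherwise
    // Optional.empty().
    Optional<List<V>> add(K requestId, V part) {
        List<V> parts = pending.computeIfAbsent(requestId, k -> new ArrayList<>());
        parts.add(part);
        if (parts.size() < expectedParts) {
            return Optional.empty();
        }
        pending.remove(requestId);   // drop finished requests so state stays bounded
        return Optional.of(parts);
    }

    // Number of requests still waiting for at least one part.
    int pendingRequests() {
        return pending.size();
    }
}
```

In a real topology this would presumably sit inside the merge bolt's execute
method, with a fields grouping on the request id so that all parts of one
request reach the same task.  The sketch also ignores requests whose parts
never all arrive; those entries would need to be expired somehow.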

This means detaching the Hibernate entities from the session in order to
serialize them, and then, further downstream when we want to work with the
entities again, reattaching them to a new session.  This seems kind of
awkward.

Does doing the above make sense?  Has anyone attempted to do the above?
Any tips or things we should watch out for?  Basically looking for any kind
of input for this use case.

Thanks!

Re: Hibernate + Storm

Posted by Fan Jiang <dc...@gmail.com>.
It makes sense to me, since the bolts can be distributed over different
supervisors and may have different DB connections. In this case, if you want
to pass Hibernate entities between bolts, you have to detach them from one
connection and then re-attach them to the other.

Sincerely,
Fan Jiang

Re: Hibernate + Storm

Posted by Enno Shioji <es...@gmail.com>.
The reason objects are serialized is so that they can be shipped to another
process. As long as that's what you want, keeping the entities attached
would mean sharing the sessions across processes. I don't think this is
possible or wise!
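
Since the session cannot travel with the tuple, one option that may feel
less awkward is to treat whatever crosses the process boundary as a plain
read-only snapshot that carries the primary key, and to load a fresh managed
entity by id in the downstream bolt only when one is really needed.  Below
is a rough sketch of that idea in plain Java.  All names (AccountEntity,
AccountSnapshot, Rehydrator) are hypothetical, and the loadById function
merely stands in for a lookup against the downstream bolt's own session.

```java
import java.util.Objects;
import java.util.function.Function;

// Stand-in for a Hibernate-mapped entity (hypothetical; a real one would
// carry mapping metadata and possibly lazy associations).
class AccountEntity {
    Long id;
    String name;

    AccountEntity(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Plain immutable copy that is safe to put in a tuple: it holds no session
// reference and no proxies, only values plus the primary key.
final class AccountSnapshot {
    final long id;
    final String name;

    AccountSnapshot(long id, String name) {
        this.id = id;
        this.name = Objects.requireNonNull(name, "name");
    }

    // Copies the fields out of an entity. Assumes the entity has already
    // been persisted, so its id is not null.
    static AccountSnapshot from(AccountEntity entity) {
        return new AccountSnapshot(entity.id, entity.name);
    }
}

// Downstream side: turn a snapshot back into an entity by looking it up
// again, instead of reattaching the deserialized object.
final class Rehydrator {
    static AccountEntity reload(AccountSnapshot snapshot,
                                Function<Long, AccountEntity> loadById) {
        AccountEntity entity = loadById.apply(snapshot.id);
        if (entity == null) {
            throw new IllegalStateException("no row for id " + snapshot.id);
        }
        return entity;
    }
}
```

The trade-off is an extra lookup downstream, in exchange for never having to
reason about the state of a deserialized entity.  If the processing bolt only
reads the data, the snapshot alone may be enough and no reload is needed.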

