Posted to user@spark.apache.org by Kevin Burton <bu...@spinn3r.com> on 2014/12/27 21:23:05 UTC
init / shutdown for complex map job?
I have a job where I want to map over all data in a Cassandra database.
I’m then selectively sending things to my own external system (ActiveMQ) if
the item matches criteria.
The problem is that I need to do some init and shutdown. Basically, on init
I need to create ActiveMQ connections, and on shutdown I need to close them
or daemon threads will be left running.
What’s the best way to accomplish this? I couldn’t find it after I RTFM’d… (but
perhaps I missed it)
--
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
Re: init / shutdown for complex map job?
Posted by Kevin Burton <bu...@spinn3r.com>.
Yes, I can do a just-in-time init, since I can detect when the first map
call happens. However, I don’t think I can detect when the last map call is
done, and the shutdown is the key part. Without it, my daemon threads won’t
exit properly and not all messages will be sent over the wire.
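
One possible shape for this, sketched in plain Scala so it runs without a cluster: do the init once per partition, consume the partition eagerly, and put the shutdown in a finally block. Everything named here (FakeProducer, sendMatching, the open parameter) is hypothetical and only stands in for real ActiveMQ code. With Spark, a function of this shape could be passed to rdd.foreachPartition, which is an action and therefore consumes each partition before returning. This is a sketch under those assumptions, not a tested Spark job.

```scala
object PartitionSendSketch {
  // Hypothetical stand-in for an ActiveMQ producer; NOT a real ActiveMQ API.
  final class FakeProducer {
    var closed = false
    val sent = scala.collection.mutable.ArrayBuffer.empty[String]
    def send(msg: String): Unit = { sent += msg }
    def close(): Unit = { closed = true }
  }

  // Per-partition body: init once, send the matching items, always shut down.
  // `open` is called inside the function so that, under Spark, the connection
  // would be created on the executor rather than serialized from the driver.
  def sendMatching(partition: Iterator[String], open: () => FakeProducer)(
      matches: String => Boolean): Unit = {
    val producer = open() // init
    try {
      // foreach consumes the iterator eagerly, so every item is handled
      // before the finally block runs.
      partition.filter(matches).foreach(item => producer.send(item))
    } finally {
      producer.close() // shutdown, even if sending throws
    }
  }
}
```

Because the iterator is fully drained inside the try block, there is no need to detect the "last" element: the finally block runs after it by construction.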
On Sun, Dec 28, 2014 at 12:18 AM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:
> Something like?
>
> val a = myRDD.mapPartitions(p => {
>   // Do the init
>   // Perform some operations
>   // Shut it down?
> })
>
> Thanks
> Best Regards
>
Re: init / shutdown for complex map job?
Posted by Sean Owen <so...@cloudera.com>.
(Still pending, but believe it's in progress and being written by a
colleague here.)
On Sun, Dec 28, 2014 at 2:41 PM, Ray Melton <rt...@gmail.com> wrote:
> A follow-up to the blog cited below was hinted at, per "But Wait,
> There's More ... To keep this post brief, the remainder will be left to
> a follow-up post."
>
> Is this follow-up pending? Is it sort of pending? Did the follow-up
> happen, but I just couldn't find it on the web?
>
> Regards, Ray.
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
Re: init / shutdown for complex map job?
Posted by Ray Melton <rt...@gmail.com>.
A follow-up to the blog cited below was hinted at, per "But Wait,
There's More ... To keep this post brief, the remainder will be left to
a follow-up post."
Is this follow-up pending? Is it sort of pending? Did the follow-up
happen, but I just couldn't find it on the web?
Regards, Ray.
On Sun, 28 Dec 2014 08:54:13 +0000
Sean Owen <so...@cloudera.com> wrote:
> You can't quite do cleanup in mapPartitions in that way. Here is a
> bit more explanation (farther down):
> http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
> On Dec 28, 2014 8:18 AM, "Akhil Das" <ak...@sigmoidanalytics.com>
> wrote:
>
> > Something like?
> >
> > val a = myRDD.mapPartitions(p => {
> >   // Do the init
> >   // Perform some operations
> >   // Shut it down?
> > })
Re: init / shutdown for complex map job?
Posted by Sean Owen <so...@cloudera.com>.
You can't quite do cleanup in mapPartitions in that way. Here is a bit more
explanation (farther down):
http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
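
A minimal plain-Scala sketch of why the init / work / shutdown pattern is a problem inside mapPartitions, assuming the usual cause: the function returns a lazy iterator, so a shutdown placed before the return runs before any element has been mapped. FakeConnection, naive, CleanupIterator and withCleanup are hypothetical names for illustration; a plain Iterator stands in for a partition. The wrapper shown is one possible workaround, not necessarily the one the linked post describes.

```scala
object LazyCleanupSketch {
  // Hypothetical stand-in for an ActiveMQ connection; NOT a real ActiveMQ API.
  final class FakeConnection {
    var closed = false
    val sent = scala.collection.mutable.ArrayBuffer.empty[String]
    def send(msg: String): Unit = {
      if (closed) throw new IllegalStateException("connection already closed")
      sent += msg
    }
    def close(): Unit = { closed = true }
  }

  // Naive version: map is lazy, so close() runs before any send() does.
  def naive(partition: Iterator[String], conn: FakeConnection): Iterator[String] = {
    val out = partition.map { item => conn.send(item); item }
    conn.close() // too early: nothing has been consumed yet
    out
  }

  // Wrapper that runs the cleanup once the underlying iterator is exhausted.
  final class CleanupIterator[A](underlying: Iterator[A], cleanup: () => Unit)
      extends Iterator[A] {
    private var cleaned = false
    def hasNext: Boolean = {
      val more = underlying.hasNext
      if (!more && !cleaned) { cleaned = true; cleanup() }
      more
    }
    def next(): A = underlying.next()
  }

  // Same mapping as naive, but the shutdown is deferred to exhaustion.
  def withCleanup(partition: Iterator[String], conn: FakeConnection): Iterator[String] =
    new CleanupIterator(partition.map { item => conn.send(item); item }, () => conn.close())
}
```

The wrapper only runs its cleanup if the consumer drains the iterator to the end; a downstream step that stops early would leave the connection open. Spark's TaskContext also offers a task-completion listener hook that may be a more robust place for cleanup, depending on the Spark version in use.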
On Dec 28, 2014 8:18 AM, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:
> Something like?
>
> val a = myRDD.mapPartitions(p => {
>   // Do the init
>   // Perform some operations
>   // Shut it down?
> })
>
> Thanks
> Best Regards
>
Re: init / shutdown for complex map job?
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Something like?
val a = myRDD.mapPartitions(p => {
  // Do the init
  // Perform some operations
  // Shut it down?
})
Thanks
Best Regards