Posted to dev@samza.apache.org by Tommy Becker <to...@Tivo.com> on 2015/01/17 03:25:44 UTC

Job-level lifecycle hooks?

We have some collaborator classes that we need access to not only from task instances but also from custom SerdeFactory implementations.  Unfortunately, Samza doesn't really provide a way to share state, so we've resorted to a singleton service-locator class.  That solves the problem of sharing instances but not the problem of where to initialize them.  I'm curious whether any thought has been given to providing job-level lifecycle hooks. If anyone else has needed to share state within a job, how did you do it?
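
For readers unfamiliar with the pattern, a singleton service locator of the kind described above might look roughly like the following. This is only a sketch under assumptions: ServiceLocator, register, and lookup are hypothetical names, not part of Samza, and the question of where register should be called is exactly the gap this message describes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical container-wide service locator; not a Samza API.
final class ServiceLocator {
    // One map per JVM (and therefore per container), keyed by service type.
    private static final Map<Class<?>, Object> SERVICES = new ConcurrentHashMap<>();

    private ServiceLocator() {}

    // Registers an instance; the first registration for a given type wins.
    static <T> void register(Class<T> type, T instance) {
        SERVICES.putIfAbsent(type, instance);
    }

    // Returns the shared instance, or throws if nothing was registered.
    static <T> T lookup(Class<T> type) {
        Object service = SERVICES.get(type);
        if (service == null) {
            throw new IllegalStateException("No service registered for " + type.getName());
        }
        return type.cast(service);
    }
}
```

Both a task and a serde factory could then call ServiceLocator.lookup for the same type and receive the same object, provided something has registered it first.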

-Tommy

________________________________

This email and any attachments may contain confidential and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments) by others is prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete this email and any attachments. No employee or agent of TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed written agreement.

RE: Job-level lifecycle hooks?

Posted by Tommy Becker <to...@Tivo.com>.
Ok, I just opened https://issues.apache.org/jira/browse/SAMZA-514 and included some initial thoughts for an implementation.
________________________________________
From: Chinmay Soman [chinmay.cerebro@gmail.com]
Sent: Saturday, January 17, 2015 12:28 PM
To: dev@samza.incubator.apache.org
Subject: RE: Job-level lifecycle hooks?

Totally agreed. Can you please open a ticket for this?


RE: Job-level lifecycle hooks?

Posted by Chinmay Soman <ch...@gmail.com>.
Totally agreed. Can you please open a ticket for this?
On Jan 17, 2015 7:03 AM, "Tommy Becker" <to...@tivo.com> wrote:

> Thanks for your reply.  I suppose "state" wasn't really the best term to
> use; our real need is simply to share objects between task instances and
> serde factories within a single container.  Concretely, we have an Avro
> schema registry that we need access to.  It maintains connections to
> ZooKeeper, so we would prefer to have only one per container.  We have what
> I would consider a hacky implementation working via singletons, but it would
> be nicer if we could pass instances around in some sort of job-level context
> object.
>
> -Tommy

RE: Job-level lifecycle hooks?

Posted by Tommy Becker <to...@Tivo.com>.
Thanks for your reply.  I suppose "state" wasn't really the best term to use; our real need is simply to share objects between task instances and serde factories within a single container.  Concretely, we have an Avro schema registry that we need access to.  It maintains connections to ZooKeeper, so we would prefer to have only one per container.  We have what I would consider a hacky implementation working via singletons, but it would be nicer if we could pass instances around in some sort of job-level context object.

-Tommy


________________________________________
From: Chinmay Soman [chinmay.cerebro@gmail.com]
Sent: Friday, January 16, 2015 9:43 PM
To: dev@samza.incubator.apache.org
Subject: Re: Job-level lifecycle hooks?

I don't think there's a built-in way. One "hacky" way to do it is to have a
static flag that indicates whether the shared instance's initialization
is complete.

As far as shared state goes, there's a ticket tracking this issue:
SAMZA-402 <https://issues.apache.org/jira/browse/SAMZA-402>. But I don't
think there's any ticket for the use case you mention.

--
Thanks and regards

Chinmay Soman


Re: Job-level lifecycle hooks?

Posted by Chinmay Soman <ch...@gmail.com>.
I don't think there's a built-in way. One "hacky" way to do it is to have a
static flag that indicates whether the shared instance's initialization
is complete.
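
A minimal sketch of that static-flag idea, assuming plain Java and nothing Samza-specific: SharedInstance and getOrInit are hypothetical names, and the factory argument stands in for whatever actually builds the shared object (for example, a schema registry client).

```java
import java.util.function.Supplier;

// Hypothetical holder for a container-wide shared object; not a Samza API.
final class SharedInstance {
    // The "static flag": true once initialization has completed.
    private static boolean initialized = false;
    private static Object instance;

    private SharedInstance() {}

    // Runs the factory at most once per JVM; later callers get the same object.
    // synchronized so that no caller observes a half-initialized instance.
    static synchronized Object getOrInit(Supplier<?> factory) {
        if (!initialized) {
            instance = factory.get();
            initialized = true;
        }
        return instance;
    }
}
```

Whichever caller reaches getOrInit first, whether task initialization or a serde factory, performs the setup, and everyone else reuses the result. The cost is that every call site has to know how to build the object, which is part of why this counts as a workaround rather than a real lifecycle hook.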

As far as shared state goes, there's a ticket tracking this issue:
SAMZA-402 <https://issues.apache.org/jira/browse/SAMZA-402>. But I don't
think there's any ticket for the use case you mention.

On Fri, Jan 16, 2015 at 6:25 PM, Tommy Becker <to...@tivo.com> wrote:

> We have some collaborator classes that we need access to not only from
> task instances but also from custom SerdeFactory implementations.
> Unfortunately, Samza doesn't really provide a way to share state, so we've
> resorted to a singleton service-locator class.  That solves the problem of
> sharing instances but not the problem of where to initialize them.  I'm
> curious whether any thought has been given to providing job-level lifecycle
> hooks. If anyone else has needed to share state within a job, how did you
> do it?
>
> -Tommy



-- 
Thanks and regards

Chinmay Soman