Posted to dev@flink.apache.org by Chen Qin <qi...@gmail.com> on 2017/07/06 05:19:40 UTC

Make SubmittedJobGraphStore configurable

Hi there,

I would like to propose and discuss a medium-sized refactoring to make
SubmittedJobGraphStore configurable and extensible.

The rationale is to allow users to offload this metadata to durable,
cross-DC storage with strong read-after-write consistency, and to decouple
it from the ZooKeeper quorum.


https://issues.apache.org/jira/browse/FLINK-7106

The new configurable settings in flink.conf would look like the following:

graph-store: customized/zookeeper
graph-store.class: xx.yy.MyS3SubmittedJobGraphStoreImp
graph-store.endpoint: s3.amazonaws.com
graph-store.path.root: s3://my root/
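
For illustration only, here is a minimal Java sketch of how settings like these might be consumed. It is not Flink code: JobGraphStore, ZooKeeperStore, MyS3Store and GraphStoreFactory are hypothetical stand-ins, and java.util.Properties is used in place of Flink's configuration classes. The assumption is that "graph-store" selects between the built-in default and a user class named by "graph-store.class", which is then loaded reflectively.

```java
import java.util.Properties;

final class GraphStoreFactory {

    /** Hypothetical, simplified stand-in for Flink's SubmittedJobGraphStore. */
    public interface JobGraphStore {
        String describe();
    }

    /** Placeholder for the built-in ZooKeeper-backed default. */
    public static final class ZooKeeperStore implements JobGraphStore {
        @Override
        public String describe() {
            return "zookeeper";
        }
    }

    /** Placeholder for a user-provided store; a real one would talk to S3. */
    public static final class MyS3Store implements JobGraphStore {
        @Override
        public String describe() {
            return "s3";
        }
    }

    /** Picks the store from the (assumed) "graph-store" and "graph-store.class" keys. */
    static JobGraphStore create(Properties conf) {
        String type = conf.getProperty("graph-store", "zookeeper");
        if (!"customized".equals(type)) {
            return new ZooKeeperStore();
        }
        String className = conf.getProperty("graph-store.class");
        if (className == null) {
            throw new IllegalArgumentException(
                    "graph-store.class must be set when graph-store is 'customized'");
        }
        try {
            // Load the user class by name and require a no-argument constructor.
            return Class.forName(className)
                    .asSubclass(JobGraphStore.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("graph-store", "customized");
        conf.setProperty("graph-store.class", MyS3Store.class.getName());
        System.out.println(create(conf).describe()); // prints "s3"
    }
}
```

Requiring a no-argument constructor keeps the sketch short; a real integration would more likely pass the configuration (endpoint, root path) to the store.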

Thanks,
Chen

Re: Make SubmittedJobGraphStore configurable

Posted by Chen Qin <qi...@gmail.com>.

Hi Till,

As far as I know, there is interest in keeping job graphs recoverable through hiccups of a shared ZooKeeper, or in standalone mode with customized leader election.

I plan to spend a bit of time prototyping a backup to Amazon S3. I will keep folks updated as soon as I get the happy path going.

Thanks,
Chen

> On Jul 25, 2017, at 6:07 AM, Till Rohrmann <tr...@apache.org> wrote:
> 
> If there is a need for this, then we can definitely make this configurable.
> The interface SubmittedJobGraphStore is already there.
> 
> Cheers,
> Till

Re: Make SubmittedJobGraphStore configurable

Posted by Till Rohrmann <tr...@apache.org>.

If there is a need for this, then we can definitely make this configurable.
The interface SubmittedJobGraphStore is already there.

Cheers,
Till


On Fri, Jul 7, 2017 at 6:32 AM, Chen Qin <qi...@gmail.com> wrote:

> Sure,
> I would imagine a couple of extra lines in flink.conf:
> ...
> graphstore.type: customized/zookeeper
> graphstore.class: org.apache.flink.contrib.MyS3SubmittedJobGraphStoreImp
> graphstore.endpoint: s3.amazonaws.com
> graphstore.path.root: s3://my root/
>
> which would override the initialization of
>
> *org.apache.flink.runtime.highavailability.HighAvailabilityServices*
>
> /**
>  * Gets the submitted job graph store for the job manager
>  *
>  * @return Submitted job graph store
>  * @throws Exception if the submitted job graph store could not be created
>  */
> SubmittedJobGraphStore getSubmittedJobGraphStore() throws Exception;
>
> In this case, users implement their own S3-backed job graph store, which
> stores job graphs in S3 instead of ZooKeeper (high availability) or
> not at all (non-HA).
>
> I find [1] somewhat related, though it focuses more on the life cycle and
> dependency aspects of the graph store and checkpoint store. FLINK-7106 is
> limited to enabling users to implement their own job graph store instead
> of being hardcoded to ZooKeeper.
>
> Thanks,
> Chen
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-6626

Re: Make SubmittedJobGraphStore configurable

Posted by Chen Qin <qi...@gmail.com>.

Sure,
I would imagine a couple of extra lines in flink.conf:
...
graphstore.type: customized/zookeeper
graphstore.class: org.apache.flink.contrib.MyS3SubmittedJobGraphStoreImp
graphstore.endpoint: s3.amazonaws.com
graphstore.path.root: s3://my root/

which would override the initialization of

*org.apache.flink.runtime.highavailability.HighAvailabilityServices*

/**
 * Gets the submitted job graph store for the job manager
 *
 * @return Submitted job graph store
 * @throws Exception if the submitted job graph store could not be created
 */
SubmittedJobGraphStore getSubmittedJobGraphStore() throws Exception;

In this case, users implement their own S3-backed job graph store, which
stores job graphs in S3 instead of ZooKeeper (high availability) or
not at all (non-HA).
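
To make this concrete, a user-provided store could look roughly like the sketch below. It assumes a trimmed-down, hypothetical interface (SimpleJobGraphStore) rather than Flink's actual SubmittedJobGraphStore, uses string job IDs and pre-serialized bytes, and replaces S3 with an in-memory map so that it runs standalone. The root path "s3://my-bucket/jobgraphs" is a made-up example value.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class JobGraphStoreSketch {

    /** Hypothetical, trimmed-down interface; the real SubmittedJobGraphStore differs. */
    interface SimpleJobGraphStore {
        void putJobGraph(String jobId, byte[] serializedGraph) throws Exception;

        /** Returns the stored bytes, or null if no graph exists for the job. */
        byte[] recoverJobGraph(String jobId) throws Exception;

        void removeJobGraph(String jobId) throws Exception;

        Collection<String> getJobIds() throws Exception;
    }

    /** Stand-in for an S3-backed store: a map keyed by "<root>/<jobId>". */
    static final class InMemoryBlobStore implements SimpleJobGraphStore {
        private final String root;
        private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

        InMemoryBlobStore(String root) {
            // Normalize the root so keys always look like "<root>/<jobId>".
            this.root = root.endsWith("/") ? root : root + "/";
        }

        private String key(String jobId) {
            return root + jobId;
        }

        @Override
        public void putJobGraph(String jobId, byte[] serializedGraph) {
            // Copy defensively so later changes by the caller are not visible here.
            blobs.put(key(jobId), serializedGraph.clone());
        }

        @Override
        public byte[] recoverJobGraph(String jobId) {
            byte[] graph = blobs.get(key(jobId));
            return graph == null ? null : graph.clone();
        }

        @Override
        public void removeJobGraph(String jobId) {
            blobs.remove(key(jobId));
        }

        @Override
        public Collection<String> getJobIds() {
            List<String> ids = new ArrayList<>();
            for (String k : blobs.keySet()) {
                ids.add(k.substring(root.length()));
            }
            return ids;
        }
    }

    public static void main(String[] args) throws Exception {
        SimpleJobGraphStore store = new InMemoryBlobStore("s3://my-bucket/jobgraphs");
        store.putJobGraph("job-1", new byte[] {1, 2, 3});
        System.out.println(store.getJobIds());
        store.removeJobGraph("job-1");
        System.out.println(store.recoverJobGraph("job-1") == null);
    }
}
```

A real S3-backed version would replace the map operations with object PUT, GET, DELETE and LIST calls under the configured root path.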

I find [1] somewhat related, though it focuses more on the life cycle and
dependency aspects of the graph store and checkpoint store. FLINK-7106 is
limited to enabling users to implement their own job graph store instead
of being hardcoded to ZooKeeper.

Thanks,
Chen


[1] https://issues.apache.org/jira/browse/FLINK-6626


On Thu, Jul 6, 2017 at 2:47 AM, Ted Yu <yu...@gmail.com> wrote:

> The sample config entries are broken into multiple lines.
>
> Can you send the config again with one entry per line?
>
> Cheers

Re: Make SubmittedJobGraphStore configurable

Posted by Ted Yu <yu...@gmail.com>.

The sample config entries are broken into multiple lines.

Can you send the config again with one entry per line?

Cheers

On Wed, Jul 5, 2017 at 10:19 PM, Chen Qin <qi...@gmail.com> wrote:

> Hi there,
>
> I would like to propose and discuss a medium-sized refactoring to make
> SubmittedJobGraphStore configurable and extensible.
>
> The rationale is to allow users to offload this metadata to durable,
> cross-DC storage with strong read-after-write consistency, and to decouple
> it from the ZooKeeper quorum.
> 
>
> https://issues.apache.org/jira/browse/FLINK-7106
>
> New configurable setting in flink.conf
>  looks like following
>
> g
> raph
> -s
> tore:
> customized/zookeeper
> g
> raph
> -s
> tore.class: xx.yy.MyS3SubmittedJobGraphStoreImp
>
> g
> raph
> -s
> tore.
> endpoint
> : s3.amazonaws.com
> g
> raph
> -s
> tore.path.root:
> s3:/
> 
> /
> my root/
>
> Thanks,
> Chen
>