Posted to user@hadoop.apache.org by Suma Shivaprasad <su...@gmail.com> on 2015/02/11 06:59:10 UTC

FileSystem Vs ZKStateStore for RM recovery

We are planning to deploy Hadoop 2.6.0 with the default configuration, which
caches 10000 entries in the state store. With a workload of 150-250
concurrent applications at any time, which state store is better to use,
and for what reasons?

Thanks
Suma

Re: FileSystem Vs ZKStateStore for RM recovery

Posted by Suma Shivaprasad <su...@gmail.com>.
We have set yarn.resourcemanager.max-completed-applications=10000. I assume
this is the number of entries kept in the RMStateStore, since I see this in
the logs:

2015-02-11 00:00:00,579 INFO
org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Max number of
completed apps kept in state store met: maxCompletedAppsInStateStore =
10000, removing app application_1422879343468_543109 from state store.
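
The log line above is consistent with a bounded list of completed applications: once the limit is reached, the oldest completed application is dropped from the state store. A minimal Python sketch of that behaviour follows. It is an illustration only, not the RMAppManager code; the class name, method name, and application ids are hypothetical.

```python
from collections import deque


class CompletedAppTracker:
    """Toy model of a bounded list of completed applications.

    Hypothetical illustration only; not the ResourceManager implementation.
    """

    def __init__(self, max_completed_apps):
        self.max_completed_apps = max_completed_apps
        self.completed = deque()  # oldest completed app sits on the left

    def app_completed(self, app_id):
        """Record a completed app and return the ids evicted to stay in bounds."""
        self.completed.append(app_id)
        evicted = []
        while len(self.completed) > self.max_completed_apps:
            evicted.append(self.completed.popleft())
        return evicted


# A limit of 3 stands in for the 10000 used in this thread.
tracker = CompletedAppTracker(max_completed_apps=3)
removed = []
for n in range(1, 6):
    removed += tracker.app_completed(f"application_0000000000000_{n:04d}")

print(removed)                  # the two oldest ids
print(list(tracker.completed))  # the three most recent ids
```

With a limit of 10000, the same logic would start removing entries only after the 10001st application completes.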

Thanks
Suma


On Wed, Feb 11, 2015 at 1:09 PM, Nikhil <mn...@gmail.com> wrote:

> >>cache 10000 entries in the state store.
> May I know what are 10000 entries you are talking about here? Any details
> please.
>
> On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad <
> sumasai.shivaprasad@gmail.com> wrote:
>
>> We are planning to deploy Hadoop 2.6.0 with a default configuration to
>> cache 10000 entries in the state store. With a workload of 150-250
>> concurrent applications at any time , which state store is better to use
>> and for what reasons ?
>>
>> Thanks
>> Suma
>>
>
>

Re: FileSystem Vs ZKStateStore for RM recovery

Posted by Nikhil <mn...@gmail.com>.
>>cache 10000 entries in the state store.
May I know which 10000 entries you are talking about here? Any details,
please?

On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad <
sumasai.shivaprasad@gmail.com> wrote:

> We are planning to deploy Hadoop 2.6.0 with a default configuration to
> cache 10000 entries in the state store. With a workload of 150-250
> concurrent applications at any time , which state store is better to use
> and for what reasons ?
>
> Thanks
> Suma
>

RE: FileSystem Vs ZKStateStore for RM recovery

Posted by Brahma Reddy Battula <br...@huawei.com>.
Yes, you can configure yarn.resourcemanager.max-completed-applications based on your usage. All the application state (the RMStateStore) will be stored in ZooKeeper. As Tsuyoshi mentioned, ZooKeeper will support it.




Thanks & Regards
Brahma Reddy Battula
________________________________________
From: Tsuyoshi Ozawa [ozawa@apache.org]
Sent: Friday, February 13, 2015 12:30 PM
To: Tsuyoshi Ozawa
Cc: user@hadoop.apache.org; yarn-dev@hadoop.apache.org
Subject: Re: FileSystem Vs ZKStateStore for RM recovery

> I think ZooKeeper can handle thousands of updates,

I meant "thousands of updates per second".

Thanks,
- Tsuyoshi

On Fri, Feb 13, 2015 at 3:59 PM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
> Hi Suma,
>
> I think ZooKeeper can handle thousands of updates, so thousands of
> jobs can be "launched" at the same time.
> More jobs can be running at the same time since the number of updates
> against ZooKeeper is less than the number of jobs. Please free to ask
> us if you face the scalability or performance issue when you test. We
> can tackle the issue.
>
> Thanks,
> - Tsuyoshi
>
> On Wed, Feb 11, 2015 at 6:08 PM, Suma Shivaprasad
> <su...@gmail.com> wrote:
>> Can ZKStateStore scale for large clusters. Any idea on the number of
>> concurrent jobs that can be supported on top of these ?
>>
>> Thanks
>> Suma
>>
>> On Wed, Feb 11, 2015 at 1:45 PM, Karthik Kambatla <ka...@cloudera.com>
>> wrote:
>>>
>>> We recommend ZK-store, particularly if you plan to deploy multiple
>>> ResourceManagers with failover. ZK-store ensures a single RM has write
>>> access and thus is better protected against split-brain cases where both RMs
>>> think they are active.
>>>
>>> On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad
>>> <su...@gmail.com> wrote:
>>>>
>>>> We are planning to deploy Hadoop 2.6.0 with a default configuration to
>>>> cache 10000 entries in the state store. With a workload of 150-250
>>>> concurrent applications at any time , which state store is better to use
>>>> and for what reasons ?
>>>>
>>>> Thanks
>>>> Suma
>>>
>>>
>>>
>>>
>>> --
>>> Karthik Kambatla
>>> Software Engineer, Cloudera Inc.
>>> --------------------------------------------
>>> http://five.sentenc.es
>>>
>>

Re: FileSystem Vs ZKStateStore for RM recovery

Posted by Tsuyoshi Ozawa <oz...@apache.org>.
> I think ZooKeeper can handle thousands of updates,

I meant "thousands of updates per second".

Thanks,
- Tsuyoshi

On Fri, Feb 13, 2015 at 3:59 PM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
> Hi Suma,
>
> I think ZooKeeper can handle thousands of updates, so thousands of
> jobs can be "launched" at the same time.
> More jobs can be running at the same time since the number of updates
> against ZooKeeper is less than the number of jobs. Please free to ask
> us if you face the scalability or performance issue when you test. We
> can tackle the issue.
>
> Thanks,
> - Tsuyoshi
>
> On Wed, Feb 11, 2015 at 6:08 PM, Suma Shivaprasad
> <su...@gmail.com> wrote:
>> Can ZKStateStore scale for large clusters. Any idea on the number of
>> concurrent jobs that can be supported on top of these ?
>>
>> Thanks
>> Suma
>>
>> On Wed, Feb 11, 2015 at 1:45 PM, Karthik Kambatla <ka...@cloudera.com>
>> wrote:
>>>
>>> We recommend ZK-store, particularly if you plan to deploy multiple
>>> ResourceManagers with failover. ZK-store ensures a single RM has write
>>> access and thus is better protected against split-brain cases where both RMs
>>> think they are active.
>>>
>>> On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad
>>> <su...@gmail.com> wrote:
>>>>
>>>> We are planning to deploy Hadoop 2.6.0 with a default configuration to
>>>> cache 10000 entries in the state store. With a workload of 150-250
>>>> concurrent applications at any time , which state store is better to use
>>>> and for what reasons ?
>>>>
>>>> Thanks
>>>> Suma
>>>
>>>
>>>
>>>
>>> --
>>> Karthik Kambatla
>>> Software Engineer, Cloudera Inc.
>>> --------------------------------------------
>>> http://five.sentenc.es
>>>
>>

Re: FileSystem Vs ZKStateStore for RM recovery

Posted by Tsuyoshi Ozawa <oz...@apache.org>.
Hi Suma,

I think ZooKeeper can handle thousands of updates, so thousands of
jobs can be "launched" at the same time.
More jobs can be running at the same time since the number of updates
against ZooKeeper is less than the number of jobs. Please feel free to ask
us if you face scalability or performance issues when you test. We
can tackle them.
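
A rough way to sanity-check this for a given workload is to estimate the state-store write rate. The sketch below is back-of-envelope arithmetic only. The number of writes per application and the average application runtime are assumed values, not measurements, and should be replaced with figures from your own cluster; the function name is hypothetical.

```python
def estimated_writes_per_second(concurrent_apps, avg_runtime_s, writes_per_app):
    """Steady-state estimate: apps start and finish at concurrent/runtime per second."""
    apps_per_second = concurrent_apps / avg_runtime_s
    return apps_per_second * writes_per_app


# Assumed values for illustration: 250 concurrent apps (the upper end of the
# workload in this thread), 60 s average runtime, 4 store writes per app.
rate = estimated_writes_per_second(
    concurrent_apps=250, avg_runtime_s=60.0, writes_per_app=4
)
print(f"{rate:.1f} writes/s")
```

Under these assumptions the estimate is about 17 writes per second. Shorter-lived applications or more writes per application raise the figure proportionally.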

Thanks,
- Tsuyoshi

On Wed, Feb 11, 2015 at 6:08 PM, Suma Shivaprasad
<su...@gmail.com> wrote:
> Can ZKStateStore scale for large clusters. Any idea on the number of
> concurrent jobs that can be supported on top of these ?
>
> Thanks
> Suma
>
> On Wed, Feb 11, 2015 at 1:45 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>>
>> We recommend ZK-store, particularly if you plan to deploy multiple
>> ResourceManagers with failover. ZK-store ensures a single RM has write
>> access and thus is better protected against split-brain cases where both RMs
>> think they are active.
>>
>> On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad
>> <su...@gmail.com> wrote:
>>>
>>> We are planning to deploy Hadoop 2.6.0 with a default configuration to
>>> cache 10000 entries in the state store. With a workload of 150-250
>>> concurrent applications at any time , which state store is better to use
>>> and for what reasons ?
>>>
>>> Thanks
>>> Suma
>>
>>
>>
>>
>> --
>> Karthik Kambatla
>> Software Engineer, Cloudera Inc.
>> --------------------------------------------
>> http://five.sentenc.es
>>
>

Re: FileSystem Vs ZKStateStore for RM recovery

Posted by Suma Shivaprasad <su...@gmail.com>.
Can ZKStateStore scale for large clusters? Any idea of the number of
concurrent jobs it can support?

Thanks
Suma

On Wed, Feb 11, 2015 at 1:45 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> We recommend ZK-store, particularly if you plan to deploy multiple
> ResourceManagers with failover. ZK-store ensures a single RM has write
> access and thus is better protected against split-brain cases where both
> RMs think they are active.
>
> On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad <
> sumasai.shivaprasad@gmail.com> wrote:
>
>> We are planning to deploy Hadoop 2.6.0 with a default configuration to
>> cache 10000 entries in the state store. With a workload of 150-250
>> concurrent applications at any time , which state store is better to use
>> and for what reasons ?
>>
>> Thanks
>> Suma
>>
>
>
>
> --
> Karthik Kambatla
> Software Engineer, Cloudera Inc.
> --------------------------------------------
> http://five.sentenc.es
>
>

Re: FileSystem Vs ZKStateStore for RM recovery

Posted by Karthik Kambatla <ka...@cloudera.com>.
We recommend ZK-store, particularly if you plan to deploy multiple
ResourceManagers with failover. ZK-store ensures a single RM has write
access and thus is better protected against split-brain cases where both
RMs think they are active.
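
For reference, a minimal sketch of the yarn-site.xml properties that select the ZooKeeper-based store, embedded in a small Python check. The ZooKeeper host names are placeholders and the helper name is hypothetical; please verify the property names against the documentation for your Hadoop release.

```python
import xml.etree.ElementTree as ET

# Example yarn-site.xml fragment. Host names are placeholders.
YARN_SITE = """
<configuration>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.findtext("name").strip(): (prop.findtext("value") or "").strip()
        for prop in root.findall("property")
    }


props = load_properties(YARN_SITE)
# Recovery must be enabled, the store class must be the ZK one,
# and a ZooKeeper quorum address must be present.
uses_zk_store = (
    props.get("yarn.resourcemanager.recovery.enabled") == "true"
    and props.get("yarn.resourcemanager.store.class", "").endswith("ZKRMStateStore")
    and bool(props.get("yarn.resourcemanager.zk-address"))
)
print("ZK state store configured:", uses_zk_store)
```

The same check can be pointed at a real yarn-site.xml by reading the file contents into `load_properties` instead of using the embedded string.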

On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad <
sumasai.shivaprasad@gmail.com> wrote:

> We are planning to deploy Hadoop 2.6.0 with a default configuration to
> cache 10000 entries in the state store. With a workload of 150-250
> concurrent applications at any time , which state store is better to use
> and for what reasons ?
>
> Thanks
> Suma
>



-- 
Karthik Kambatla
Software Engineer, Cloudera Inc.
--------------------------------------------
http://five.sentenc.es
