Posted to dev@geode.apache.org by Wes Williams <ww...@pivotal.io> on 2015/05/26 22:17:59 UTC

Geode Cluster Sizing Online Feature

I think another useful feature to access from the Geode web site is a
system sizing spreadsheet. You plug in object size, # records, key size, #
indexes, whether you have stats enabled, etc., and it gives you the
recommended # of cache servers and CPUs.

Where is the request backlog again?

Thanks,

Wes Williams | Pivotal Sr. Data Engineer
781.606.0325
http://pivotal.io/big-data/pivotal-gemfire

Re: Geode Cluster Sizing Online Feature

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On Thu, May 28, 2015 at 5:29 AM, Edin Zulich <ez...@pivotal.io> wrote:
>
> Wes,
>
> If I may put in my 2c on this… First, I (and everybody I know) agree that we should keep working on making Geode as easy to use as possible. And we are trying. And we’ll keep on trying.
>
> Second, I also have to say that it will never be easy. And I think that it’s OK, and even good to call that out in an engagement, the sooner the better… Instead of trying to think how to “improve the presentation of EASY” (which I think is misleading), say that it is not easy. Why? Because there is no easy way to build high performance systems. That’s what it comes down to, and that is one of the first things I say to anyone thinking about anything high performance. Forget Geode; take any other in-memory data technology instead: you’ll have sizing and performance challenges; the higher the performance, the harder the challenges.
>
> Having a tool, such as a sizing spreadsheet, does not have to be misleading, as long as it’s made clear that no tool can provide an exact answer, just some kind of approximation. That is why “Sizing a Geode Cluster” talks about the sizing process the way it does.

This will probably make a great ApacheCon EU talk.  *nudge*  (There
will be a significant Big Data focus in Budapest.)

Then, we'll have slides to post on the ApacheGeode slideshare
account...and maybe videos too!  =)

Cheers.  -- justin

Re: Geode Cluster Sizing Online Feature

Posted by Udo Kohlmeyer <uk...@pivotal.io>.

+2 (deserves 2 not 1)
On 28 May 2015 7:29 pm, "Edin Zulich" <ez...@pivotal.io> wrote:

>
> Wes,
>
> If I may put in my 2c on this… First, I (and everybody I know) agree that
> we should keep working on making Geode as easy to use as possible. And we
> are trying. And we’ll keep on trying.
>
> Second, I also have to say that it will never be easy. And I think that
> it’s OK, and even good to call that out in an engagement, the sooner the
> better… Instead of trying to think how to “improve the presentation of
> EASY” (which I think is misleading), say that it is not easy. Why? Because
> there is no easy way to build high performance systems. That’s what it
> comes down to, and that is one of the first things I say to anyone thinking
> about anything high performance. Forget Geode; take any other in-memory
> data technology instead: you’ll have sizing and performance challenges; the
> higher the performance, the harder the challenges.
>
> Having a tool, such as a sizing spreadsheet, does not have to be
> misleading, as long as it’s made clear that no tool can provide an exact
> answer, just some kind of approximation. That is why “Sizing a Geode
> Cluster” talks about the sizing process the way it does.
>
> Edin
>
> On May 27, 2015, at 6:36 PM, Real Wes <Th...@outlook.com> wrote:
>
> > Unfortunately I am painfully aware of that.
> >
> > The motivation is to make Geode/ GemFire EASY (as possible) to use and
> set up as a balance to being painstakingly accurate but simultaneously
> giving the impression of reading “a complex and difficult user’s guide"
> >
> > What prompts this is a do-it-yourself PoC last week where the
> intelligent architect was so frustrated by out-of-memory exceptions that he
> lost 3 days thinking that Geode either had a bug or that he was doing
> something wrong but did not know what. He sized the memory by his own
> intuition but failed to account for index overhead. Also, to your point, he
> was doing heavy put (“insert”) activity that required yet more overhead. I
> forwarded Mike’s spreadsheet to him and he was profusely thankful.
> >
> > I wasn’t aware of the page that you cited and having Mike’s spreadsheet
> online finally is welcome. Still, I’m trying to create an impression for
> Geode/ GemFire as “easy” (as possible). What conveys an impression of
> “complex” is linking to a comprehensive detailed chapter on "Memory
> Requirements for Cached Data <
> http://geode-docs.cfapps.io/docs/reference/topics/memory_requirements_for_cache_data.html>”
> for the user to eventually find out that an index is up to 243 bytes. Can
> we make all of this easier?
> >
> > I propose that we present the calculations along with some check boxes
> or radio buttons that automate the additional calculations of “overflow”,
> “persistence” “# indexes”, “expiration”, “Insert Activity (choose one):
> Heavy, Balanced Insert and Query, Light”?  I am convinced that we can
> improve the presentation of “EASY” as a face to new users and evaluators.
> >
> > Thoughts on making it “EASY”? Or do you think such a tool could not
> avoid being misleading?
> >
> >
> >
> >> On May 26, 2015, at 5:07 PM, William Markito <wm...@pivotal.io>
> wrote:
> >>
> >> Hi Wes, feel free to create a JIRA for that if you would like but please
> >> note that sizing should take into consideration the specifics of the
> >> application which may not be easy captured in such estimating math
> >> efforts...
> >>
> >> Some more information is already provided in our wiki at
> >>
> https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster
> >>
> >>
> >>
> >> On Tue, May 26, 2015 at 1:17 PM, Wes Williams <ww...@pivotal.io>
> wrote:
> >>
> >>> I think another useful feature to access from the Geode web site is a
> >>> system sizing spreadsheet. You plug in object size, # records, key
> size, #
> >>> indexes, whether you have stats enabled, etc., etc. and it gives you
> the
> >>> recommended # cache servers, cpu's.
> >>>
> >>> Where is the request backlog again?
> >>>
> >>> Thanks,
> >>>
> >>> *Wes Williams | Pivotal Sr. **Data Engineer*
> >>> 781.606.0325
> >>> http://pivotal.io/big-data/pivotal-gemfire
> >>>
> >>
> >>
> >>
> >> --
> >>
> >> William Markito Oliveira
> >>
> >> -- For questions about Apache Geode, please write to
> >> *dev@geode.incubator.apache.org
> >> <de...@geode.incubator.apache.org>*
> >
>
>

Re: Geode Cluster Sizing Online Feature

Posted by Edin Zulich <ez...@pivotal.io>.

Wes,

If I may put in my 2c on this… First, I (and everybody I know) agree that we should keep working on making Geode as easy to use as possible. And we are trying. And we’ll keep on trying. 

Second, I also have to say that it will never be easy. And I think that it’s OK, and even good to call that out in an engagement, the sooner the better… Instead of trying to think how to “improve the presentation of EASY” (which I think is misleading), say that it is not easy. Why? Because there is no easy way to build high performance systems. That’s what it comes down to, and that is one of the first things I say to anyone thinking about anything high performance. Forget Geode; take any other in-memory data technology instead: you’ll have sizing and performance challenges; the higher the performance, the harder the challenges. 

Having a tool, such as a sizing spreadsheet, does not have to be misleading, as long as it’s made clear that no tool can provide an exact answer, just some kind of approximation. That is why “Sizing a Geode Cluster” talks about the sizing process the way it does. 

Edin

On May 27, 2015, at 6:36 PM, Real Wes <Th...@outlook.com> wrote:

> Unfortunately I am painfully aware of that. 
> 
> The motivation is to make Geode/ GemFire EASY (as possible) to use and set up as a balance to being painstakingly accurate but simultaneously giving the impression of reading “a complex and difficult user’s guide"
> 
> What prompts this is a do-it-yourself PoC last week where the intelligent architect was so frustrated by out-of-memory exceptions that he lost 3 days thinking that Geode either had a bug or that he was doing something wrong but did not know what. He sized the memory by his own intuition but failed to account for index overhead. Also, to your point, he was doing heavy put (“insert”) activity that required yet more overhead. I forwarded Mike’s spreadsheet to him and he was profusely thankful.
> 
> I wasn’t aware of the page that you cited and having Mike’s spreadsheet online finally is welcome. Still, I’m trying to create an impression for Geode/ GemFire as “easy” (as possible). What conveys an impression of “complex” is linking to a comprehensive detailed chapter on "Memory Requirements for Cached Data <http://geode-docs.cfapps.io/docs/reference/topics/memory_requirements_for_cache_data.html>” for the user to eventually find out that an index is up to 243 bytes. Can we make all of this easier?
> 
> I propose that we present the calculations along with some check boxes or radio buttons that automate the additional calculations of “overflow”, “persistence” “# indexes”, “expiration”, “Insert Activity (choose one): Heavy, Balanced Insert and Query, Light”?  I am convinced that we can improve the presentation of “EASY” as a face to new users and evaluators.
> 
> Thoughts on making it “EASY”? Or do you think such a tool could not avoid being misleading?
> 
> 
> 
>> On May 26, 2015, at 5:07 PM, William Markito <wm...@pivotal.io> wrote:
>> 
>> Hi Wes, feel free to create a JIRA for that if you would like but please
>> note that sizing should take into consideration the specifics of the
>> application which may not be easy captured in such estimating math
>> efforts...
>> 
>> Some more information is already provided in our wiki at
>> https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster
>> 
>> 
>> 
>> On Tue, May 26, 2015 at 1:17 PM, Wes Williams <ww...@pivotal.io> wrote:
>> 
>>> I think another useful feature to access from the Geode web site is a
>>> system sizing spreadsheet. You plug in object size, # records, key size, #
>>> indexes, whether you have stats enabled, etc., etc. and it gives you the
>>> recommended # cache servers, cpu's.
>>> 
>>> Where is the request backlog again?
>>> 
>>> Thanks,
>>> 
>>> *Wes Williams | Pivotal Sr. **Data Engineer*
>>> 781.606.0325
>>> http://pivotal.io/big-data/pivotal-gemfire
>>> 
>> 
>> 
>> 
>> -- 
>> 
>> William Markito Oliveira
>> 
>> -- For questions about Apache Geode, please write to
>> *dev@geode.incubator.apache.org
>> <de...@geode.incubator.apache.org>*
>

Re: Geode Cluster Sizing Online Feature

Posted by Real Wes <Th...@outlook.com>.

Unfortunately I am painfully aware of that. 

The motivation is to make Geode/GemFire as EASY as possible to use and set up, balancing painstaking accuracy against the impression of reading “a complex and difficult user’s guide.”

What prompts this is a do-it-yourself PoC last week where the intelligent architect was so frustrated by out-of-memory exceptions that he lost 3 days thinking that Geode either had a bug or that he was doing something wrong but did not know what. He sized the memory by his own intuition but failed to account for index overhead. Also, to your point, he was doing heavy put (“insert”) activity that required yet more overhead. I forwarded Mike’s spreadsheet to him and he was profusely thankful.

I wasn’t aware of the page that you cited, and having Mike’s spreadsheet online at last is welcome. Still, I’m trying to create an impression of Geode/GemFire as “easy” (as possible). What conveys an impression of “complex” is linking to a comprehensive, detailed chapter on “Memory Requirements for Cached Data” <http://geode-docs.cfapps.io/docs/reference/topics/memory_requirements_for_cache_data.html> for the user to eventually find out that an index is up to 243 bytes. Can we make all of this easier?

I propose that we present the calculations along with some check boxes or radio buttons that automate the additional calculations for “overflow”, “persistence”, “# indexes”, “expiration”, and “Insert Activity (choose one): Heavy, Balanced Insert and Query, Light”. I am convinced that we can improve the presentation of “EASY” as a face to new users and evaluators.

Thoughts on making it “EASY”? Or do you think such a tool could not avoid being misleading?
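
To make the proposal concrete, the check boxes and radio buttons could reduce to arithmetic along these lines. This is only a rough sketch in Python, not Mike's spreadsheet and not anything from the Geode docs. The only number taken from this thread is the "up to 243 bytes" index figure, and charging it per entry per index is my own worst-case assumption. Every other constant, name, and the usable-heap fraction is a made-up placeholder that would have to be replaced with real values before anyone relied on it.

```python
import math
from dataclasses import dataclass

# From this thread: an index is "up to 243 bytes". Charging it once per
# entry per index is an assumption made here to get a worst-case bound.
INDEX_BYTES_PER_ENTRY_MAX = 243

# Hypothetical placeholders, NOT Geode figures. Replace with real values.
ENTRY_OVERHEAD_BYTES = 100
OPTION_BYTES_PER_ENTRY = {
    "stats": 16,
    "expiration": 16,
    "overflow": 32,
    "persistence": 32,
}
# "Insert Activity (choose one)" mapped to an assumed headroom multiplier.
INSERT_HEADROOM = {"light": 1.1, "balanced": 1.3, "heavy": 1.5}


@dataclass
class SizingInput:
    entries: int                  # number of records
    key_bytes: int                # average serialized key size
    value_bytes: int              # average serialized object size
    indexes: int = 0              # number of indexes on the region
    options: tuple = ()           # checked boxes, e.g. ("stats", "overflow")
    insert_activity: str = "balanced"


def estimate_bytes(s: SizingInput) -> int:
    """Return a rough memory estimate in bytes for one region."""
    per_entry = s.key_bytes + s.value_bytes + ENTRY_OVERHEAD_BYTES
    per_entry += s.indexes * INDEX_BYTES_PER_ENTRY_MAX
    per_entry += sum(OPTION_BYTES_PER_ENTRY[name] for name in s.options)
    return math.ceil(s.entries * per_entry * INSERT_HEADROOM[s.insert_activity])


def servers_needed(total_bytes: int, heap_bytes_per_server: int,
                   usable_fraction: float = 0.5) -> int:
    """Return how many servers hold total_bytes if only usable_fraction
    of each heap is given to data (the fraction is an assumption)."""
    return math.ceil(total_bytes / (heap_bytes_per_server * usable_fraction))


if __name__ == "__main__":
    poc = SizingInput(entries=1_000_000, key_bytes=20, value_bytes=1000,
                      indexes=2, options=("stats",), insert_activity="heavy")
    total = estimate_bytes(poc)
    print(total, servers_needed(total, 4 * 1024**3))
```

Even a toy like this shows why intuition failed in the PoC: with these placeholder numbers, two indexes plus heavy inserts more than double the raw key-plus-value footprint. It says nothing about CPU or throughput, so the result should be presented as an approximation of memory only.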

> On May 26, 2015, at 5:07 PM, William Markito <wm...@pivotal.io> wrote:
> 
> Hi Wes, feel free to create a JIRA for that if you would like but please
> note that sizing should take into consideration the specifics of the
> application which may not be easy captured in such estimating math
> efforts...
> 
> Some more information is already provided in our wiki at
> https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster
> 
> 
> 
> On Tue, May 26, 2015 at 1:17 PM, Wes Williams <ww...@pivotal.io> wrote:
> 
>> I think another useful feature to access from the Geode web site is a
>> system sizing spreadsheet. You plug in object size, # records, key size, #
>> indexes, whether you have stats enabled, etc., etc. and it gives you the
>> recommended # cache servers, cpu's.
>> 
>> Where is the request backlog again?
>> 
>> Thanks,
>> 
>> *Wes Williams | Pivotal Sr. **Data Engineer*
>> 781.606.0325
>> http://pivotal.io/big-data/pivotal-gemfire
>> 
> 
> 
> 
> -- 
> 
> William Markito Oliveira
> 
> -- For questions about Apache Geode, please write to
> *dev@geode.incubator.apache.org
> <de...@geode.incubator.apache.org>*

Re: Geode Cluster Sizing Online Feature

Posted by Edin Zulich <ez...@pivotal.io>.

I just updated it to only mention Geode. Thank you, Tommy!

On May 27, 2015, at 12:38 PM, Tommy Jeppesen <tj...@pivotal.io> wrote:

> The spreadsheet refers to gemfire in various places. Did you change it to
> Geode Edin?
> 
> On Wednesday, 27 May 2015, Edin Zulich <ez...@pivotal.io> wrote:
> 
>> Thank you, Mike.
>> 
>> I forgot to attach a sizing spreadsheet. I’ve now attached your
>> spreadsheet to the article, and referenced it from the appropriate section
>> in the article.
>> 
>> Thanks again,
>> 
>> Edin
>> 
>> On May 27, 2015, at 6:37 AM, Michael Stolz <mstolz@pivotal.io> wrote:
>> 
>>> The linked article (Sizing+a+Geode+Cluster) refers to the sizing
>> spreadsheet several times, but doesn't point to where it can be found.
>>> 
>>> Here is a copy of the one I use. Please find an appropriate place to
>> post it.
>>> 
>>> --
>>> Mike Stolz
>>> Principal Technical Account Manager
>>> Mobile: 631-835-4771
>>> 
>>> On Tue, May 26, 2015 at 5:07 PM, William Markito <wmarkito@pivotal.io> wrote:
>>> Hi Wes, feel free to create a JIRA for that if you would like but please
>>> note that sizing should take into consideration the specifics of the
>>> application which may not be easy captured in such estimating math
>>> efforts...
>>> 
>>> Some more information is already provided in our wiki at
>>> https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster
>>> 
>>> 
>>> 
>>> On Tue, May 26, 2015 at 1:17 PM, Wes Williams <wwilliams@pivotal.io> wrote:
>>> 
>>>> I think another useful feature to access from the Geode web site is a
>>>> system sizing spreadsheet. You plug in object size, # records, key
>> size, #
>>>> indexes, whether you have stats enabled, etc., etc. and it gives you
>> the
>>>> recommended # cache servers, cpu's.
>>>> 
>>>> Where is the request backlog again?
>>>> 
>>>> Thanks,
>>>> 
>>>> *Wes Williams | Pivotal Sr. **Data Engineer*
>>>> 781.606.0325
>>>> http://pivotal.io/big-data/pivotal-gemfire
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> 
>>> William Markito Oliveira
>>> 
>>> -- For questions about Apache Geode, please write to
>>> *dev@geode.incubator.apache.org
>>> <dev@geode.incubator.apache.org>*
>>> 
>>> <System Sizing Worksheet.xlsx>
>> 
>> 
> 
> -- 
> 
> Best regards,
> 
> 
> 
> Tommy Jeppesen
> 
> Data Fabric Technical Support Engineer
> 
> EMEA - Spain
> 
> Email: tjeppesen@pivotal.io
> 
> Mobile: +34 646 878 424

Re: Geode Cluster Sizing Online Feature

Posted by Tommy Jeppesen <tj...@pivotal.io>.

The spreadsheet refers to GemFire in various places. Did you change it to
Geode, Edin?

On Wednesday, 27 May 2015, Edin Zulich <ez...@pivotal.io> wrote:

> Thank you, Mike.
>
> I forgot to attach a sizing spreadsheet. I’ve now attached your
> spreadsheet to the article, and referenced it from the appropriate section
> in the article.
>
> Thanks again,
>
> Edin
>
> On May 27, 2015, at 6:37 AM, Michael Stolz <mstolz@pivotal.io> wrote:
>
> > The linked article (Sizing+a+Geode+Cluster) refers to the sizing
> spreadsheet several times, but doesn't point to where it can be found.
> >
> > Here is a copy of the one I use. Please find an appropriate place to
> post it.
> >
> > --
> > Mike Stolz
> > Principal Technical Account Manager
> > Mobile: 631-835-4771
> >
> > On Tue, May 26, 2015 at 5:07 PM, William Markito <wmarkito@pivotal.io> wrote:
> > Hi Wes, feel free to create a JIRA for that if you would like but please
> > note that sizing should take into consideration the specifics of the
> > application which may not be easy captured in such estimating math
> > efforts...
> >
> > Some more information is already provided in our wiki at
> > https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster
> >
> >
> >
> > On Tue, May 26, 2015 at 1:17 PM, Wes Williams <wwilliams@pivotal.io> wrote:
> >
> > > I think another useful feature to access from the Geode web site is a
> > > system sizing spreadsheet. You plug in object size, # records, key
> size, #
> > > indexes, whether you have stats enabled, etc., etc. and it gives you
> the
> > > recommended # cache servers, cpu's.
> > >
> > > Where is the request backlog again?
> > >
> > > Thanks,
> > >
> > > *Wes Williams | Pivotal Sr. **Data Engineer*
> > > 781.606.0325
> > > http://pivotal.io/big-data/pivotal-gemfire
> > >
> >
> >
> >
> > --
> >
> > William Markito Oliveira
> >
> > -- For questions about Apache Geode, please write to
> > *dev@geode.incubator.apache.org
> > <dev@geode.incubator.apache.org>*
> >
> > <System Sizing Worksheet.xlsx>
>
>

-- 

Best regards,



Tommy Jeppesen

Data Fabric Technical Support Engineer

EMEA - Spain

Email: tjeppesen@pivotal.io

Mobile: +34 646 878 424

Re: Geode Cluster Sizing Online Feature

Posted by Edin Zulich <ez...@pivotal.io>.

Thank you, Mike. 

I forgot to attach a sizing spreadsheet. I’ve now attached your spreadsheet to the article, and referenced it from the appropriate section in the article.

Thanks again,

Edin

On May 27, 2015, at 6:37 AM, Michael Stolz <ms...@pivotal.io> wrote:

> The linked article (Sizing+a+Geode+Cluster) refers to the sizing spreadsheet several times, but doesn't point to where it can be found.
> 
> Here is a copy of the one I use. Please find an appropriate place to post it.
> 
> --
> Mike Stolz
> Principal Technical Account Manager
> Mobile: 631-835-4771
> 
> On Tue, May 26, 2015 at 5:07 PM, William Markito <wm...@pivotal.io> wrote:
> Hi Wes, feel free to create a JIRA for that if you would like but please
> note that sizing should take into consideration the specifics of the
> application which may not be easy captured in such estimating math
> efforts...
> 
> Some more information is already provided in our wiki at
> https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster
> 
> 
> 
> On Tue, May 26, 2015 at 1:17 PM, Wes Williams <ww...@pivotal.io> wrote:
> 
> > I think another useful feature to access from the Geode web site is a
> > system sizing spreadsheet. You plug in object size, # records, key size, #
> > indexes, whether you have stats enabled, etc., etc. and it gives you the
> > recommended # cache servers, cpu's.
> >
> > Where is the request backlog again?
> >
> > Thanks,
> >
> > *Wes Williams | Pivotal Sr. **Data Engineer*
> > 781.606.0325
> > http://pivotal.io/big-data/pivotal-gemfire
> >
> 
> 
> 
> --
> 
> William Markito Oliveira
> 
> -- For questions about Apache Geode, please write to
> *dev@geode.incubator.apache.org
> <de...@geode.incubator.apache.org>*
> 
> <System Sizing Worksheet.xlsx>

Re: Geode Cluster Sizing Online Feature

Posted by Michael Stolz <ms...@pivotal.io>.

The linked article (Sizing+a+Geode+Cluster) refers to the sizing
spreadsheet several times, but doesn't point to where it can be found.

Here is a copy of the one I use. Please find an appropriate place to post
it.

--
Mike Stolz
Principal Technical Account Manager
Mobile: 631-835-4771

On Tue, May 26, 2015 at 5:07 PM, William Markito <wm...@pivotal.io>
wrote:

> Hi Wes, feel free to create a JIRA for that if you would like but please
> note that sizing should take into consideration the specifics of the
> application which may not be easy captured in such estimating math
> efforts...
>
> Some more information is already provided in our wiki at
> https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster
>
>
>
> On Tue, May 26, 2015 at 1:17 PM, Wes Williams <ww...@pivotal.io>
> wrote:
>
> > I think another useful feature to access from the Geode web site is a
> > system sizing spreadsheet. You plug in object size, # records, key size,
> #
> > indexes, whether you have stats enabled, etc., etc. and it gives you the
> > recommended # cache servers, cpu's.
> >
> > Where is the request backlog again?
> >
> > Thanks,
> >
> > *Wes Williams | Pivotal Sr. **Data Engineer*
> > 781.606.0325
> > http://pivotal.io/big-data/pivotal-gemfire
> >
>
>
>
> --
>
> William Markito Oliveira
>
> -- For questions about Apache Geode, please write to
> *dev@geode.incubator.apache.org
> <de...@geode.incubator.apache.org>*
>

Re: Geode Cluster Sizing Online Feature

Posted by William Markito <wm...@pivotal.io>.

Hi Wes, feel free to create a JIRA for that if you would like, but please
note that sizing should take into consideration the specifics of the
application, which may not be easily captured in such estimating math
efforts...

Some more information is already provided in our wiki at
https://cwiki.apache.org/confluence/display/GEODE/Sizing+a+Geode+Cluster



On Tue, May 26, 2015 at 1:17 PM, Wes Williams <ww...@pivotal.io> wrote:

> I think another useful feature to access from the Geode web site is a
> system sizing spreadsheet. You plug in object size, # records, key size, #
> indexes, whether you have stats enabled, etc., etc. and it gives you the
> recommended # cache servers, cpu's.
>
> Where is the request backlog again?
>
> Thanks,
>
> *Wes Williams | Pivotal Sr. **Data Engineer*
> 781.606.0325
> http://pivotal.io/big-data/pivotal-gemfire
>



-- 

William Markito Oliveira

-- For questions about Apache Geode, please write to
*dev@geode.incubator.apache.org
<de...@geode.incubator.apache.org>*

Re: Geode Cluster Sizing Online Feature

Posted by Udo Kohlmeyer <uk...@pivotal.io>.

Wes, as you are well aware, this is only an indicative sizing of memory.
Given TPS requirements, this configuration could change significantly.

I like the spreadsheet for that purpose, estimating memory requirements, but
it should only ever be used for that.

--Udo
On 27 May 2015 6:18 am, "Wes Williams" <ww...@pivotal.io> wrote:

> I think another useful feature to access from the Geode web site is a
> system sizing spreadsheet. You plug in object size, # records, key size, #
> indexes, whether you have stats enabled, etc., etc. and it gives you the
> recommended # cache servers, cpu's.
>
> Where is the request backlog again?
>
> Thanks,
>
> *Wes Williams | Pivotal Sr. **Data Engineer*
> 781.606.0325
> http://pivotal.io/big-data/pivotal-gemfire
>