Posted to dev@ignite.apache.org by Semyon Boikov <sb...@gridgain.com> on 2017/10/03 07:10:00 UTC

Re: Logical Cache Documented

Hi,

Regarding the question about the default cache group: cache groups are not
enabled by default, and each cache is started in a separate group. A cache
group is enabled only if groupName is set in CacheConfiguration.
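
To illustrate the rule, here is a minimal sketch in plain Java. It is a toy
model, not Ignite code: the class CacheGroupRule and its methods are made up
for this example, and naming an implicit group after its cache is an
assumption of the sketch. In Ignite itself the group name comes from the
groupName property of CacheConfiguration, as noted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical names, not part of Ignite).
class CacheGroupRule {
    // A cache without a groupName ends up alone in its own group.
    // Naming that implicit group after the cache is an assumption of this sketch.
    static String effectiveGroup(String cacheName, String groupName) {
        return groupName != null ? groupName : cacheName;
    }

    // Input: cacheName -> groupName (null when groupName is not set).
    // Output: effective group -> caches that share it.
    static Map<String, List<String>> groupCaches(Map<String, String> caches) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : caches.entrySet()) {
            String group = effectiveGroup(e.getKey(), e.getValue());
            groups.computeIfAbsent(group, k -> new ArrayList<>()).add(e.getKey());
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, String> caches = new LinkedHashMap<>();
        caches.put("Person", "group1");        // groupName set
        caches.put("Organization", "group1");  // same groupName, same group
        caches.put("Dictionary", null);        // groupName not set, separate group
        System.out.println(groupCaches(caches));
    }
}
```

Only Person and Organization share a group here; Dictionary stays on its own
because no group name was given for it.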

Thanks

On Sat, Sep 30, 2017 at 11:55 PM, <ds...@apache.org> wrote:

> Why not? Obviously compression would have to be enabled per group, not per
> cache.
>
> D.
>
> On Sep 29, 2017, at 10:50 PM, Vladimir Ozerov <
> vozerov@gridgain.com> wrote:
> >And it will continue hitting us in the future. For example, when data
> >compression is implemented, the compression rate for logical caches will
> >be poor, as it would be impossible to build efficient dictionaries in
> >mixed data pages.
> >
> >On Sat, Sep 30, 2017 at 8:48 AM, Vladimir Ozerov <vo...@gridgain.com>
> >wrote:
> >
> >> Folks,
> >>
> >> Honestly, to me logical caches appear to be a dirty shortcut to mitigate
> >> some inefficient internal implementation. Why can't we merge partition
> >> maps at runtime? This should not be a problem for context-independent
> >> affinity functions (e.g. RendezvousAffinityFunction). From the user's
> >> perspective, the logical caches feature is:
> >> 1) Bad API. One cannot define a group configuration. All you can do is
> >> define the group name at the cache level and hope that nobody has already
> >> started another cache in the same group with a different configuration.
> >> 2) Performance impact for scans, as you have to iterate over mixed data.
> >>
> >> Couldn't we fix the partition map problem without cache groups?
> >>
> >> On Sat, Sep 30, 2017 at 2:35 AM, Denis Magda <dm...@apache.org>
> >wrote:
> >>
> >>> Guys,
> >>>
> >>> Another question. Is this capability enabled by default? If yes, how do
> >>> we decide which group a cache goes to?
> >>>
> >>> —
> >>> Denis
> >>>
> >>> > On Sep 29, 2017, at 3:58 PM, Denis Magda <dm...@apache.org>
> >wrote:
> >>> >
> >>> > Igniters,
> >>> >
> >>> > I’ve documented the feature from the subject line:
> >>> > https://apacheignite.readme.io/docs/logical-caches
> >>> >
> >>> > Sam, I would appreciate it if you could read through it and confirm
> >>> > that I explained the topic 100% technically correctly.
> >>> >
> >>> > However, are there any negative impacts of having logical caches? This
> >>> > page has an unfilled “Possible Impacts” section:
> >>> > https://cwiki.apache.org/confluence/display/IGNITE/Logical+Caches
> >>> >
> >>> > —
> >>> > Denis
> >>>
> >>>
> >>
>

Re: Logical Cache Documented

Posted by Dmitry Pavlov <dp...@gmail.com>.
+1 to Vladimir. I think it is better to keep this feature disabled by
default and to enable cache groups only when the user actually needs this
feature.

Enabling this will cause all caches to be placed into one tree.

After collapsing all B+ trees into one, we will get log(N1+N2+N3) instead of
log(N1) complexity.
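
To put rough numbers on log(N1) versus log(N1+N2+N3), here is a small
back-of-the-envelope sketch in plain Java. The fan-out of 100 and the cache
sizes are assumptions picked for illustration, not measured Ignite values,
and TreeHeightEstimate is a made-up helper.

```java
// Rough estimate only: real B+ tree pages are rarely completely full.
class TreeHeightEstimate {
    // Smallest number of levels such that fanout^levels >= n.
    static int levels(long n, int fanout) {
        int levels = 1;
        long capacity = fanout;
        while (capacity < n) {
            capacity *= fanout;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        int fanout = 100;       // assumed number of keys per tree page
        long n1 = 20;           // small dictionary cache
        long n2 = 1_000_000;    // medium cache
        long n3 = 50_000_000;   // large cache
        System.out.println("separate tree for N1: " + levels(n1, fanout) + " level(s)");
        System.out.println("shared tree for N1+N2+N3: " + levels(n1 + n2 + n3, fanout) + " level(s)");
    }
}
```

Under these assumptions the small cache goes from a single-level tree to a
four-level one, while the large cache would need four levels on its own
anyway, so the extra cost falls mostly on the small caches in the group.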

On Tue, Oct 3, 2017 at 22:48, Denis Magda <dm...@apache.org> wrote:

> Vladimir,
>
> Please share more details that I can put on paper. Presently the
> feature is described as a must-have, and I struggled to find any
> information about its negative impact.
>
> —
> Denis
>
> > On Oct 3, 2017, at 12:46 PM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
> >
> > Denis,
> >
> > This feature should not be enabled by default as it negatively affects
> > read performance.
> >
> > On Tue, Oct 3, 2017 at 10:31 PM, Denis Magda <dm...@apache.org> wrote:
> >
> >> Sam,
> >>
> >> Is there any technical limitation that prevents us from assigning caches
> >> with similar parameters to relevant groups on-the-fly?
> >>
> >> After finishing the doc, I’m convinced the feature should be enabled by
> >> default unless there are some pitfalls I am not aware of.
> >>
> >> BTW, I decided to avoid the term "logical caches" and fall back to the
> >> more vivid notion of cache groups:
> >> https://apacheignite.readme.io/docs/cache-groups
> >>
> >> —
> >> Denis
> >>
>
>

Re: Logical Cache Documented

Posted by Dmitry Pavlov <dp...@gmail.com>.
Hi, a bigger B+ tree means more operations to find a value.

A user may expect that a cache with 20 entries (e.g. a dictionary) will have
great performance on get() and put().

But instead (if one global cache group became the default), operations on
such caches would take the same amount of time as on a huge cache with
millions of records.

On Wed, Oct 4, 2017 at 8:39, Vladimir Ozerov <vo...@gridgain.com> wrote:

> I do not think that a bigger B+ tree matters much. I was talking only about
> data blocks. When you have a lot of logical caches, all of them are mixed
> in the same data blocks. As a result, you typically have to perform more IO
> operations to read the same amount of data, as data block content becomes
> more "chaotic".
>
> Currently all scans go through the primary index.
>

Re: Logical Cache Documented

Posted by Vladimir Ozerov <vo...@gridgain.com>.
I do not think that a bigger B+ tree matters much. I was talking only about
data blocks. When you have a lot of logical caches, all of them are mixed
in the same data blocks. As a result, you typically have to perform more IO
operations to read the same amount of data, as data block content becomes
more "chaotic".

Currently all scans go through the primary index.
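
A toy model in plain Java can make the data block argument concrete. It
assumes fixed-capacity pages filled in arrival order and a strict round-robin
arrival of entries from the grouped caches. Both are simplifications chosen
for illustration, and MixedPagesModel is a made-up class, not Ignite's actual
page layout.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only (hypothetical names, simplified page layout).
class MixedPagesModel {
    // Arrival order of entries; each element is the id of the cache owning the entry.
    static List<Integer> roundRobin(int caches, int entriesPerCache) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < entriesPerCache; i++)
            for (int c = 0; c < caches; c++)
                order.add(c);
        return order;
    }

    // Pages are filled in arrival order; counts the distinct pages that hold
    // at least one entry of targetCache.
    static int pagesTouched(List<Integer> arrivalOrder, int pageCapacity, int targetCache) {
        Set<Integer> pages = new HashSet<>();
        for (int i = 0; i < arrivalOrder.size(); i++)
            if (arrivalOrder.get(i) == targetCache)
                pages.add(i / pageCapacity);
        return pages.size();
    }

    public static void main(String[] args) {
        int pageCapacity = 10; // assumed entries per data page
        int dedicated = pagesTouched(roundRobin(1, 100), pageCapacity, 0);
        int grouped = pagesTouched(roundRobin(4, 100), pageCapacity, 0);
        System.out.println("dedicated pages: " + dedicated + ", grouped pages: " + grouped);
    }
}
```

In this model, reading the same 100 entries touches 10 pages when the cache
has its own pages and 40 pages when it shares pages with three other caches.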


Re: Logical Cache Documented

Posted by Alexey Goncharuk <al...@gmail.com>.
Denis,

There is an overhead for each separate fsync syscall, which may become
significant with a large number of files, so having fewer files will
perform better in general. As for the heap structure overhead, it is
true that on large topologies having several cache groups will
significantly improve heap usage.
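
As a rough illustration of the file-count effect, the sketch below does the
arithmetic in plain Java. Everything in it is an assumption made for the
example: one file per partition per group, one fsync per file per checkpoint,
a fixed per-call overhead, and the made-up helper CheckpointFsyncEstimate. It
is not a measurement of Ignite.

```java
// Arithmetic sketch only; all figures below are illustrative assumptions.
class CheckpointFsyncEstimate {
    // Assumes one partition file per partition per group
    // (an ungrouped cache counts as a group of one).
    static long partitionFiles(int groups, int partitionsPerGroup) {
        return (long) groups * partitionsPerGroup;
    }

    // Assumes every file is fsynced once per checkpoint with a fixed overhead per call.
    static double fsyncOverheadMillis(long files, double perFsyncMillis) {
        return files * perFsyncMillis;
    }

    public static void main(String[] args) {
        int partitions = 1024;        // assumed partition count
        double perFsyncMillis = 0.5;  // assumed fixed overhead per fsync call
        long separate = partitionFiles(100, partitions); // 100 caches, no grouping
        long grouped = partitionFiles(2, partitions);    // same caches in 2 groups
        System.out.println(separate + " files, ~" + fsyncOverheadMillis(separate, perFsyncMillis) + " ms of fsync overhead");
        System.out.println(grouped + " files, ~" + fsyncOverheadMillis(grouped, perFsyncMillis) + " ms of fsync overhead");
    }
}
```

With these assumed figures, grouping 100 caches into 2 groups cuts the number
of files (and fsync calls) per checkpoint from 102,400 to 2,048.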


Re: Logical Cache Documented

Posted by Denis Magda <dm...@apache.org>.
Vladimir, 

Thanks for the explanation. Please see my comments inline.

> On Oct 3, 2017, at 12:57 PM, Vladimir Ozerov <vo...@gridgain.com> wrote:
> 
> Denis,
> 
> This is not a "must have", nor can I call it a "feature". We have
> internal partition state metadata. When there are a lot of caches, there is
> a lot of metadata. It consumes local Java heap, causes high network traffic
> on rebalance, and requires Ignite to create a lot of files when persistence
> is enabled, which slows down checkpoints. All these problems could be
> resolved by a better storage architecture and by "joining" the partition
> maps of caches with the same affinity functions at runtime.
> 
> But this is difficult, so we created "cache groups" as a kind of shortcut.
> It saves heap, saves network traffic, and reduces the number of files. But
> it comes at a cost: a single data page now contains data from different
> caches. This causes a higher than usual miss rate (and, as a result, more
> OS calls) for random cache operations and index lookups.

Do you mean a longer traversal of the B+ tree when you say "higher miss rate"? Has anybody measured the impact? Personally, I do not find log(n1) that different from log(n1 + n2 + n3) unless n is very large.


> In the future it will also cause poor compression rates when compression
> is implemented, and poor scan performance when efficient scans are
> implemented.
> 

How do we scan grouped caches presently? By simply filtering out the entries that do not belong to the cache of interest?

> To summarize, we *SHOULD NOT* advise users to use this feature unless they
> have problems with high heap usage due to partition maps, or poor
> checkpointing performance due to excessive fsyncs.
> 

Ivan R., Alex G., could you comment on the checkpointing performance? I don’t get why the number of open files affects it. What should matter is the frequency of fsync, shouldn’t it? If we have fewer files, then the frequency will soar, since every cache writes into a single destination.

Vladimir, what about the long joining process and the rebalancing kick-off on node failure? I heard that the number of partition maps influences this, and I put that on paper.

—
Denis

> On Tue, Oct 3, 2017 at 10:48 PM, Denis Magda <dm...@apache.org> wrote:
> 
>> Vladimir,
>> 
>> Please share more details that I can put in the doc. Presently the
>> feature is described as a must-have, and I struggled to find any
>> information on its negative impact.
>> 
>> —
>> Denis
>> 
>>> On Oct 3, 2017, at 12:46 PM, Vladimir Ozerov <vo...@gridgain.com>
>> wrote:
>>> 
>>> Denis,
>>> 
>>> This feature should not be enabled by default as it negatively affects
>>> read performance.
>>> 
>>> On Tue, Oct 3, 2017 at 10:31 PM, Denis Magda <dm...@apache.org> wrote:
>>> 
>>>> Sam,
>>>> 
>>>> Is there any technical limitation that prevents us from assigning caches
>>>> with similar parameters to relevant groups on-the-fly?
>>>> 
>>>> After finishing the doc, I’m convinced the feature should be enabled by
>>>> default unless there are pitfalls I am not aware of.
>>>> 
>>>> BTW, I decided to avoid the term "logical caches" and fall back to the
>>>> more vivid notion of cache groups:
>>>> https://apacheignite.readme.io/docs/cache-groups <
>>>> https://apacheignite.readme.io/docs/cache-groups>
>>>> 
>>>> —
>>>> Denis
>>>> 
>>>>> On Oct 3, 2017, at 12:10 AM, Semyon Boikov <sb...@gridgain.com>
>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Regarding the question about the default cache group: by default, cache
>>>>> groups are not enabled, and each cache is started in a separate group. A
>>>>> cache group is enabled only if groupName is set in CacheConfiguration.
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> On Sat, Sep 30, 2017 at 11:55 PM, <ds...@apache.org> wrote:
>>>>> 
>>>>>> Why not? Obviously compression would have to be enabled per group, not
>>>>>> per cache.
>>>>>> 
>>>>>> ⁣D.​
>>>>>> 
>>>>>> On Sep 29, 2017, 10:50 PM, at 10:50 PM, Vladimir Ozerov <
>>>>>> vozerov@gridgain.com> wrote:
>>>>>>> And it will continue hitting us in the future. For example, when data
>>>>>>> compression is implemented, the compression rate for logical caches
>>>>>>> will be poor, as it would be impossible to build efficient
>>>>>>> dictionaries in mixed data pages.
>>>>>>> 
>>>>>>> On Sat, Sep 30, 2017 at 8:48 AM, Vladimir Ozerov <vozerov@gridgain.com> wrote:
>>>>>>> 
>>>>>>>> Folks,
>>>>>>>> 
>>>>>>>> Honestly, to me logical caches appear to be a dirty shortcut to mitigate
>>>>>>>> some inefficient internal implementation. Why can't we merge partition
>>>>>>>> maps at runtime? This should not be a problem for context-independent
>>>>>>>> affinity functions (e.g. RendezvousAffinityFunction). From the user
>>>>>>>> perspective, the logical caches feature is:
>>>>>>>> 1) Bad API. One cannot define group configuration. All you can do is
>>>>>>>> define the group name at the cache level and hope that nobody has already
>>>>>>>> started another cache in the same group with a different configuration.
>>>>>>>> 2) Performance impact for scans, as you have to iterate over mixed data.
>>>>>>>> 
>>>>>>>> Couldn't we fix the partition map problem without cache groups?
>>>>>>>> 
>>>>>>>> On Sat, Sep 30, 2017 at 2:35 AM, Denis Magda <dm...@apache.org>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Guys,
>>>>>>>>> 
>>>>>>>>> Another question. Is this capability enabled by default? If yes, how do
>>>>>>>>> we decide which group a cache goes to?
>>>>>>>>> 
>>>>>>>>> —
>>>>>>>>> Denis
>>>>>>>>> 
>>>>>>>>>> On Sep 29, 2017, at 3:58 PM, Denis Magda <dm...@apache.org>
>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Igniters,
>>>>>>>>>> 
>>>>>>>>>> I’ve documented the feature from the subject line:
>>>>>>>>>> https://apacheignite.readme.io/docs/logical-caches <
>>>>>>>>> https://apacheignite.readme.io/docs/logical-caches>
>>>>>>>>>> 
>>>>>>>>>> Sam, I would appreciate it if you read through it and confirmed that I
>>>>>>>>>> explained the topic 100% technically correctly.
>>>>>>>>>> 
>>>>>>>>>> However, are there any negative impacts of having logical caches? This
>>>>>>>>>> page has an unfilled “Possible Impacts” section:
>>>>>>>>>> https://cwiki.apache.org/confluence/display/IGNITE/Logical+Caches
>>>>>>> <
>>>>>>>>> https://cwiki.apache.org/confluence/display/IGNITE/Logical+Caches>
>>>>>>>>>> 
>>>>>>>>>> —
>>>>>>>>>> Denis
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: Logical Cache Documented

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Denis,

This is not a "must have", nor would I call it a "feature". We have
internal partition state metadata. When there are a lot of caches, there is
a lot of metadata. It consumes local Java heap, causes high network traffic
on rebalance, and requires Ignite to create a lot of files when persistence
is enabled, which slows down checkpoints. All these problems could be
resolved by a better storage architecture and by "joining" the partition
maps of caches with the same affinity function at runtime.

But this is difficult, so we created "cache groups" as a kind of shortcut.
It saves heap, saves network traffic, and reduces the number of files. But it
comes at a cost: a single data page now contains data from different caches.
This causes a higher than usual miss rate (and, as a result, more OS calls)
for random cache operations and index lookups. In the future it will also
cause poor compression rates when compression is implemented, and it will
cause poor scan performance when efficient scans are implemented.

To summarize, we *SHOULD NOT* advise users to use this feature unless they
have problems with high heap usage due to partition maps, or poor
checkpointing performance due to excessive fsyncs.


Re: Logical Cache Documented

Posted by Denis Magda <dm...@apache.org>.
Vladimir,

Please share more details that I can put in the doc. Presently the feature is described as a must-have, and I struggled to find any information on its negative impact.

—
Denis



Re: Logical Cache Documented

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Denis,

This feature should not be enabled by default as it negatively affects read
performance.


Re: Logical Cache Documented

Posted by Denis Magda <dm...@apache.org>.
Sam,

Is there any technical limitation that prevents us from assigning caches with similar parameters to relevant groups on-the-fly?

After finishing the doc, I’m convinced the feature should be enabled by default unless there are pitfalls I am not aware of.

BTW, I decided to avoid the term "logical caches" and fall back to the more vivid notion of cache groups:
https://apacheignite.readme.io/docs/cache-groups <https://apacheignite.readme.io/docs/cache-groups>

—
Denis
