Posted to user@ignite.apache.org by Denis Mekhanikov <dm...@gmail.com> on 2018/11/01 10:24:23 UTC

Re: Long activation times with Ignite persistence enabled

Naveen,

How many caches do you have?
As Alexey mentioned, usage of cache groups
<https://apacheignite.readme.io/docs/cache-groups> could reduce the number
of created partitions and improve the startup time.
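
A minimal sketch of putting caches into one cache group, assuming ignite-core is on the classpath. The cache names and the group name are hypothetical. As far as I know, caches in the same group must agree on group-level settings such as the affinity function and the number of backups.

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

class CacheGroupConfigSketch {
    /** Builds a configuration in which two caches share one cache group. */
    static IgniteConfiguration build() {
        // "customers", "accounts" and "lookupGroup" are hypothetical names.
        CacheConfiguration<Long, String> customers = new CacheConfiguration<>("customers");
        customers.setGroupName("lookupGroup");

        CacheConfiguration<Long, String> accounts = new CacheConfiguration<>("accounts");
        accounts.setGroupName("lookupGroup");

        // Both caches now share the partitions (and partition files) of the group.
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCacheConfiguration(customers, accounts);
        return cfg;
    }
}
```

The returned configuration would be passed to Ignition.start(cfg) on each node.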

Denis

On Sat, Oct 27, 2018 at 11:12, Naveen <na...@gmail.com> wrote:

> Do we have any update on the long activation times?
>
> I face the same issue too; I am using 2.6.
>
> A cluster with 100 GB on disk was activated in 5 minutes, but a cluster
> with 3 TB takes close to an hour.
>
> Is this the expected behavior, or am I missing some configuration?
>
> Thanks
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: Long activation times with Ignite persistence enabled

Posted by Naveen <na...@gmail.com>.

Do we have any update on this?

Thanks




Re: Long activation times with Ignite persistence enabled

Posted by Naveen <na...@gmail.com>.

Hi Pavel 

We are using Ignite 2.6.
Are you saying that cache groups are definitely needed to improve the
cluster activation time?

I found the documentation below on the usage of cache groups.
  
Should the grouping be used all the times? 
With all the benefits the cache groups have, they might impact the
performance of read operations and indexes lookups. This is caused by the
fact that all the data and indexes get mixed in shared data structures
(partition maps, B+trees) and it will take more time querying over them. 

Thus, consider using the cache groups if you have a cluster of dozens and
hundreds of nodes and caches, and you spot increased Java heap usage by
internal structures, checkpointing performance drop, slow node connectivity
to the cluster. 


In our case, we have around 50 caches and at most 10 nodes. Do you still
recommend cache groups for our use case?

Also, our upsert rate is very low, maybe 1K TPS, but our query (read) rate
is quite high, close to 10K TPS. According to the lines above, read
performance is impacted because all the caches are going to use the shared
structures.
We are looking for a design that improves the cluster activation time, but
not at the expense of query performance: our solution is read-intensive, so
we can't afford to reduce query performance.
In the worst case, we can live with slow cluster activation, since it only
affects us at cluster restart, which happens only after a cluster crash or
during planned maintenance.

One more thing: if we need to change the system pool, is the call below the
way to do it?
IgniteConfiguration.setSystemThreadPoolSize(...)
We have 128-CPU machines; what would be the ideal system thread pool size?
Of course it should be tried and tested, but a starting number would help.

Regarding the cache group design, is there anything I should consider when
grouping the caches?

1. We have around 40 caches with no indexes; we only do lookups on the
primary key. Some have simple keys and some have complex primary keys. Some
of the caches are queried together; does it help if we put them into one
cache group?
2. What if we query caches that belong to different cache groups?
3. We are going to have close to half a billion records in each cache, so
how should we group them?
4. Some of the caches are independent and have no relation to other caches.

So if I go with cache groups, should I change the number of partitions to
128 or keep the default?


Thanks 
Naveen




Re: Long activation times with Ignite persistence enabled

Posted by Pavel Kovalenko <jo...@gmail.com>.

Hi Naveen and Andrey,

We've recently made a major optimization
(https://issues.apache.org/jira/browse/IGNITE-9420) that will speed up
activation time in your case.
Iteration over the WAL now happens only on node start-up, so it will not
affect activation anymore.
Partition state restoring (the slowest part of the activation phase, as I
see in the first message in the thread) was also optimized.
It is now performed in parallel for each of the available cache groups.
The parallelism level of that operation is controlled by the system pool
size.
If you have enough CPU cores on your machines (more than the number of
configured cache groups), you can adjust the system pool size and your
activation time will be significantly improved.
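
The system pool size is set on IgniteConfiguration. A minimal sketch, assuming ignite-core is on the classpath; the value 64 is a hypothetical placeholder rather than a recommendation, and the right number should be found by measuring activation time.

```java
import org.apache.ignite.configuration.IgniteConfiguration;

class SystemPoolConfigSketch {
    /** Builds a configuration with an explicitly sized system thread pool. */
    static IgniteConfiguration build() {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // Hypothetical value: the thread says extra threads help only when
        // there are enough CPU cores and enough cache groups to work on.
        cfg.setSystemThreadPoolSize(64);
        return cfg;
    }
}
```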

On Tue, Nov 6, 2018 at 17:23, Naveen <na...@gmail.com> wrote:

> Hi Denis
>
> We have already reduced the number of partitions to 128, after which the
> activation time has come down a bit.
>
> You said that reducing the number of partitions may lead to uneven
> distribution of data between nodes. Isn't it the same with cache groups?
> The caches in a group use the same resources and partitions, so couldn't
> there be resource contention there too?
> If we use a cache group, partitions may grow very large, since all the
> caches in that group share the same set of partitions. Does that have any
> negative effect on cluster performance?
>
>
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: Long activation times with Ignite persistence enabled

Posted by Naveen <na...@gmail.com>.

Hi Denis

We have already reduced the number of partitions to 128, after which the
activation time has come down a bit.

You said that reducing the number of partitions may lead to uneven
distribution of data between nodes. Isn't it the same with cache groups?
The caches in a group use the same resources and partitions, so couldn't
there be resource contention there too?
If we use a cache group, partitions may grow very large, since all the
caches in that group share the same set of partitions. Does that have any
negative effect on cluster performance?



Thanks
Naveen




Re: Long activation times with Ignite persistence enabled

Posted by Denis Mekhanikov <dm...@gmail.com>.

Naveen,

40 caches is quite a lot. It means that Ignite needs to handle 40 *
(number of partitions) files.
By default each cache has 1024 partitions.
That is a lot of files, and the disk is the bottleneck here. Changing
thread pool sizes won't save you.
If you divide your caches into cache groups, the caches in a group will
share the same partitions, so the number of files will be reduced.
You can also try reducing the number of partitions, but that may lead to
uneven distribution of data between nodes.
Either of these changes will require reloading the data.
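
To make the arithmetic concrete, here is a rough sketch that counts partition files only. It assumes one file per partition per cache (or per cache group) and ignores backups and index files; the figure of 4 cache groups is hypothetical.

```java
class PartitionFileEstimate {
    /**
     * Estimates the number of partition files.
     *
     * @param units number of caches, or of cache groups when caches are grouped
     * @param partitionsPerUnit partitions configured for each cache or group
     */
    static long partitionFiles(int units, int partitionsPerUnit) {
        return (long) units * partitionsPerUnit;
    }

    public static void main(String[] args) {
        // 40 ungrouped caches with the default 1024 partitions each.
        System.out.println(partitionFiles(40, 1024)); // prints 40960
        // Same 40 caches after reducing to 128 partitions each.
        System.out.println(partitionFiles(40, 128));  // prints 5120
        // Hypothetical: the 40 caches merged into 4 groups of 1024 partitions.
        System.out.println(partitionFiles(4, 1024));  // prints 4096
    }
}
```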

You can record *dstat* output on the host machine to confirm that the disk
is the weak spot.
If disk utilization is high while the CPU is mostly idle, then you need a
faster disk.

Denis


On Mon, Nov 5, 2018 at 17:10, Naveen <na...@gmail.com> wrote:

> Hi Denis
>
> We have only 40 caches in our cluster.
> If we introduce cache groups, I guess we need to reload the data from
> scratch, right?
>
> We have very powerful machines in the cluster: high-end boxes with 128 CPUs
> and plenty of resources. Can we reduce the cluster activation time by
> increasing any of the thread pools below?
>
> System Pool
> Public Pool
> Striped Pool
> Custom Thread Pools
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: Long activation times with Ignite persistence enabled

Posted by Naveen <na...@gmail.com>.

Hi Gianluca

Removing -XX:+AlwaysPreTouch did not help us at all; activation took the
same time with or without it.
What I did observe is that without it the OS does not allocate the entire
heap we set in the JVM options. We give the Ignite node a 200G heap, and
with the option the top command showed the node using about 250 GB.
After removing AlwaysPreTouch, the Ignite node uses only around 50G.

Other than that, activation time remains the same with or without
AlwaysPreTouch.

Thanks
Naveen




Re: Long activation times with Ignite persistence enabled

Posted by Gianluca Bonetti <gi...@gmail.com>.

Hello

In my case of slow startup, as suggested by a member of this mailing
list, I removed the -XX:+AlwaysPreTouch command line option from the JVM
launch, and the cluster got back to very fast startup.
I don't know if you are using this option; hope it helps.
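
One way to check whether a running JVM was started with this option is to inspect its input arguments. A small sketch using only the standard library; the class and method names are hypothetical, and it checks the JVM it runs in, so it would have to be run inside the Ignite node's process (or adapted) to be useful.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmFlagCheck {
    /** Returns true if the given JVM argument list contains the exact flag. */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the main method).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("AlwaysPreTouch enabled: "
            + hasFlag(jvmArgs, "-XX:+AlwaysPreTouch"));
    }
}
```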

Cheers
Gianluca

On Mon, Nov 5, 2018 at 14:10, Naveen <na...@gmail.com> wrote:

> Hi Denis
>
> We have only 40 caches in our cluster.
> If we introduce cache groups, I guess we need to reload the data from
> scratch, right?
>
> We have very powerful machines in the cluster: high-end boxes with 128 CPUs
> and plenty of resources. Can we reduce the cluster activation time by
> increasing any of the thread pools below?
>
> System Pool
> Public Pool
> Striped Pool
> Custom Thread Pools
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: Long activation times with Ignite persistence enabled

Posted by Naveen <na...@gmail.com>.

Hi Denis

We have only 40 caches in our cluster.
If we introduce cache groups, I guess we need to reload the data from
scratch, right?

We have very powerful machines in the cluster: high-end boxes with 128 CPUs
and plenty of resources. Can we reduce the cluster activation time by
increasing any of the thread pools below?

System Pool
Public Pool
Striped Pool
Custom Thread Pools

Thanks
Naveen


