
Posted to user@ignite.apache.org by Sebastien Blind <se...@gmail.com> on 2021/03/14 20:14:28 UTC

Configuring persistence for large dataset on small cluster

Hello Apache Ignite friends,
I am trying to load 1bn records into an Ignite cache with persistence enabled
for a PoC. Records consist only of an id, a uuid and a state, and I'd like
to look up by id (this is the key in the cache) and by token (SQL queries
for that are fine; another cache keyed by uuid is another option).

The experiment is running locally on a macOS laptop (16 GB memory) and I'd
like to use as little memory as possible, and use disk storage as much as
possible. Everything works OK until ~150M records with the query index
enabled, and ~600M without an index or side cache, so I'm curious what
settings (if any) could help with the setup, and how much data a single
node can be expected to handle. Restarting my Ignite node and the loading
process seems to help, and more data can then be stuffed into the cache.

I've tried playing with settings like size, number of partitions etc., and
looked for information on how to control what's on-heap/off-heap and on any
overhead/sizing to take into consideration, without much success. Maybe the
experiment is doomed to fail just based on the specs, but it would be good
to understand the constraints, so any help would be much appreciated!

Thanks in advance!
Sebastien

PS:
JVM is running w/ -Xms4g -Xmx4g -server -XX:MaxMetaspaceSize=256m settings.

A snippet of the config:

<property name="persistenceEnabled" value="true" />
<property name="initialSize" value="#{4L * 1024 * 1024 * 1024}" />
<property name="maxSize" value="#{4L * 1024 * 1024 * 1024}" />
<property name="pageEvictionMode" value="RANDOM_2_LRU" />

Cache configuration

<property name="cacheConfiguration">
    <bean class="org.apache.ignite.configuration.CacheConfiguration">
        <!-- Set the cache name. -->
        <property name="name" value="qa_sqr_txn" />
        <!-- Set the cache mode. -->
        <property name="cacheMode" value="PARTITIONED" />
        <property name="backups" value="0" />
        <property name="storeKeepBinary" value="true" />

        <property name="affinity">
            <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
                <property name="partitions" value="8192" />
            </bean>
        </property>

        <!-- Configure query entities -->
        <property name="queryEntities">
            <list>
                <bean class="org.apache.ignite.cache.QueryEntity">
                    <!-- Setting the type of the key -->
                    <property name="keyType" value="java.lang.String" />
                    <property name="keyFieldName" value="id" />

                    <!-- Setting type of the value -->
                    <property name="valueType" value="com.xxx.Txn" />

                    <property name="fields">
                        <map>
                            <entry key="id" value="java.lang.String" />
                            <entry key="token" value="java.lang.String" />
                            <entry key="state" value="java.lang.String " />
                        </map>
                    </property>
                    <!--
                    <property name="indexes">
                        <list>
                            <bean class="org.apache.ignite.cache.QueryIndex">
                                <constructor-arg value="token" />
                            </bean>
                        </list>
                    </property>
                    -->
                </bean>
            </list>
        </property>
    </bean>
</property>
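
For readers reconstructing the setup: the four region properties in the post are presumably set on a DataRegionConfiguration bean nested inside DataStorageConfiguration. The sketch below shows that nesting, assuming the default data region is the one being configured (the post does not say which region it is). The pageEvictionMode property is left out on purpose: as far as the Ignite documentation describes it, page eviction applies to in-memory regions only, and with persistenceEnabled=true Ignite pages data out to disk through its own page replacement instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <!-- Assumption: the default region is used; the post does not name a region. -->
                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <!-- Native persistence: data that does not fit in the region stays on disk. -->
                        <property name="persistenceEnabled" value="true" />
                        <!-- Off-heap region size in bytes (4 GB, as in the post). -->
                        <property name="initialSize" value="#{4L * 1024 * 1024 * 1024}" />
                        <property name="maxSize" value="#{4L * 1024 * 1024 * 1024}" />
                    </bean>
                </property>
            </bean>
        </property>
        <!-- The cacheConfiguration property from the post would follow here. -->
    </bean>
</beans>
```

Note that the 4 GB region is off-heap and comes on top of the 4 GB Java heap set with -Xmx4g, so this setup already claims about half of the laptop's 16 GB.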

Re: Configuring persistence for large dataset on small cluster

Posted by Sebastien Blind <se...@gmail.com>.

It definitely slows down (from 160k/sec to 8k/sec with 10 concurrent clients),
and there are plenty of messages like:

[07:08:16] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_CRITICAL_OPERATION_TIMEOUT, err=class o.a.i.IgniteException: Checkpoint read lock acquisition has been timed out.]]
[07:08:16,915][SEVERE][client-connector-#85][GridCacheDatabaseSharedManager] Checkpoint read lock acquisition has been timed out.
class org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointReadLockTimeoutException: Checkpoint read lock acquisition has been timed out.
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.failCheckpointReadLock(GridCacheDatabaseSharedManager.java:1728)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1654)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1819)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1734)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:300)
....

[09:04:25,831][SEVERE][tcp-disco-msg-worker-[crd]-#2-#56][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [workerName=db-checkpoint-thread, threadName=db-checkpoint-thread-#79, blockedFor=13s]
[09:04:25] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=db-checkpoint-thread, igniteInstanceName=null, finished=false, heartbeatTs=1615824252073]]]
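
Both messages point at checkpointing: writers time out waiting for the checkpoint read lock, and the checkpoint thread itself is reported as blocked. The thread does not record a fix, so the fragment below is only a sketch of where the checkpoint-related settings live. Every value in it is an illustrative assumption, not something tested against this workload, and a laptop disk may simply be too slow for the incoming write rate whatever the settings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <!-- Slow writers down gradually while a checkpoint is running,
                     instead of letting them stall when dirty pages pile up. -->
                <property name="writeThrottlingEnabled" value="true" />
                <!-- Illustrative value (ms): how long a thread waits for the
                     checkpoint read lock before the timeout is reported. -->
                <property name="checkpointReadLockTimeout" value="30000" />

                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="persistenceEnabled" value="true" />
                        <property name="maxSize" value="#{4L * 1024 * 1024 * 1024}" />
                        <!-- Illustrative value (2 GB): buffer for copies of pages that
                             are updated while a checkpoint is writing them to disk. -->
                        <property name="checkpointPageBufferSize" value="#{2L * 1024 * 1024 * 1024}" />
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>
```

A larger checkpoint page buffer is additional off-heap memory, which runs against the goal of using as little memory as possible, so on a 16 GB machine write throttling is probably the cheaper setting to try first.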

On Mon, Mar 15, 2021 at 2:44 AM Stephen Darlington <
stephen.darlington@gridgain.com> wrote:

> What happens after you have 150M records? You say it’s not okay, but you
> don’t say what does happen. Does it crash, slow down, or something else?
>
> With SQL and persistence, I think you’d probably need more heap space.
> Increasing the number of partitions isn’t likely to help.

Re: Configuring persistence for large dataset on small cluster

Posted by Stephen Darlington <st...@gridgain.com>.

What happens after you have 150M records? You say it’s not okay, but you don’t say what does happen. Does it crash, slow down, or something else?

With SQL and persistence, I think you’d probably need more heap space. Increasing the number of partitions isn’t likely to help.
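
As context for the remark on partitions: RendezvousAffinityFunction uses 1024 partitions by default, and with persistence each partition is stored in its own file with its own metadata pages. On a single node, 8192 partitions therefore probably adds bookkeeping rather than capacity. A sketch of the cache bean with the default count stated explicitly (the cache name is taken from the original post, and the query entity section is omitted for brevity):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="name" value="qa_sqr_txn" />
        <property name="cacheMode" value="PARTITIONED" />
        <property name="backups" value="0" />
        <property name="affinity">
            <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
                <!-- 1024 is the library default; written out here only for clarity. -->
                <property name="partitions" value="1024" />
            </bean>
        </property>
    </bean>
</beans>
```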


Re: Configuring persistence for large dataset on small cluster

Posted by Ilya Kasnacheev <il...@gmail.com>.

Hello!

"Use as little memory as possible" is something Ignite is not optimized
for. As you limit the amount of memory in the data region and increase the
amount of data, eventually even the metadata pages (cache info, free lists,
indexes, etc.) will stop fitting in memory. Performance will then grind
to a halt, or you may even hit an IgniteOutOfMemoryException (IgniteOOM).

I guess that you will have to live with the limit that you have determined.

Regards,
-- 
Ilya Kasnacheev

