Posted to user@ignite.apache.org by KR Kumar <ig...@gmail.com> on 2019/11/16 03:16:45 UTC

Ignite data loss

Hi guys, I have a four-node cluster with native persistence enabled. It's a
partitioned cache with sync rebalance enabled. When I restart the cluster,
the first node that starts retains its data, while the data on all the other
nodes is deleted and all their Ignite data files are turned into 4096-byte
files. Am I missing something, or is there some configuration that I am missing?

Following is the cache configuration:

CacheConfiguration<Long, byte[]> cacheConfig = new CacheConfiguration<Long, byte[]>();
cacheConfig.setCacheMode(CacheMode.PARTITIONED);
cacheConfig.setRebalanceMode(CacheRebalanceMode.SYNC);
// cacheConfig.setRebalanceDelay(300000);
cacheConfig.setName("eventCache-" + tenantRunId + "-" + tenantId);
cacheConfig.setBackups(1);
cacheConfig.setAtomicityMode(CacheAtomicityMode.ATOMIC);
cacheConfig.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

IgniteCache<Long, byte[]> cache =
    IgniteContextWrapper.getInstance().getEngine().getOrCreateCache(cacheConfig);


Here is the Ignite configuration:

<property name="configuration">
  <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="workDirectory" value="${work.space}" />
    <property name="peerClassLoadingEnabled" value="true" />
    <property name="communicationSpi">
      <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
        <property name="connectTimeout" value="10000" />
        <property name="socketWriteTimeout" value="30000" />
      </bean>
    </property>
    <property name="failureDetectionTimeout" value="120000" />
    <property name="rebalanceThreadPoolSize" value="8" />
    <property name="publicThreadPoolSize" value="64" />
    <property name="systemThreadPoolSize" value="32" />
    <property name="dataStorageConfiguration">
      <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
          <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
            <property name="initialSize" value="#{512L * 1024 * 1024}" />
            <property name="maxSize" value="#{20L * 1024 * 1024 * 1024}" />
            <property name="persistenceEnabled" value="true" />
            <property name="checkpointPageBufferSize" value="#{1024 * 1024 * 1024}" />
          </bean>
        </property>
        <!-- <property name="checkpointFrequency" value="600000" /> -->
        <property name="pageSize" value="#{4 * 1024}" />
        <property name="storagePath" value="${grid.data}" />
        <property name="walPath" value="${grid.wal}" />
        <property name="walMode" value="BACKGROUND" />
        <property name="walFlushFrequency" value="10000" />
      </bean>
    </property>
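
The 4096-byte size of the affected files matches the pageSize of 4 * 1024 bytes configured above, which may mean that each such file holds no more than a single page (for example, only a header) and no row data. This is an assumption, not something confirmed in this thread. A minimal Java sketch for listing such files under a storage directory follows; the `part-*.bin` file-name pattern and the class name `EmptyPartitionScan` are assumptions made for illustration, and the root directory has to be supplied by the reader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical diagnostic helper; not part of Ignite.
final class EmptyPartitionScan {

    /**
     * Walks the directory tree under root and returns the files named
     * part-*.bin (assumed naming) whose size is exactly one page.
     */
    static List<Path> headerOnlyPartitions(Path root, long pageSize) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString();
                    return name.startsWith("part-") && name.endsWith(".bin");
                })
                .filter(p -> sizeOf(p) == pageSize)
                .sorted()
                .collect(Collectors.toList());
        }
    }

    // Files.size throws a checked exception, so wrap it for use in a lambda.
    private static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // First argument: the storage directory to scan (defaults to the current directory).
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : headerOnlyPartitions(root, 4096L)) {
            System.out.println(p);
        }
    }
}
```

Running it on each node before and after a restart would show which nodes end up with single-page files, which can then be compared against the startup logs.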

Any quick pointers?

Thanx and Regards,
KR Kumar

RE: Re: Ignite data loss

Posted by Alexandr Shapkin <le...@gmail.com>.
Hello!

Is it possible that you disabled the WAL before restarting the cluster?

Doing so may cause data loss in some scenarios.



From: Mikael <mikael-aronsson@telia.com>
Sent: Saturday, November 16, 2019 9:58 AM
To: user@ignite.apache.org
Subject: Re: Ignite data loss

Hi!

So it does not matter which node you restart? It's always that one that keeps
the data?

Are all 4 nodes part of the baseline topology?

I have pretty much the same setup, but with 3 nodes, and have not had any
problems at all, so I'm not sure what it could be.

If you turn on all logging, you will see everything it does at startup and can
see whether something weird is going on. It's a lot of information, but it
usually gives a good indication of any problem. Is there nothing else in the
logs? If the nodes clear everything at startup, there should be something in
the logs.

Mikael


Re: Ignite data loss

Posted by Mikael <mi...@telia.com>.
Hi!

So it does not matter which node you restart? It's always that one that
keeps the data?

Are all 4 nodes part of the baseline topology?

I have pretty much the same setup, but with 3 nodes, and have not had any
problems at all, so I'm not sure what it could be.

If you turn on all logging, you will see everything it does at startup and
can see whether something weird is going on. It's a lot of information, but
it usually gives a good indication of any problem. Is there nothing else
in the logs? If the nodes clear everything at startup, there should be
something in the logs.

Mikael


Re: Ignite data loss

Posted by "krkumar24061975@gmail.com" <kr...@gmail.com>.
Thanks for your replies, guys.

I am not sure whether I have really solved the problem, but this is how I
fixed it. Initially I was adding the nodes to the baseline topology through
code. I have removed that and am now adding the nodes through control.sh, in
which case I am not losing any data. I do not know the root cause of the
issue, but it's a temporary fix that's working for me for now.

The code that I was using to set the baseline topology is as follows:

// logger.info("Setting the baseline topology ...");
// engine.cluster().setBaselineTopology(engine.cluster().forServers().nodes());
// logger.info("Baseline topology is set");
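
One possible reading, offered as an assumption rather than a confirmed diagnosis: if the call above runs every time a node starts, then during a full cluster restart it first executes while only one server node is online, so the baseline would be set to that single node. A guard that only proceeds once every expected node has joined could be sketched in plain Java as follows. The class `BaselineGuard`, its methods, and the node IDs are hypothetical; in real code the online IDs would come from the consistent IDs of `engine.cluster().forServers().nodes()`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; it contains no Ignite calls, only the decision logic.
final class BaselineGuard {

    /** True only when every expected ID is among the IDs currently online. */
    static boolean allExpectedOnline(Collection<String> expectedIds, Collection<String> onlineIds) {
        return new HashSet<>(onlineIds).containsAll(expectedIds);
    }

    /** Expected IDs that have not joined yet, sorted for readable logging. */
    static Set<String> missing(Collection<String> expectedIds, Collection<String> onlineIds) {
        Set<String> result = new TreeSet<>(expectedIds);
        result.removeAll(onlineIds);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical consistent IDs of the four server nodes.
        List<String> expected = Arrays.asList("node-1", "node-2", "node-3", "node-4");
        // State right after the first node of a restarted cluster comes up.
        List<String> online = Arrays.asList("node-1");

        if (allExpectedOnline(expected, online)) {
            // Only at this point would the real code call
            // engine.cluster().setBaselineTopology(engine.cluster().forServers().nodes());
            System.out.println("All expected nodes are online; the baseline can be set.");
        } else {
            System.out.println("Not setting the baseline yet; waiting for: " + missing(expected, online));
        }
    }
}
```

Managing the baseline through control.sh, as described above, sidesteps the timing question entirely because the baseline is changed only when an operator asks for it.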

Thanx and Regards,
KR Kumar


