Posted to user@ignite.apache.org by Shiva Kumar <sh...@gmail.com> on 2019/09/20 10:21:35 UTC

index corrupted error : org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: Runtime failure on search row

Hi all,
I have deployed a 3-node Ignite cluster with native persistence on Kubernetes,
and one of the nodes crashed with the error message below:

*org.h2.message.DbException: General error: "class
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
Runtime failure on search row: Row@8cfe967[ key: epro_model_abcdKey
[idHash=822184780, hash=737706081, NE_ID=, NAME=], val: epro_model_abcd
[idHash=60444003, hash=1539928610, epro_ID=51, LONGITUDE=null,
DELETE_TIME=null, VENDOR=null, CREATE_TIME=2019-09-19T20:38:32.361929Z,
UPDATE_TIME=2019-09-19T20:40:05.821447Z, ADDITIONAL_INFO=null,
VALID_UNTIL=2019-11-18T20:38:32.362036Z, TYPE=null, LATITUDE=null], ver:
GridCacheVersion [topVer=180326822, order=1568925345552, nodeOrder=6] ][
51, 2019-09-19T20:38:32.361929Z, 2019-09-19T20:40:05.821447Z, null,
2019-11-18T20:38:32.362036Z, , , null, null, null, null, null ]"
[50000-197]|*

Please find the attached file [index_corruption.txt] for the complete backtrace.

It looks like the index got corrupted, but I am not sure what exactly caused
the corruption. Are there any known issues related to this?

In my cluster, many applications write to many tables simultaneously, some
queries run on many tables simultaneously, and the applications frequently
delete unwanted rows (old data) from the tables using the *delete from table*
SQL operation.

Re: index corrupted error : org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: Runtime failure on search row

Posted by Ilya Kasnacheev <il...@gmail.com>.

Hello!

It's recommended to upgrade to 2.7.6 because it contains persistence
corruption fixes.

Regards,
-- 
Ilya Kasnacheev


On Thu, Sep 26, 2019 at 12:04, Shiva Kumar <sh...@gmail.com> wrote:

> Hi Igor,
> Thanks for the response!
> The version I am using is 2.7.0
> Unfortunately, I do not have logs from all the nodes, but I do have more
> detailed logs (along with a thread dump) from the node that reported the
> index corruption, and I have attached them.
> Sorry, I can't share the persistence data at the moment.
> I have 4 cache groups, each containing many tables.
>
> Here are all the index.bin files under the persistence directory:
>
> [ignite@ignite-cluster-ignite-e-1 persistence]$
> [ignite@ignite-cluster-ignite-esoc-1 persistence]$ find /opt/ignite/persistence/ -name index.bin
> /opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/metastorage/index.bin
> /opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cache-ignite-sys-cache/index.bin
> /opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cache-PUBLIC/index.bin
> /opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupEternal/index.bin
> /opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupmin15/index.bin
> /opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupmin1/index.bin
> /opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupmin5/index.bin
> [ignite@ignite-cluster-ignite-e-1 persistence]$
>
>
> In this ticket, https://issues.apache.org/jira/browse/IGNITE-11252, the
> steps to recover from index corruption are documented, but what exactly
> caused the index corruption in my case is unknown.
>
> I think it would be great if, when an index gets corrupted for some reason,
> the node deleted the index and rebuilt it without shutting down.

Re: index corrupted error : org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: Runtime failure on search row

Posted by Shiva Kumar <sh...@gmail.com>.

Hi Igor,
Thanks for the response!
The version I am using is 2.7.0
Unfortunately, I do not have logs from all the nodes, but I do have more
detailed logs (along with a thread dump) from the node that reported the
index corruption, and I have attached them.
Sorry, I can't share the persistence data at the moment.
I have 4 cache groups, each containing many tables.

Here are all the index.bin files under the persistence directory:

[ignite@ignite-cluster-ignite-e-1 persistence]$
[ignite@ignite-cluster-ignite-esoc-1 persistence]$ find /opt/ignite/persistence/ -name index.bin
/opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/metastorage/index.bin
/opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cache-ignite-sys-cache/index.bin
/opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cache-PUBLIC/index.bin
/opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupEternal/index.bin
/opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupmin15/index.bin
/opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupmin1/index.bin
/opt/ignite/persistence/node00-a6103519-fb67-45fd-8646-2b6d8cfac53e/cacheGroup-groupmin5/index.bin
[ignite@ignite-cluster-ignite-e-1 persistence]$
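
For anyone who wants the same listing from a script (for example, to see which cache or cache-group directory has an index file and how large it is), here is a minimal Python sketch. The function name list_index_files is hypothetical; the only thing it assumes is the directory layout shown in the output above, with index.bin sitting directly inside each cache directory.

```python
from pathlib import Path


def list_index_files(persistence_root):
    """Return (cache_dir_name, size_in_bytes) for every index.bin under the root."""
    root = Path(persistence_root)
    found = []
    # Layout assumed from the listing above:
    #   <root>/<node folder>/<cache or cacheGroup dir>/index.bin
    for index_file in sorted(root.rglob("index.bin")):
        found.append((index_file.parent.name, index_file.stat().st_size))
    return found
```

Calling list_index_files("/opt/ignite/persistence") on the node would return one entry per line of the find output, keyed by directory name such as cacheGroup-groupmin5.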


In this ticket, https://issues.apache.org/jira/browse/IGNITE-11252, the
steps to recover from index corruption are documented, but what exactly
caused the index corruption in my case is unknown.
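
The ticket's steps are not reproduced in this thread, so the following is only a sketch under an assumption: that recovery amounts to stopping the affected node, moving the index.bin of the affected cache or cache group out of the way, and restarting the node so that it rebuilds the indexes. Please check the ticket for the authoritative procedure. The helper name set_aside_index and its parameters are hypothetical; it renames the file rather than deleting it, and defaults to a dry run.

```python
from pathlib import Path


def set_aside_index(persistence_root, node_folder, cache_dir, dry_run=True):
    """Rename <root>/<node_folder>/<cache_dir>/index.bin to index.bin.bak.

    Assumption: the Ignite node that owns these files is stopped.
    Returns the backup path, or None when there is no index.bin to move.
    """
    index_file = Path(persistence_root) / node_folder / cache_dir / "index.bin"
    if not index_file.is_file():
        return None
    backup = index_file.with_name("index.bin.bak")
    if backup.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"backup already exists: {backup}")
    if not dry_run:
        # Keep the old file instead of deleting it, so it can be restored.
        index_file.rename(backup)
    return backup
```

With dry_run=True the function only reports which file would be moved, which makes it safe to run first and confirm the target before touching anything.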

I think it would be great if, when an index gets corrupted for some reason,
the node deleted the index and rebuilt it without shutting down.


On Fri, Sep 20, 2019 at 4:19 PM Igor Belyakov <ig...@gmail.com>
wrote:

> Hi,
>
> Could you please clarify what version of Ignite you're currently using?
> Also, can you attach full logs from all nodes and, if possible, provide
> your persistence data for the cache with the corrupted index tree
> ("epro_model_abcd")?
> By default it should be in the ${IGNITE_HOME}/work/db/{node}/{cache} directory.
>
> Regards,
> Igor

Re: index corrupted error : org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: Runtime failure on search row

Posted by Igor Belyakov <ig...@gmail.com>.

Hi,

Could you please clarify what version of Ignite you're currently using?
Also, can you attach full logs from all nodes and, if possible, provide
your persistence data for the cache with the corrupted index tree
("epro_model_abcd")?
By default it should be in the ${IGNITE_HOME}/work/db/{node}/{cache} directory.

Regards,
Igor

On Fri, Sep 20, 2019 at 1:21 PM Shiva Kumar <sh...@gmail.com>
wrote:

> Hi all,
> I have deployed a 3-node Ignite cluster with native persistence on
> Kubernetes, and one of the nodes crashed with the error message below:
>
> *org.h2.message.DbException: General error: "class
> org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
> Runtime failure on search row: Row@8cfe967[ key: epro_model_abcdKey
> [idHash=822184780, hash=737706081, NE_ID=, NAME=], val: epro_model_abcd
> [idHash=60444003, hash=1539928610, epro_ID=51, LONGITUDE=null,
> DELETE_TIME=null, VENDOR=null, CREATE_TIME=2019-09-19T20:38:32.361929Z,
> UPDATE_TIME=2019-09-19T20:40:05.821447Z, ADDITIONAL_INFO=null,
> VALID_UNTIL=2019-11-18T20:38:32.362036Z, TYPE=null, LATITUDE=null], ver:
> GridCacheVersion [topVer=180326822, order=1568925345552, nodeOrder=6] ][
> 51, 2019-09-19T20:38:32.361929Z, 2019-09-19T20:40:05.821447Z, null,
> 2019-11-18T20:38:32.362036Z, , , null, null, null, null, null ]"
> [50000-197]|*
>
> Please find the attached file [index_corruption.txt] for the complete backtrace.
>
> It looks like the index got corrupted, but I am not sure what exactly caused
> the corruption. Are there any known issues related to this?
>
> In my cluster, many applications write to many tables simultaneously, some
> queries run on many tables simultaneously, and the applications frequently
> delete unwanted rows (old data) from the tables using the *delete from table*
> SQL operation.
>
>