Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2018/03/20 06:02:00 UTC
[jira] [Commented] (KUDU-1989) kudu-tserver met checksum mismatch after node crash and restart.
[ https://issues.apache.org/jira/browse/KUDU-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405847#comment-16405847 ]
Todd Lipcon commented on KUDU-1989:
-----------------------------------
Saw this issue again on a cluster. The tail of the metadata file looks like:
I0319 22:56:08.389683 71257 pb_util.cc:264] Reading PB with version 2 starting at offset 1955
33 block_id { id: 14957875 } op_type: DELETE timestamp_us: 1514256262030601
I0319 22:56:08.389689 71257 pb_util.cc:264] Reading PB with version 2 starting at offset 1989
Corruption: Data length checksum does not match: Incorrect checksum in file /data/4/kudu/data/a9264be259c44604a82726cdb04b9e09.metadata at offset 1993: Checksum does not match. Expected: 0. Actual: 1214729159
00007a0: fc42 a216 0000 0088 e846 650a 0909 333d .B.......Fe...3=
00007b0: e400 0000 0000 1002 1889 cae3 94d4 a6d8 ................
00007c0: 02b4 cdc4 bf00 0000 0000 0000 0000 0000 ................
00007d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00007e0: 0000 0000 0000 00 .......
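As a rough cross-check of the dump above, the record framing can be walked by hand. The sketch below assumes the version 2 container layout is a little-endian 32-bit length, a CRC-32C of those four length bytes, the payload, and a CRC-32C of the payload; that layout is an assumption here, not something stated in the log. The names DUMP, DUMP_BASE, crc32c and read_header are made up for illustration.

```python
import struct

# Bytes copied from the hex dump above; the first byte sits at file offset 0x7a0.
DUMP_BASE = 0x7A0
DUMP = bytes.fromhex(
    "fc42a21600000088e846650a0909333d"
    "e4000000000010021889cae394d4a6d8"
    "02b4cdc4bf0000000000000000000000"
    "00000000000000000000000000000000"
    "00000000000000"
)


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def read_header(file_offset: int):
    """Return (length, stored length checksum) for the record at file_offset."""
    return struct.unpack_from("<II", DUMP, file_offset - DUMP_BASE)


# Last readable record, reported at offset 1955.
# ASSUMED layout: [u32 length][u32 CRC of length][payload][u32 CRC of payload].
length, _ = read_header(1955)
next_offset = 1955 + 4 + 4 + length + 4

# Header at the next offset, where the log reports the failure.
bad_length, stored_crc = read_header(next_offset)
computed_crc = crc32c(struct.pack("<I", bad_length))
tail = DUMP[next_offset - DUMP_BASE:]

print(f"next record starts at {next_offset}")
print(f"stored length checksum (at {next_offset + 4}): {stored_crc}")
print(f"computed length checksum: {computed_crc}")
print(f"{len(tail)} trailing bytes, all zero: {not any(tail)}")
```

Under these assumptions the next record starts at 1989 and its length checksum field sits at 1993, matching the offsets in the log. The stored checksum is 0, while the CRC-32C of a zero length is 1214729159, the same "Expected" and "Actual" pair the error reports.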
The zero bytes start exactly at the protobuf record boundary (offset 1989). The mtime on this file is 2017-12-25 18:44:35.786898579, and 'last' shows the machine rebooted around that time:
reboot system boot 2.6.32-573.26.1. Mon Dec 25 18:49 - 12:54 (2+18:05)
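The timing can also be compared against the last readable record. The sketch below assumes timestamp_us is microseconds since the Unix epoch and that the host clock was at UTC-8; neither is stated in this comment, so treat the computed gap as indicative only.

```python
from datetime import datetime, timedelta, timezone

# timestamp_us from the last readable record in the metadata dump.
TIMESTAMP_US = 1514256262030601

# ASSUMPTION: microseconds since the Unix epoch. Integer arithmetic via
# timedelta avoids losing precision in a float.
last_record_utc = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
    microseconds=TIMESTAMP_US
)
print(last_record_utc.isoformat())

# ASSUMPTION: the host clock is at UTC-8; the source does not state the zone.
ASSUMED_ZONE = timezone(timedelta(hours=-8))

# File mtime from the comment, truncated to microseconds.
mtime_local = datetime(2017, 12, 25, 18, 44, 35, 786898, tzinfo=ASSUMED_ZONE)
gap = mtime_local - last_record_utc
print(gap)
```

With that zone assumption, the last intact record was written about 14 seconds before the file's mtime, and the reboot followed a few minutes later.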
> kudu-tserver met checksum mismatch after node crash and restart.
> ----------------------------------------------------------------
>
> Key: KUDU-1989
> URL: https://issues.apache.org/jira/browse/KUDU-1989
> Project: Kudu
> Issue Type: Bug
> Components: fs
> Reporter: zhangsong
> Priority: Major
>
> kudu-tserver version: 1.0.0
> 1. First, the node crashed.
> 2. When trying to restart the kudu-tserver, it could not be restarted successfully.
> 3. Log content in kudu-tserver.FATAL:
> "
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0421 16:01:09.283123 20127 tablet_server_main.cc:55] Check failed: _s.ok() Bad status: Corruption: Failed to load FS layout: Could not read records from container /export/servers/kudu/1.0-sp/tserver_data/data/a22af504ca16421aad511b14c51130a9: Data length checksum does not match: Incorrect checksum in file /export/servers/kudu/1.0-sp/tserver_data/data/a22af504ca16421aad511b14c51130a9.metadata at offset 753661: Checksum does not match. Expected: 843507848. Actual: 1699145864
> "
> Not sure whether this has already been reported, so filing it here.