Posted to user@hbase.apache.org by shanghaihyj <sh...@163.com> on 2018/05/17 07:35:08 UTC

Got Duplicate Records for the Same Row Key from a Snapshot

When we query a table by a particular row key, only one row is returned by HBase, which is expected.
However, when we query a snapshot of that same table by the same row key, five duplicate rows are returned. Why?




In the log of the master server, we see a snapshot-related error:
===================== ERROR START =====================
ERROR [master:sh-bs-3-b8-namenode-17-208:60000.archivedHFileCleaner] snapshot.SnapshotHFileCleaner: Exception while checking if files were valid, keeping them just in case.
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:hdfs://master1.hh:8020/hbase/.hbase-snapshot/.tmp/hb_anchor_original_total_7days_stat_1526423587063/.snapshotinfo
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:325)
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.getHFileNames(SnapshotReferenceUtil.java:328)
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner$1.filesUnderSnapshot(SnapshotHFileCleaner.java:85)
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.getSnapshotsInProgress(SnapshotFileCache.java:303)
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.getUnreferencedFiles(SnapshotFileCache.java:194)
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner.getDeletableFiles(SnapshotHFileCleaner.java:62)
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:233)
./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteEntries(CleanerChore.java:157)
...
===================== ERROR END =====================
We also found a JIRA issue that looks related to this error: https://issues.apache.org/jira/browse/HBASE-16464?attachmentSortBy=fileName


However, we have no proof that this error in the log is related to our problem of getting duplicate records from a snapshot.
Our HBase version is 0.98.18-hadoop2.


Could you give us a hint as to why we are getting duplicate records from the snapshot?








Re: Re: Re: Got Duplicate Records for the Same Row Key from a Snapshot

Posted by shanghaihyj <sh...@163.com>.
Thank you for the reply.


We just found that the comment on HBaseAdmin#snapshot(SnapshotDescription)
says: "Only a single snapshot should be taken at a time for an instance of HBase, or results may be undefined".


Currently, we occasionally take snapshots concurrently, which is incorrect according to that warning.
Could this be what is causing the redundant offline regions in our snapshot?
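That warning can be respected by funneling all snapshot requests through a single guard. Below is a minimal illustrative sketch (a toy Python model, not the HBase client API; SnapshotCoordinator and take_snapshot are hypothetical names) of serializing snapshot requests so that only one runs at a time:

```python
import threading

class SnapshotCoordinator:
    """Hypothetical guard that serializes snapshot requests so only one
    snapshot runs at a time, per the HBaseAdmin#snapshot warning."""

    def __init__(self):
        self._serialize = threading.Lock()   # held for the whole snapshot
        self._stats = threading.Lock()       # protects the counters below
        self._active = 0
        self.max_concurrent = 0              # lets us observe the guarantee

    def take_snapshot(self, name, do_snapshot):
        # Concurrent callers queue here instead of overlapping.
        with self._serialize:
            with self._stats:
                self._active += 1
                self.max_concurrent = max(self.max_concurrent, self._active)
            try:
                return do_snapshot(name)     # e.g. a call into the real admin API
            finally:
                with self._stats:
                    self._active -= 1
```

With a guard like this in front of snapshot creation, overlapping snapshot requests can be ruled out as a possible source of the undefined behavior.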









At 2018-05-23 01:58:57, "Saad Mufti" <sa...@gmail.com> wrote:
>I am not clear how your snapshot even succeeds if this is the case. The
>snapshot taking procedure includes  a check for consistency at the end and
>throws an exception on problems like this. I would run an hbck command on
>your table to check if there are any consistency errors. It also has repair
>options but you have to be careful with those. But running it in just
>checking mode doesn't change anything and will give you useful feedback.
>
>Hope this helps.
>
>----
>Saad

Re: Re: Got Duplicate Records for the Same Row Key from a Snapshot

Posted by Saad Mufti <sa...@gmail.com>.
I am not clear on how your snapshot even succeeds if this is the case. The
snapshot procedure includes a consistency check at the end and throws an
exception on problems like this. I would run an hbck command on your table
to check whether there are any consistency errors. hbck also has repair
options, but you have to be careful with those; running it in check-only
mode doesn't change anything and will give you useful feedback.

Hope this helps.

----
Saad


On Fri, May 18, 2018 at 3:56 AM, shanghaihyj <sh...@163.com> wrote:

> We find that the metadata of offline regions are included in the snapshot.
>
>
> When we query a table, offline regions are not considered.
> When we query a snapshot of this table, offline regions are included.
> These offline regions refer to the same data in HDFS.  That is why
> duplicate records are returned from the snapshot.
>
>
> Any suggestion how to handle this gracefully ?

Re: Re: Got Duplicate Records for the Same Row Key from a Snapshot

Posted by shanghaihyj <sh...@163.com>.
We found that the metadata of offline regions is included in the snapshot.


When we query the table, offline regions are not considered.
When we query a snapshot of the table, offline regions are included.
These offline regions refer to the same data in HDFS, which is why duplicate records are returned from the snapshot.


Any suggestions on how to handle this gracefully?
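The effect described above can be modeled in a few lines (a toy sketch, not HBase internals; the region dicts and scan_rows are made up for illustration): a snapshot manifest that still lists an offline split parent alongside its daughter hands the same underlying HDFS data to the scan twice, while a live-table scan skips the offline region.

```python
def scan_rows(regions, include_offline):
    """Concatenate rows from every scanned region."""
    rows = []
    for region in regions:
        if region["offline"] and not include_offline:
            continue  # a live-table scan skips offline regions (e.g. split parents)
        rows.extend(region["hfile_rows"])
    return rows

# One offline split parent plus its daughter, both referencing the same data.
regions = [
    {"name": "parent",   "offline": True,  "hfile_rows": ["row-1"]},
    {"name": "daughter", "offline": False, "hfile_rows": ["row-1"]},
]

table_scan    = scan_rows(regions, include_offline=False)  # like querying the table
snapshot_scan = scan_rows(regions, include_offline=True)   # like scanning the snapshot
# table_scan sees one copy of "row-1"; snapshot_scan sees two.
```

If four leftover offline regions plus the live one all referenced the same data, a model like this would similarly yield the five copies reported in this thread.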



At 2018-05-17 19:04:17, "shanghaihyj" <sh...@163.com> wrote:
>We are loading data from the HBase table or its snapshot by hbase-rdd (https://github.com/unicredit/hbase-rdd). It uses TableInputFormat / TableSnapshotInputFormat as the underlying input format.
>The scaner has max version set to 1.

Re: Got Duplicate Records for the Same Row Key from a Snapshot

Posted by shanghaihyj <sh...@163.com>.
We are loading data from the HBase table or its snapshot with hbase-rdd (https://github.com/unicredit/hbase-rdd), which uses TableInputFormat / TableSnapshotInputFormat as the underlying input format.
The scanner has max versions set to 1.
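Note that a max-versions cap cannot remove duplicates of this kind: it is applied per cell within each region, so identical rows arriving from two different regions both survive. A toy model (illustrative Python, not the actual HBase scan path; scan and the region dicts are hypothetical):

```python
def scan(regions, max_versions=1):
    """Each region yields its rows; the version cap is applied per cell,
    inside each region, before results are concatenated."""
    out = []
    for region in regions:
        for row, versions in region.items():
            newest_first = sorted(versions, reverse=True)  # (timestamp, value) pairs
            out.extend((row, value) for _, value in newest_first[:max_versions])
    return out

# Two timestamped versions inside ONE region collapse to the newest value:
one_region = [{"k1": [(1, "old"), (2, "new")]}]
# The same row served by TWO regions comes back twice despite max_versions=1:
two_regions = [{"k1": [(2, "new")]}, {"k1": [(2, "new")]}]
```

So a scan with max versions set to 1 collapses cell history but cannot deduplicate rows that are served once per region.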



At 2018-05-17 15:35:08, "shanghaihyj" <sh...@163.com> wrote:
