Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/09/01 06:55:32 UTC
[jira] Updated: (HBASE-1784) Missing rows after medium intensity insert

     [ https://issues.apache.org/jira/browse/HBASE-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1784:
-------------------------

    Attachment: 1784-v2.patch

So, this patch adds distrust of the view returned by BaseScanner (Scanners do not respect row locks in 0.20.0). Inside checkAssign, if the server address is null, we'll do a new Get to ensure it's still null just before we decide to set a region as unassigned. It also explicitly sets the BaseScanner caching to 1, in case caching is changed in hbase-site.xml, so that client-side configuration does not affect the BaseScanner running in the Master.
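
For anyone reviewing, here is a rough, self-contained sketch of the re-check idea only. It is not the code in the patch and does not use the HBase API: RecheckBeforeUnassign, ServerLookup and shouldUnassign are made-up names, and a plain Map stands in for the catalog table that the real Get would read.

```java
import java.util.HashMap;
import java.util.Map;

public class RecheckBeforeUnassign {

    /** Hypothetical stand-in for a point read (a Get) of a region's server address. */
    interface ServerLookup {
        String currentServer(String regionName); // null means no server recorded
    }

    /**
     * Decide whether a region should be marked unassigned.
     * scannedServer is the (possibly stale) value the scanner returned.
     */
    static boolean shouldUnassign(String regionName, String scannedServer, ServerLookup lookup) {
        if (scannedServer != null && !scannedServer.isEmpty()) {
            return false; // the scanner already saw an assignment
        }
        // The scanner saw null: distrust it and re-read just before deciding.
        String fresh = lookup.currentServer(regionName);
        if (fresh != null && !fresh.isEmpty()) {
            // Log the mismatch so the stale-scan case is visible when it happens.
            System.out.println("Scan saw null server for " + regionName
                + " but fresh read returned " + fresh);
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> catalog = new HashMap<>();
        catalog.put("regionA", "host1:60020");
        ServerLookup lookup = catalog::get;

        // Scanner reported null for both regions; only regionB is really unassigned.
        System.out.println(shouldUnassign("regionA", null, lookup)); // false
        System.out.println(shouldUnassign("regionB", null, lookup)); // true
    }
}
```

The point of the second read is only to narrow the window in which a stale scan result can trigger a double assignment; it does not close that window entirely.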

Patch includes the earlier patch for handling the split message not assigning a region already assigned.

It does not include Andrew's change (looks like that can be applied independently).

It's tough testing for this scenario. The patch logs if it comes across the issue where the Scan sees null but the Get actually returns a value.

Any chance of a review and, if it looks good to you, trying it on your upload, Mathias?

Thanks boss.

> Missing rows after medium intensity insert
> ------------------------------------------
>
>                 Key: HBASE-1784
>                 URL: https://issues.apache.org/jira/browse/HBASE-1784
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 1784-v2.patch, 1784.patch, DataLoad.java, dbl-assignment-20090831, double-assignment, HBASE-1784-StoreFileScanner-hack.patch, HBASE-1784.log, META.log, processSplitRegion-check-regionIsOpening.patch
>
>
> This bug was uncovered by Mathias in his mail "Issue on data load with 0.20.0-rc2". Basically, somehow, after a medium intensity insert a lot of rows go missing. Easy way to reproduce: PE. Doing a PE scan or randomRead afterwards won't uncover anything since it doesn't care about null rows. Simply do a count in the shell, easy to test (I changed my scanner caching in the shell to do it faster).
> I tested some light insertions with force flush/compact/split in the shell and it doesn't break.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.