Posted to issues@hbase.apache.org by "Jonathan Hsieh (Created) (JIRA)" <ji...@apache.org> on 2011/10/11 01:06:29 UTC

[jira] [Created] (HBASE-4570) Scan ACID problem with concurrent puts.

Scan ACID problem with concurrent puts.
---------------------------------------

                 Key: HBASE-4570
                 URL: https://issues.apache.org/jira/browse/HBASE-4570
             Project: HBase
          Issue Type: Bug
          Components: client, regionserver
    Affects Versions: 0.90.3, 0.90.1
            Reporter: Jonathan Hsieh


When scanning a table, rows that have multiple column families are sometimes split into two rows if there are concurrent writes.  In this particular case we are writing the contents of a Get straight back onto the same row as a Put.

For example, this is a ten-cf row (with "f0", "f1", .. "f9" cfs).  It is actually returned as two rows (#55 and #56). Interestingly, if the two were merged we would have a single proper row.

Row row0000024461 had time stamps: [55: keyvalues={row0000024461/f0:data/1318200440867/Put/vlen=1000, row0000024461/f0:qual/1318200440867/Put/vlen=10, row0000024461/f1:data/1318200440867/Put/vlen=1000, row0000024461/f1:qual/1318200440867/Put/vlen=10, row0000024461/f2:data/1318200440867/Put/vlen=1000, row0000024461/f2:qual/1318200440867/Put/vlen=10, row0000024461/f3:data/1318200440867/Put/vlen=1000, row0000024461/f3:qual/1318200440867/Put/vlen=10, row0000024461/f4:data/1318200440867/Put/vlen=1000, row0000024461/f4:qual/1318200440867/Put/vlen=10}, 
56: keyvalues={row0000024461/f5:data/1318200440867/Put/vlen=1000, row0000024461/f5:qual/1318200440867/Put/vlen=10, row0000024461/f6:data/1318200440867/Put/vlen=1000, row0000024461/f6:qual/1318200440867/Put/vlen=10, row0000024461/f7:data/1318200440867/Put/vlen=1000, row0000024461/f7:qual/1318200440867/Put/vlen=10, row0000024461/f8:data/1318200440867/Put/vlen=1000, row0000024461/f8:qual/1318200440867/Put/vlen=10, row0000024461/f9:data/1318200440867/Put/vlen=1000, row0000024461/f9:qual/1318200440867/Put/vlen=10}]

I've only tested this on 0.90.1+patches and 0.90.3+patches, but it is consistent and reproducible.
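
A check for this symptom can be sketched without any HBase dependency. The sketch below assumes the scan results have already been reduced to their row keys, in the order the scanner returned them. The class and method names are hypothetical and are not taken from the attached test programs. In a correct scan each row key appears exactly once, so a key seen in two adjacent results indicates a split row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: finds row keys that a scan
// returned as two (or more) adjacent results instead of one.
final class SplitRowDetector {

  // rowKeys holds the row key of each scan result, in scan order.
  static List<String> findSplitRows(List<String> rowKeys) {
    List<String> split = new ArrayList<>();
    String previous = null;
    for (String key : rowKeys) {
      // A scan returns rows in sorted order, so a repeated key is adjacent.
      if (key.equals(previous) && !split.contains(key)) {
        split.add(key);
      }
      previous = key;
    }
    return split;
  }

  public static void main(String[] args) {
    // Results #55 and #56 from the report both carry row0000024461;
    // the neighbouring keys are made up for illustration.
    List<String> keys = Arrays.asList(
        "row0000024460", "row0000024461", "row0000024461", "row0000024462");
    System.out.println(findSplitRows(keys)); // prints [row0000024461]
  }
}
```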

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Todd Lipcon (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HBASE-4570.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0
         Assignee: Jonathan Hsieh
     Hadoop Flags: Reviewed

Fixed in the 0.90, 0.92, and trunk branches.
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128033#comment-13128033 ] 

Jonathan Ellis commented on HBASE-4570:
---------------------------------------

bq. we could use an AtomicFieldUpdater with lazySet to put the cost only on the write side

I thought that (a) AtomicReferenceFieldUpdater (ARFU) still requires its target field to be volatile, and (b) the point of lazySet was to allow cheaper writes, with no effect on reads.
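
Both points can be checked against java.util.concurrent.atomic directly. The sketch below uses a hypothetical Holder class (not HBase code) to show that AtomicReferenceFieldUpdater.newUpdater rejects a field that is not declared volatile, and that lazySet is purely a write-side call on a field that is still declared, and read, as volatile.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical class for illustration only; not taken from HBase.
final class Holder {
  volatile byte[] cached;   // volatile, so an updater can target it
  byte[] plain;             // not volatile, so an updater cannot target it

  static final AtomicReferenceFieldUpdater<Holder, byte[]> CACHED =
      AtomicReferenceFieldUpdater.newUpdater(Holder.class, byte[].class, "cached");

  // Returns true if an updater can be created for the named field.
  static boolean canCreateUpdater(String fieldName) {
    try {
      AtomicReferenceFieldUpdater.newUpdater(Holder.class, byte[].class, fieldName);
      return true;
    } catch (IllegalArgumentException e) {
      // newUpdater throws this when the field is not volatile.
      return false;
    }
  }

  // Publishes an already-filled array through the updater's lazySet.
  void publish(byte[] value) {
    CACHED.lazySet(this, value);
  }

  public static void main(String[] args) {
    System.out.println(canCreateUpdater("cached")); // prints true
    System.out.println(canCreateUpdater("plain"));  // prints false
  }
}
```

Readers in this sketch still access the field as an ordinary volatile read; only the store goes through lazySet.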
                


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128031#comment-13128031 ] 

Ted Yu commented on HBASE-4570:
-------------------------------

Using the patch for HBASE-4485 (including HBASE-2856, without variable-length memstoreTS) combined with hbase-4570.txt, I still got:
{code}
Tests in error: 
  testScanAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): Deferred
  testMixedAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): Deferred
...
TestAcidGuarantees failed, iteration: 3
{code}
But this is some progress - previously TestAcidGuarantees failed every time.
                


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127162#comment-13127162 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

I can still run these and see ACID failures on today's trunk with git hash b45dfec.

I've also tried a build that applies HBASE-2856 v11 (https://reviews.apache.org/r/2224/diff/#index_header); it still has the same problem.



                


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129026#comment-13129026 ] 

Todd Lipcon commented on HBASE-4570:
------------------------------------

Cool, I will commit this to 0.90, 0.92, and trunk momentarily.
                


        

[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4570:
----------------------------------

    Attachment: hbase-4570.tgz

I've attached a file with some standalone programs that generate data, scan+count, and scan+twiddle.  It includes instructions on how to duplicate the problem.

I've tried duplicating the problem in a unit test but have not been able to reproduce it as reliably.
                


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128008#comment-13128008 ] 

Todd Lipcon commented on HBASE-4570:
------------------------------------

Jon and I spent the afternoon with his test cases. We've found the issue - it's a nice one!

In KeyValue, we have the following code:
{code}
  public byte [] getRow() {
    if (rowCache == null) {
      int o = getRowOffset();
      short l = getRowLength();
      rowCache = new byte[l];
      System.arraycopy(getBuffer(), o, rowCache, 0, l);
    }
    return rowCache;
  }
{code}
which is called extensively by KeyValueHeaps throughout the scanner code. In the case of scanning MemStore, an individual KeyValue ends up as {{next}} in multiple MemStoreScanners. Then, if multiple threads call {{getRow}} at the same time, we see the following race:
- Thread 1 sees {{rowCache}} as null, and assigns {{rowCache = new byte[...]}}
- Thread 2 sees {{rowCache}} as non-null, and returns the byte array while it is still all 0s
- Thread 1 fills the array with {{System.arraycopy}}, and returns the right result

The byte array returned to Thread 2 is modified while it's working with it, so depending on the interleaving of events, it can cause an invalid heap, or invalid results, or a weird split row like Jon was seeing, etc.

The fix is pretty simple - we need to declare {{rowCache}} volatile, and build the array in a temporary variable before assigning it to the volatile reference. If this is too slow, we could use an AtomicReferenceFieldUpdater with {{lazySet}} to put the cost only on the write side, but I don't think it really matters.
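
The fix described above can be sketched as follows. This is a simplified stand-in for KeyValue, not the committed patch (hbase-4570.txt): the class name, the constructor, and every field other than {{rowCache}} are assumptions made to keep the example self-contained.

```java
import java.nio.charset.StandardCharsets;

// Simplified, hypothetical stand-in for KeyValue; only rowCache and
// getRow() mirror names from the code quoted above.
final class CachedRowKeyValue {
  private final byte[] buffer;
  private final int rowOffset;
  private final short rowLength;

  // volatile: a reader that sees a non-null reference also sees the
  // array contents written before the reference was published.
  private volatile byte[] rowCache;

  CachedRowKeyValue(byte[] buffer, int rowOffset, short rowLength) {
    this.buffer = buffer;
    this.rowOffset = rowOffset;
    this.rowLength = rowLength;
  }

  public byte[] getRow() {
    byte[] local = rowCache;            // single volatile read
    if (local == null) {
      // Fill the copy in a temporary first ...
      local = new byte[rowLength];
      System.arraycopy(buffer, rowOffset, local, 0, rowLength);
      // ... and only then publish it, so no thread can observe zeros.
      rowCache = local;
    }
    return local;
  }

  public static void main(String[] args) {
    byte[] buf = "xxrow0000024461yy".getBytes(StandardCharsets.US_ASCII);
    CachedRowKeyValue kv = new CachedRowKeyValue(buf, 2, (short) 13);
    System.out.println(new String(kv.getRow(), StandardCharsets.US_ASCII));
    // prints row0000024461
  }
}
```

Two threads may still both find the cache empty and each build a copy. In this sketch that is harmless, because both copies hold the same bytes and whichever is published last simply replaces the other.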

                


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128867#comment-13128867 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

@stack I've tested on trunk and a 0.90 branch, and the symptoms encountered with the testing programs are fixed.  It would be great to get this into 0.90, 0.92, and trunk.  Thanks!
                


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127887#comment-13127887 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

@Ted

I have a strange situation where, with just the fixes (first two patches, no instrumentation), I still get a lot of failures in my test setup.  However, with extra instrumentation the failures seem to go away (it runs a long time without encountering problems).  Note that in my table setup I have 10 cf's, each with 2 cols, so the instrumentation is written to always expect 20 KVs.  I have two processes: one that does a filtered scan and twiddle, and another that just does a filtered scan and count.
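
The invariant that such instrumentation relies on can be sketched in isolation. The sketch below is not the attached instrumentation: it assumes each scan result has been reduced to its KeyValue count, and the class and method names are hypothetical. With 10 column families of 2 columns each, every result should carry exactly 20 KeyValues.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker; not the instrumentation attached to this issue.
final class RowWidthChecker {
  // 10 column families x 2 columns each, as in the table setup above.
  static final int EXPECTED_KVS = 10 * 2;

  // kvCounts[i] is the number of KeyValues in the i-th scan result.
  // Returns the indexes of results that do not have the expected width.
  static List<Integer> findBadResults(int[] kvCounts) {
    List<Integer> bad = new ArrayList<>();
    for (int i = 0; i < kvCounts.length; i++) {
      if (kvCounts[i] != EXPECTED_KVS) {
        bad.add(i);
      }
    }
    return bad;
  }

  public static void main(String[] args) {
    // A split row shows up as two results of 10 KeyValues each.
    int[] counts = {20, 10, 10, 20};
    System.out.println(findBadResults(counts)); // prints [1, 2]
  }
}
```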

I ran TestAcidGuarantees in a loop on the instrumented version.  It eventually failed :(

{code}
Tests in error:
  testScanAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): Deferred
  testMixedAtomicity(org.apache.hadoop.hbase.TestAcidGuarantees): org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@54697123 closed
{code}

With the instrumented version, TestAcidGuarantees still fails.
It took about 10 iterations before this happened.

{code}
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 127.479 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 121.662 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.508 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 124.208 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 121.513 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.472 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 117.869 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 120.435 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 118.946 sec
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 85.81 sec <<< FAILURE!
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0
{code}

                

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127780#comment-13127780 ] 

Ted Yu commented on HBASE-4570:
-------------------------------

@Jonathan:
Can you either post the combined patch or run TestAcidGuarantees in a loop?
Your findings may give us a clue for pushing HBASE-2856 forward.

Thanks
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: hbase-4570.tgz
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128016#comment-13128016 ] 

Todd Lipcon commented on HBASE-4570:
------------------------------------

(btw, the unsafe comparator was probably just a red herring: it's faster, so the race is more likely, but I'm pretty sure the above is the true cause)
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt
>
>


        

[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4570:
--------------------------

    Fix Version/s: 0.90.5
    
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>             Fix For: 0.90.5
>
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126001#comment-13126001 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

rephrase: I have not been able to duplicate this in a unit test yet.  

This test scenario seems similar to TestAcidGuarantees (HBASE-2856), but it uses filters and seems a little more focused on this particular symptom.
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: hbase-4570.tgz
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126644#comment-13126644 ] 

stack commented on HBASE-4570:
------------------------------

Good on you Jon.  Keep digging (smile).
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: hbase-4570.tgz
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127774#comment-13127774 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

The way this is set up, I can't tell whether the problem will never happen, but I can detect it if it ever does.

I'm still experimenting on trunk and will move to previous versions when I feel confident about this potential root cause.  I'm using a combo of HBASE-2856 on trunk and reverting to the Java comparator; it might be the combo of the two that is required.
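
For readers following along, the kind of detection described above can be sketched without any HBase dependency. This is only an illustration, not the test in hbase-4570.tgz: the class name SplitRowDetector and the method findSplitRows are hypothetical, and the input is assumed to be the row keys of the results in the order the scanner returned them. A row that was chopped into two results shows up as two adjacent results with the same row key.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: flags scan results that repeat
// the row key of the result immediately before them.
public class SplitRowDetector {

    // Returns every index i where result i has the same row key as result i-1.
    // In a correct scan each row key appears in exactly one result, so the
    // returned list should be empty.
    public static List<Integer> findSplitRows(List<byte[]> rowKeysInScanOrder) {
        List<Integer> splits = new ArrayList<>();
        for (int i = 1; i < rowKeysInScanOrder.size(); i++) {
            byte[] previous = rowKeysInScanOrder.get(i - 1);
            byte[] current = rowKeysInScanOrder.get(i);
            if (Arrays.equals(previous, current)) {
                splits.add(i);
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Row row0000024461 appears twice in a row, as in the report.
        List<byte[]> keys = Arrays.asList(
            "row0000024460".getBytes(StandardCharsets.UTF_8),
            "row0000024461".getBytes(StandardCharsets.UTF_8),
            "row0000024461".getBytes(StandardCharsets.UTF_8),
            "row0000024462".getBytes(StandardCharsets.UTF_8));
        System.out.println(findSplitRows(keys)); // prints [2]
    }
}
```

A check like this can only report a split when one happens; an empty result over any finite run says nothing about whether the race is gone.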

                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: hbase-4570.tgz
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126369#comment-13126369 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

I ran the unit test version of this test, and after 3-4 hours it had not failed the way the separate programs did.


                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: hbase-4570.tgz
>
>


        

[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-4570:
----------------------------------

    Attachment: 4570-instrumentation.tgz

4570-instrumentation.tgz includes a few incremental patches: the first applies v11 of HBASE-2856, the second comments out the use of sun.misc.Unsafe, and the others add instrumentation around the RS's internal scanner's next and row-delimiting functions.
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127725#comment-13127725 ] 

Todd Lipcon commented on HBASE-4570:
------------------------------------

woah, that's interesting... but I thought you could reproduce this on 0.90.4 where the UNSAFE_COMPARER doesn't exist?
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: hbase-4570.tgz
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129067#comment-13129067 ] 

Hudson commented on HBASE-4570:
-------------------------------

Integrated in HBase-TRUNK #2331 (See [https://builds.apache.org/job/HBase-TRUNK/2331/])
    HBASE-4570. Fix a race condition that could cause inconsistent results from scans during concurrent writes.

todd : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java

                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt
>
>


        

[jira] [Updated] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-4570:
-------------------------------

    Attachment: hbase-4570.txt
    
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt
>
>


        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127711#comment-13127711 ] 

Jonathan Hsieh commented on HBASE-4570:
---------------------------------------

Current experiment seems to indicate that Bytes.equals, when it uses the UNSAFE_COMPARER class, doesn't always tell the truth, and causes scan rows to get chopped up into two rows.  I've modified the code to use the PureJavaComparer and the described problem hasn't appeared yet (running for 30 mins or so).
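
For readers unfamiliar with the two comparer styles, here is a minimal sketch of what a pure-Java byte-array comparison looks like: a byte-by-byte loop that treats each byte as unsigned. The class and method names below (PureJavaByteCompare, compareTo, bytesEqual) are hypothetical and are not the HBase Bytes code; the sketch only illustrates the technique that the experiment swapped in, and it says nothing about whether the unsafe comparer is actually at fault.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is NOT the HBase Bytes implementation.
final class PureJavaByteCompare {

    /** Lexicographic comparison that treats every byte as unsigned (0..255). */
    static int compareTo(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff; // mask so that 0x80 sorts after 0x7f
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes match, so the shorter array sorts first.
        return aLen - bLen;
    }

    /** Equality expressed through the comparison above. */
    static boolean bytesEqual(byte[] a, byte[] b) {
        if (a == b) {
            return true;
        }
        if (a == null || b == null || a.length != b.length) {
            return false;
        }
        return compareTo(a, 0, a.length, b, 0, b.length) == 0;
    }

    public static void main(String[] args) {
        byte[] r1 = "row0000024461".getBytes(StandardCharsets.UTF_8);
        byte[] r2 = "row0000024461".getBytes(StandardCharsets.UTF_8);
        byte[] r3 = "row0000024462".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytesEqual(r1, r2));
        System.out.println(compareTo(r1, 0, r1.length, r3, 0, r3.length) < 0);
    }
}
```

A scanner decides whether the next KeyValue belongs to the current row with a row-key equality check of this kind, so a comparison that wrongly reports "not equal" would end the row early and start a second one.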
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: hbase-4570.tgz
>
>

        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127995#comment-13127995 ] 

Ted Yu commented on HBASE-4570:
-------------------------------

I reproduced the above test failure using the patch for 4485 (including 2856) combined with 0002-Only-use-safe-java-comparator-don-t-use-sun.misc.Uns.patch.
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz
>
>

        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128065#comment-13128065 ] 

Todd Lipcon commented on HBASE-4570:
------------------------------------

bq. I thought that (a) ARFU requires that its target be volatile still, and (b) that the point of lazySet was to allow cheaper writes, with no effect on reads.

I don't think it requires a volatile target - it just treats the target as having part of the volatile semantics for the particular update in question. The trick here would be that we don't need an up-to-date read whenever we read the field in order for lazy initialization to work. If a second thread recomputes the same array copy, that's fine. We only need to make sure that the writes happen in the correct order (i.e., the reference to the byte array isn't published before the byte array itself has been copied).
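
A minimal sketch of the lazy-initialization idea discussed above, using java.util.concurrent.atomic.AtomicReferenceFieldUpdater and its lazySet method. The class CachedRowHolder and its fields are hypothetical and are not taken from HBase or from the attached patch. One caveat on point (a): in the JDK, AtomicReferenceFieldUpdater.newUpdater rejects a field that is not declared volatile, so in this sketch plain reads of the field are volatile reads; lazySet only relaxes the write side.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical class for illustration; names are not taken from HBase.
final class CachedRowHolder {
    private final byte[] buffer;
    private final int rowOffset;
    private final int rowLength;

    // newUpdater throws IllegalArgumentException unless this field is volatile.
    private volatile byte[] rowCache;

    private static final AtomicReferenceFieldUpdater<CachedRowHolder, byte[]> ROW_CACHE =
        AtomicReferenceFieldUpdater.newUpdater(CachedRowHolder.class, byte[].class, "rowCache");

    CachedRowHolder(byte[] buffer, int rowOffset, int rowLength) {
        this.buffer = buffer;
        this.rowOffset = rowOffset;
        this.rowLength = rowLength;
    }

    /** Returns a copy of the row bytes, computing and caching it on first use. */
    byte[] getRow() {
        byte[] cached = rowCache;
        if (cached == null) {
            // Fill the copy completely BEFORE publishing the reference.
            cached = Arrays.copyOfRange(buffer, rowOffset, rowOffset + rowLength);
            // lazySet is an ordered store: the array contents written above
            // cannot be reordered after the publication of the reference.
            ROW_CACHE.lazySet(this, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        byte[] buf = {0, 0, 'r', 'o', 'w', 0};
        CachedRowHolder holder = new CachedRowHolder(buf, 2, 3);
        System.out.println(new String(holder.getRow()));
    }
}
```

Two threads that race through getRow() may each build their own copy, which is harmless because both copies hold the same bytes. What the ordered write rules out is another thread seeing a non-null reference to an array that has not been filled in yet.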
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt
>
>

        

[jira] [Commented] (HBASE-4570) Scan ACID problem with concurrent puts.

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128073#comment-13128073 ] 

stack commented on HBASE-4570:
------------------------------

@Jon and Todd -- Nice find.  I'm +1 on applying the patch as is.  If it is 'too slow', we can come back around later with AtomicFieldUpdater kung-fu.
                
> Scan ACID problem with concurrent puts.
> ---------------------------------------
>
>                 Key: HBASE-4570
>                 URL: https://issues.apache.org/jira/browse/HBASE-4570
>             Project: HBase
>          Issue Type: Bug
>          Components: client, regionserver
>    Affects Versions: 0.90.1, 0.90.3
>            Reporter: Jonathan Hsieh
>         Attachments: 4570-instrumentation.tgz, hbase-4570.tgz, hbase-4570.txt
>
>