Posted to issues@hbase.apache.org by "Nicholas Telford (Commented) (JIRA)" <ji...@apache.org> on 2012/02/01 17:34:58 UTC

[jira] [Commented] (HBASE-4966) Put/Delete values cannot be tested with MRUnit

    [ https://issues.apache.org/jira/browse/HBASE-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197938#comment-13197938 ] 

Nicholas Telford commented on HBASE-4966:
-----------------------------------------

I'm working on a sensible implementation and I have a question.

Currently, {{KeyValue#equals(Object)}} returns true if both KeyValues have the same row, irrespective of all other fields (family, qualifier, value, timestamp, etc.).

This appears to be for the convenience case of using List<KeyValue>#contains(KeyValue) to check for an existing KeyValue for a row.

The problem I have with this is that it violates the method contract of Object#hashCode() which states: 

bq. If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result. 

Since the {{KeyValue#hashCode()}} implementation is derived from {{KeyValue#getBuffer()}}, two KVs with the same key but different values would be considered equal but yield different hashCodes.
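
To illustrate why this matters, here is a minimal, self-contained sketch. {{RowOnlyCell}} and {{HashContractDemo}} are hypothetical names invented for this example; the class only mimics the behaviour described above (equality on the row, hash over row and value together) and is not the actual KeyValue code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for KeyValue; NOT the real HBase class.
final class RowOnlyCell {
  private final byte[] row;
  private final byte[] value;

  RowOnlyCell(String row, String value) {
    this.row = row.getBytes(StandardCharsets.UTF_8);
    this.value = value.getBytes(StandardCharsets.UTF_8);
  }

  // Assumption: equality looks at the row only, as described above.
  @Override
  public boolean equals(Object other) {
    return other instanceof RowOnlyCell
        && Arrays.equals(row, ((RowOnlyCell) other).row);
  }

  // Assumption: the hash covers row and value, like hashing the whole buffer.
  @Override
  public int hashCode() {
    return 31 * Arrays.hashCode(row) + Arrays.hashCode(value);
  }
}

final class HashContractDemo {
  // List#contains relies on equals() alone, so a same-row cell is "found".
  static boolean foundInList(RowOnlyCell stored, RowOnlyCell probe) {
    List<RowOnlyCell> list = new ArrayList<>();
    list.add(stored);
    return list.contains(probe);
  }

  // HashSet#contains compares hash codes first, so the same probe can be missed.
  static boolean foundInHashSet(RowOnlyCell stored, RowOnlyCell probe) {
    Set<RowOnlyCell> set = new HashSet<>();
    set.add(stored);
    return set.contains(probe);
  }
}
```

With a stored cell for row "r1" and value "v1" and a probe for row "r1" and value "v2", {{foundInList}} returns true while {{foundInHashSet}} returns false, even though the two cells are equal according to {{equals(Object)}}.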

I can probably work around this, and I imagine it's out of the scope of this ticket to change it, but wouldn't it be a better idea to derive equality from all the KV fields and encapsulate the common use case for {{List<KeyValue>#contains(KeyValue)}} somewhere else? Perhaps a sub-class of List that simply provides this useful facility:

{code:java}
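import java.util.ArrayList;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Bytes;

class KVList extends ArrayList<KeyValue> {
  // Returns true if any KeyValue in this list belongs to the given row.
  public boolean containsRow(byte[] row) {
    for (KeyValue kv : this) {
      if (Bytes.equals(kv.getRow(), row)) {
        return true;
      }
    }
    return false;
  }
}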
{code}
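
For the other half of the suggestion, deriving equality from all the KV fields, the following is a rough sketch of what a consistent {{equals}}/{{hashCode}} pair could look like. {{FullCell}} is a hypothetical value class written only for illustration; its field layout is an assumption and it does not reflect how KeyValue stores its data.

```java
import java.util.Arrays;

// Hypothetical value class; NOT the real KeyValue.
final class FullCell {
  private final byte[] row;
  private final byte[] family;
  private final byte[] qualifier;
  private final long timestamp;
  private final byte[] value;

  FullCell(byte[] row, byte[] family, byte[] qualifier, long timestamp, byte[] value) {
    // Defensive copies so later changes to the caller's arrays cannot alter equality.
    this.row = row.clone();
    this.family = family.clone();
    this.qualifier = qualifier.clone();
    this.timestamp = timestamp;
    this.value = value.clone();
  }

  // Equality is derived from every field, including the value.
  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof FullCell)) {
      return false;
    }
    FullCell that = (FullCell) other;
    return timestamp == that.timestamp
        && Arrays.equals(row, that.row)
        && Arrays.equals(family, that.family)
        && Arrays.equals(qualifier, that.qualifier)
        && Arrays.equals(value, that.value);
  }

  // The hash uses exactly the fields that equals() compares, so equal
  // objects always produce the same hash code.
  @Override
  public int hashCode() {
    int result = Arrays.hashCode(row);
    result = 31 * result + Arrays.hashCode(family);
    result = 31 * result + Arrays.hashCode(qualifier);
    result = 31 * result + Long.hashCode(timestamp);
    result = 31 * result + Arrays.hashCode(value);
    return result;
  }
}
```

With equality defined this way, a row-only lookup no longer falls out of {{contains}} for free, which is why a dedicated helper such as {{containsRow}} would be needed alongside it.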
                
> Put/Delete values cannot be tested with MRUnit
> ----------------------------------------------
>
>                 Key: HBASE-4966
>                 URL: https://issues.apache.org/jira/browse/HBASE-4966
>             Project: HBase
>          Issue Type: Bug
>          Components: client, mapreduce
>    Affects Versions: 0.90.4
>            Reporter: Nicholas Telford
>            Priority: Minor
>
> When using the IdentityTableReducer, which expects input values of either a Put or Delete object, testing the Mapper with MRUnit is not possible because neither Put nor Delete implements equals().
> We should implement equals() on both such that equality means:
> * Both objects are of the same class (in this case, Put or Delete)
> * Both objects are for the same key.
> * Both objects contain an equal set of KeyValues (applicable only to Put)
> KeyValue.equals() appears to already be implemented, but only checks for equality of row key, column family and column qualifier - two KeyValues can be considered "equal" even if they contain different values. This won't work for testing.
> Instead, the Put.equals() and Delete.equals() implementations should do a "deep" equality check on their KeyValues, like this:
> {code:java}
> myKv.equals(theirKv) && Bytes.equals(myKv.getValue(), theirKv.getValue());
> {code}
> NOTE: This would impact any code that relies on the existing "identity" implementation of Put.equals() and Delete.equals(), therefore cannot be guaranteed to be backwards-compatible.
