Posted to issues@metamodel.apache.org by "Gerard Dellemann (JIRA)" <ji...@apache.org> on 2018/06/19 09:54:00 UTC

[jira] [Created] (METAMODEL-1187) What to do with a input value NULL, when updating a HBase table

Gerard Dellemann created METAMODEL-1187:
-------------------------------------------

             Summary: What to do with a input value NULL, when updating a HBase table
                 Key: METAMODEL-1187
                 URL: https://issues.apache.org/jira/browse/METAMODEL-1187
             Project: Apache MetaModel
          Issue Type: Improvement
            Reporter: Gerard Dellemann


With HBase there's no difference between inserting and updating a row. An update is simply an insert on a row with the same rowKey. HBase will then overwrite the existing cells if the column family and qualifier match.

With HBase you don't execute the insertion of a cell if the value is NULL.

What do we do if a cell has a value (e.g. address:housnr = 1) and it gets updated by an input column that has NULL as its value? In the current implementation, we don't execute the insertion. This leaves the cell with its old value (e.g. address:housnr = 1) and an older timestamp than the other cells, if those do get updated. This will probably result in unexpected behaviour for someone reading the table after the insertion: if you set something to NULL, you don't expect a value to still exist afterwards.
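
To make the problem concrete, here is a minimal sketch that reproduces it with an in-memory map. It is not the HBase client API and not the MetaModel code: the class StaleCellSketch, its put/get methods and the address:street column are hypothetical names chosen for illustration, and the model keeps only the latest cell per column.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Simplified in-memory stand-in for one HBase table; NOT the HBase client API. */
class StaleCellSketch {

    /** A stored value together with the timestamp of the write that produced it. */
    static final class Cell {
        final byte[] value;
        final long timestamp;

        Cell(byte[] value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // rowKey -> ("family:qualifier" -> latest cell)
    private final Map<String, Map<String, Cell>> rows = new TreeMap<>();

    /** Insert and update are the same operation: cells are overwritten on a column match. */
    void put(String rowKey, Map<String, String> columns, long timestamp) {
        Map<String, Cell> row = rows.computeIfAbsent(rowKey, k -> new TreeMap<>());
        for (Map.Entry<String, String> e : columns.entrySet()) {
            if (e.getValue() == null) {
                continue; // behaviour described above: a NULL value is not written at all
            }
            byte[] bytes = e.getValue().getBytes(StandardCharsets.UTF_8);
            row.put(e.getKey(), new Cell(bytes, timestamp));
        }
    }

    /** Returns the latest cell of a column, or null if the row or column does not exist. */
    Cell get(String rowKey, String column) {
        Map<String, Cell> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    public static void main(String[] args) {
        StaleCellSketch table = new StaleCellSketch();

        Map<String, String> insert = new LinkedHashMap<>();
        insert.put("address:street", "Main Street");
        insert.put("address:housnr", "1");
        table.put("row1", insert, 100L);

        Map<String, String> update = new LinkedHashMap<>();
        update.put("address:street", "Side Street");
        update.put("address:housnr", null); // the caller wants to clear this cell
        table.put("row1", update, 200L);

        Cell street = table.get("row1", "address:street");
        Cell housnr = table.get("row1", "address:housnr");
        // prints "Side Street @200"
        System.out.println(new String(street.value, StandardCharsets.UTF_8) + " @" + street.timestamp);
        // prints "1 @100": the old value survives with the older timestamp
        System.out.println(new String(housnr.value, StandardCharsets.UTF_8) + " @" + housnr.timestamp);
    }
}
```

After the second put, the row mixes a cell written at timestamp 200 with a cell still carrying timestamp 100, which is the inconsistency a reader of the table would see.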

I see 4 possible options:
1. We could delete the cell from HBase to match how inserting a row works. However, the qualifier then no longer exists either, and you probably can't go back to an older version of the value if the columnFamily uses versions.
2. We could also set the value to an empty byte array.
3. We could also change the Read functionality, to only show rows with the latest timestamp.
4. We keep it this way.
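
The trade-offs between these options can be sketched with a small in-memory model of a single row with versioned cells. This is an illustration under stated assumptions, not the HBase client API or a proposed MetaModel implementation: NullPolicySketch, the NullPolicy enum and all method names are hypothetical, deleting a cell is modelled as dropping every version of it, and readLatestOnly is only one possible reading of option 3.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** In-memory model of a single row with versioned cells; NOT the HBase client API. */
class NullPolicySketch {

    /** Hypothetical names for the write-side options 1, 2 and 4. */
    enum NullPolicy { DELETE_CELL, EMPTY_VALUE, SKIP }

    // "family:qualifier" -> (timestamp -> value), newest version last
    private final Map<String, TreeMap<Long, byte[]>> cells = new HashMap<>();
    private final NullPolicy policy;

    NullPolicySketch(NullPolicy policy) {
        this.policy = policy;
    }

    void put(String column, String value, long timestamp) {
        if (value != null) {
            write(column, value.getBytes(StandardCharsets.UTF_8), timestamp);
            return;
        }
        switch (policy) {
            case DELETE_CELL:
                cells.remove(column); // option 1: in this model all versions are gone
                break;
            case EMPTY_VALUE:
                write(column, new byte[0], timestamp); // option 2: older versions survive
                break;
            case SKIP:
                break; // option 4: old value and old timestamp stay as they are
        }
    }

    private void write(String column, byte[] value, long timestamp) {
        cells.computeIfAbsent(column, k -> new TreeMap<>()).put(timestamp, value);
    }

    /** Latest value of a column, or null if the column has no cell. */
    byte[] latest(String column) {
        TreeMap<Long, byte[]> versions = cells.get(column);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    /** Number of stored versions of a column (0 if the cell does not exist). */
    int versionCount(String column) {
        TreeMap<Long, byte[]> versions = cells.get(column);
        return versions == null ? 0 : versions.size();
    }

    /** Option 3: a read that hides cells not written by the row's most recent update. */
    Map<String, byte[]> readLatestOnly() {
        long newest = Long.MIN_VALUE;
        for (TreeMap<Long, byte[]> versions : cells.values()) {
            newest = Math.max(newest, versions.lastKey());
        }
        Map<String, byte[]> result = new TreeMap<>();
        for (Map.Entry<String, TreeMap<Long, byte[]>> e : cells.entrySet()) {
            if (e.getValue().lastKey() == newest) {
                result.put(e.getKey(), e.getValue().lastEntry().getValue());
            }
        }
        return result;
    }
}
```

In this model, DELETE_CELL leaves versionCount at 0 after a NULL update, so no older value can be recovered. EMPTY_VALUE keeps the history but requires readers to treat a zero-length value as NULL. SKIP combined with readLatestOnly hides the stale cell without touching the stored data.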

What's the best solution here?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)