Posted to dev@gora.apache.org by "Alfonso Nishikawa (Jira)" <ji...@apache.org> on 2022/02/09 21:06:00 UTC

[jira] [Assigned] (GORA-391) Arrays persisted in HBase don't shrink automatically

     [ https://issues.apache.org/jira/browse/GORA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alfonso Nishikawa reassigned GORA-391:
--------------------------------------

    Assignee:     (was: Alfonso Nishikawa)

> Arrays persisted in HBase don't shrink automatically
> ----------------------------------------------------
>
>                 Key: GORA-391
>                 URL: https://issues.apache.org/jira/browse/GORA-391
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: gora-hbase
>    Affects Versions: 0.4, 0.5
>            Reporter: Alfonso Nishikawa
>            Priority: Minor
>              Labels: arrays, maps
>             Fix For: 1.0
>
>
> Fields defined as arrays can grow and be updated, but don't shrink when an element is deleted.
> See the code involved: [https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L312]
> The workaround is:
> # Define the field as a nullable array: ['null', ...array...]
> # Set the field to null and persist -> the array will be deleted
> # Set the field to the new array and persist -> the array will be persisted with the new size
> Comment from Renato:
> bq.You are right, the array cannot be shrunk at the moment, and yes, it is wrong to have to write the whole array back if you just want to change a single element. The column qualifier used for each item is its original index, which means that if your original array had 10 elements, you'd have 10 column qualifiers to store those 10 items. But if you then delete the third element, Gora will end up with 9 actual elements (without the third), but there will still be a 10th element inside HBase :( and when modifying a specific element, we will end up rewriting all of the elements :( Maybe we should do the same thing we do with maps and rewrite them all into HBase. At least it will work correctly.
> Maybe the best solution would be adaptive persistence: if a large percentage of the field has to be persisted, overwrite everything. If only a small percentage has to be persisted, update incrementally as a diff (additions, deletions, updates). This approach seems too complex, so the solution to implement is the one used for maps: delete all elements and write them again.
> {panel:bgColor=#FFFFCE} (!) This will perform badly for arrays with large elements when only one element is updated, but it is the same as what is being done now. Same for maps. {panel}
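
The stale-qualifier behavior described in Renato's comment, and the delete-and-rewrite fix chosen above, can be illustrated with a small in-memory sketch. This is not Gora or HBase code: the class name, the method names, and the TreeMap standing in for one row's column qualifiers are all hypothetical, and the sketch assumes only what the issue states, namely that each array element is stored under a qualifier equal to its index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ArrayShrinkSketch {

    // Hypothetical stand-in for one HBase row's column family:
    // column qualifier (the array index) -> stored element.
    public static Map<Integer, String> newRow() {
        return new TreeMap<>();
    }

    // Models the behavior reported in the issue: every element is written
    // under its index, and qualifiers past the new length are never deleted.
    public static void putByIndex(Map<Integer, String> row, List<String> array) {
        for (int i = 0; i < array.size(); i++) {
            row.put(i, array.get(i));
        }
    }

    // Models the fix chosen in the issue (the approach described for maps):
    // delete all stored elements, then write the whole array again.
    public static void deleteAllThenPut(Map<Integer, String> row, List<String> array) {
        row.clear();
        putByIndex(row, array);
    }

    public static void main(String[] args) {
        Map<Integer, String> row = newRow();
        putByIndex(row, Arrays.asList("a", "b", "c", "d"));

        // "c" was removed in memory; persisting by index leaves qualifier 3 behind.
        putByIndex(row, Arrays.asList("a", "b", "d"));
        System.out.println(row); // {0=a, 1=b, 2=d, 3=d}

        // Deleting everything first makes the stored array match the in-memory one.
        deleteAllThenPut(row, Arrays.asList("a", "b", "d"));
        System.out.println(row); // {0=a, 1=b, 2=d}
    }
}
```

The second call rewrites every element even though only one was removed, which is the cost the panel above warns about.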



--
This message was sent by Atlassian Jira
(v8.20.1#820001)