Posted to issues@hawq.apache.org by "Michael Andre Pearce (IG) (JIRA)" <ji...@apache.org> on 2016/06/10 23:52:21 UTC
[jira] [Comment Edited] (HAWQ-304) Support update and delete on non-heap tables

    [ https://issues.apache.org/jira/browse/HAWQ-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325523#comment-15325523 ] 

Michael Andre Pearce (IG) edited comment on HAWQ-304 at 6/10/16 11:51 PM:
--------------------------------------------------------------------------

Hi Guys, 

Whilst we wait for this feature, we've made a very rudimentary workaround that simulates a mutable table (update and delete).

I've attached a sample so others can reuse it if they wish.

To simulate UPDATE we simply append a new row, with a bigserial column, update_version, that increases with each new version of a row key. At query time we select the latest version for each key.

To simulate DELETE we have a field that marks whether the latest update_version for a key is actually a delete. At query time we exclude deleted rows from the select.

For cleanliness we wrap the above in a view, which hides the extra columns and keeps most of this logic out of normal SELECT queries.
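
A minimal sketch of the append-and-view idea described above, using Python's built-in sqlite3 as a stand-in for HAWQ so that it can be run anywhere. It is not the attached mutable_table.sql; the names customer_log, customer, is_deleted, upsert and delete are hypothetical, and an AUTOINCREMENT integer plays the role of the bigserial.

```python
import sqlite3

# SQLite stands in for HAWQ here; every table, column and view name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_log (
    update_version INTEGER PRIMARY KEY AUTOINCREMENT,  -- plays the role of the bigserial
    id             INTEGER NOT NULL,                   -- the row key
    name           TEXT,
    is_deleted     INTEGER NOT NULL DEFAULT 0          -- 1 marks a simulated DELETE
);

-- Latest version per key, with deleted keys filtered out and the
-- bookkeeping columns hidden from callers.
CREATE VIEW customer AS
SELECT l.id, l.name
FROM customer_log AS l
JOIN (SELECT id, MAX(update_version) AS latest
      FROM customer_log
      GROUP BY id) AS m
  ON l.id = m.id AND l.update_version = m.latest
WHERE l.is_deleted = 0;
""")


def upsert(key, name):
    """Simulated INSERT/UPDATE: append a new version of the row."""
    conn.execute("INSERT INTO customer_log (id, name) VALUES (?, ?)", (key, name))


def delete(key):
    """Simulated DELETE: append a version that is flagged as deleted."""
    conn.execute("INSERT INTO customer_log (id, is_deleted) VALUES (?, 1)", (key,))


upsert(1, "alice")
upsert(2, "bob")
upsert(1, "alice v2")   # "update" of key 1
delete(2)               # "delete" of key 2

rows = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
print(rows)  # [(1, 'alice v2')]
```

The view here joins against a MAX(update_version) per key; whether that or some other formulation performs best on HAWQ is not something this sketch tries to answer.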

We also have a compaction phase that we run separately: a CTAS which re-creates the table using a SELECT on the view, essentially keeping only the latest version of each row and discarding the earlier versions and the accumulated change depth.
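
The compaction step can be illustrated in the same hedged way, again with sqlite3 standing in for HAWQ and with hypothetical names (events_log, events, events_compacted). The sketch only shows the CTAS from the view and the resulting row counts; swapping the compacted table in for the original, and restoring the versioning columns, is left out because the source does not say how that is done.

```python
import sqlite3

# SQLite stands in for HAWQ; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events_log (
    update_version INTEGER PRIMARY KEY AUTOINCREMENT,
    id             INTEGER NOT NULL,
    payload        TEXT,
    is_deleted     INTEGER NOT NULL DEFAULT 0
);

-- View exposing only the latest, non-deleted version of each key.
CREATE VIEW events AS
SELECT l.id, l.payload
FROM events_log AS l
JOIN (SELECT id, MAX(update_version) AS latest
      FROM events_log
      GROUP BY id) AS m
  ON l.id = m.id AND l.update_version = m.latest
WHERE l.is_deleted = 0;

-- Three keys, two later updates to key 1, then a delete of key 2.
INSERT INTO events_log (id, payload)
VALUES (1, 'a'), (2, 'b'), (3, 'c'), (1, 'a2'), (1, 'a3');
INSERT INTO events_log (id, is_deleted) VALUES (2, 1);
""")

before = conn.execute("SELECT COUNT(*) FROM events_log").fetchone()[0]

# Compaction: CREATE TABLE AS SELECT from the view, so only the
# current state of each live key is carried over.
conn.execute("CREATE TABLE events_compacted AS SELECT id, payload FROM events")

after = conn.execute("SELECT COUNT(*) FROM events_compacted").fetchone()[0]
compacted = conn.execute(
    "SELECT id, payload FROM events_compacted ORDER BY id"
).fetchall()
print(before, after, compacted)  # 6 2 [(1, 'a3'), (3, 'c')]
```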

This seems to work relatively well, and updates/deletes are quite fast too.

The issue we have is that there is no way to lock during compaction, so if a query occurs while it runs we can get undesired effects. We manage this by running it at a quiet period. Any ideas how we could make this safer, and/or faster and less intensive?

Also, we wonder if there is a more efficient way of doing the select in the view to get the latest row version for a key?

Cheers 
Mike









> Support update and delete on non-heap tables
> --------------------------------------------
>
>                 Key: HAWQ-304
>                 URL: https://issues.apache.org/jira/browse/HAWQ-304
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: Storage
>            Reporter: Lei Chang
>             Fix For: 3.0.0
>
>         Attachments: mutable_table.sql
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)