Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2014/02/11 04:06:19 UTC
[jira] [Commented] (PHOENIX-6) Support on duplicate key ignore construct

    [ https://issues.apache.org/jira/browse/PHOENIX-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897489#comment-13897489 ] 

James Taylor commented on PHOENIX-6:
------------------------------------

This could be done in two stages: first for UPSERT VALUES and next for UPSERT SELECT. Here's one way this could be approached:
- add an ON DUPLICATE KEY IGNORE clause to UPSERT in the SQL grammar.
- pass this through the UpsertStatement as a new ignoreDuplicateKeys boolean
- modify UpsertCompiler to pass this boolean into MutationState
- modify MutationState to create a different operation than Put. Unfortunately checkAndPut is not batch-able, so you'd need either to create a new class that implements org.apache.hadoop.hbase.client.Row or to "borrow" the Append operation (as Phoenix doesn't support this operation). The former would be better (and [~lhofhansl] mentioned to me before that this would not be difficult).
- modify Indexer.preBatchMutate to look for instances of your new class - you'd want to collect all of these up and turn them into region.checkAndPut operations instead. You could test for the existence of our empty key value (the column family depends on the table and comes from the SchemaUtil.getEmptyColumnFamily(ptable) method; the column qualifier is QueryConstants.EMPTY_COLUMN_BYTES). I'm not sure whether the regular Put coprocessor hooks fire when the check in a checkAndPut passes ([~jesse_yates] might know), but if they do, that'd be good, because then the index maintenance code would kick in, which is what you want. If not, you'll need to get the row lock yourself, do the region.get() to see if the row exists, and then do the region.put() if it doesn't (see SequenceRegionObserver for an example).
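
The fallback in the last step (take the row lock, get, then put only if the row is absent) can be sketched roughly as below. This is a minimal in-memory simulation, not HBase or Phoenix code: the class IgnoreDuplicateSketch, the Mutation type, and applyBatch are all hypothetical names invented for illustration, and a plain map stands in for the region. It only shows the intended semantics of the proposed ignoreDuplicateKeys flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** In-memory stand-in for a region. Nothing here is an HBase or Phoenix API. */
public class IgnoreDuplicateSketch {

    /** Hypothetical mutation carrying the proposed ignoreDuplicateKeys flag. */
    static final class Mutation {
        final String rowKey;
        final String value;
        final boolean ignoreDuplicateKeys;

        Mutation(String rowKey, String value, boolean ignoreDuplicateKeys) {
            this.rowKey = rowKey;
            this.value = value;
            this.ignoreDuplicateKeys = ignoreDuplicateKeys;
        }
    }

    private final Map<String, String> rows = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    /** Applies a batch and returns the row keys that were actually written. */
    List<String> applyBatch(List<Mutation> batch) {
        List<String> written = new ArrayList<>();
        for (Mutation m : batch) {
            // Take the per-row lock so the existence check and the write are atomic.
            ReentrantLock lock = rowLocks.computeIfAbsent(m.rowKey, k -> new ReentrantLock());
            lock.lock();
            try {
                // The "get" step: skip the write if the row already exists.
                if (m.ignoreDuplicateKeys && rows.containsKey(m.rowKey)) {
                    continue;
                }
                // The "put" step: ordinary upsert, or a first-time insert.
                rows.put(m.rowKey, m.value);
                written.add(m.rowKey);
            } finally {
                lock.unlock();
            }
        }
        return written;
    }

    String get(String rowKey) {
        return rows.get(rowKey);
    }

    public static void main(String[] args) {
        IgnoreDuplicateSketch region = new IgnoreDuplicateSketch();
        region.applyBatch(Arrays.asList(new Mutation("k1", "first", false)));
        List<String> written = region.applyBatch(Arrays.asList(
                new Mutation("k1", "second", true),   // duplicate key: ignored
                new Mutation("k2", "new", true)));    // new key: written
        System.out.println(written + " " + region.get("k1"));
    }
}
```

In the real coprocessor the existence check would be against the empty key value described above rather than a map lookup, and whether index maintenance runs would depend on which region call performs the write.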

> Support on duplicate key ignore construct
> -----------------------------------------
>
>                 Key: PHOENIX-6
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: James Taylor
>
> To support inserting a new row only if it doesn't already exist, we should support the "on duplicate key ignore" construct (or its SQL standard equivalent) for UPSERT.
> See this discussion for more detail: https://groups.google.com/d/msg/phoenix-hbase-user/Bof-TLrbTGg/68bnc8ZcWe0J


