Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2015/12/08 07:09:10 UTC
[jira] [Commented] (PHOENIX-2415) Support ROW_TIMESTAMP with transactional tables

    [ https://issues.apache.org/jira/browse/PHOENIX-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046432#comment-15046432 ] 

James Taylor commented on PHOENIX-2415:
---------------------------------------

I think it may not make sense to checkpoint for each statement, since the cell timestamps would then differ depending on whether ROW_TIMESTAMP is used. Instead, I think the following will be required:
- make sure that the value for the row key exactly matches the cell timestamp (which may already be the case).
- raise an error if an attempt is made to manually provide a value for the ROW_TIMESTAMP column in an UPSERT statement.
- ensure that date arithmetic takes into account that a ROW_TIMESTAMP column value is in nanoseconds instead of milliseconds. I think if we take any reference to the ROW_TIMESTAMP column and resolve it to {{column/1000000}} in ExpressionCompiler.resolveColumn(), that might do the trick.
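
The second and third points above could be sketched roughly as follows. This is only an illustration: the class and method names are hypothetical and do not exist in Phoenix, and the scale factor is simply the 1000000 mentioned above.

```java
import java.util.List;

/** Hypothetical helper for illustration only; none of these names are Phoenix APIs. */
final class RowTimestampSketch {
    // Scale factor assumed from the comment above (column/1000000).
    static final long TX_UNITS_PER_MILLI = 1_000_000L;

    /** Converts a transactional cell timestamp to epoch milliseconds for date arithmetic. */
    static long toMillis(long txTimestamp) {
        return txTimestamp / TX_UNITS_PER_MILLI;
    }

    /**
     * Rejects an UPSERT column list that names the ROW_TIMESTAMP column
     * when the target table is transactional.
     */
    static void checkUpsertColumns(boolean transactional,
                                   String rowTimestampColumn,
                                   List<String> upsertColumns) {
        if (transactional && upsertColumns.contains(rowTimestampColumn)) {
            throw new IllegalArgumentException(
                "Cannot set ROW_TIMESTAMP column " + rowTimestampColumn
                    + " manually on a transactional table");
        }
    }
}
```

In the real change the division would presumably be applied where the column reference is resolved, so that every expression over the column sees milliseconds.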

> Support ROW_TIMESTAMP with transactional tables
> -----------------------------------------------
>
>                 Key: PHOENIX-2415
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2415
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Thomas D'Silva
>
> Currently transactional tables don't use the ROW_TIMESTAMP optimization, based on this code in BaseQueryPlan:
> {code}
> if (!table.isTransactional()) {
>     // Get the time range of row_timestamp column
>     TimeRange rowTimestampRange = context.getScanRanges().getRowTimestampRange();
>     // Get the already existing time range on the scan.
>     TimeRange scanTimeRange = scan.getTimeRange();
>     Long scn = connection.getSCN();
>     if (scn == null) {
>         scn = context.getCurrentTime();
>     }
>     try {
>         TimeRange timeRangeToUse = ScanUtil.intersectTimeRange(rowTimestampRange, scanTimeRange, scn);
>         if (timeRangeToUse == null) {
>             return ResultIterator.EMPTY_ITERATOR;
>         }
>         scan.setTimeRange(timeRangeToUse.getMin(), timeRangeToUse.getMax());
>     } catch (IOException e) {
>         throw new RuntimeException(e);
>     }
> }
> {code}
> Instead, we should allow the optimization but disallow manually setting the ROW_TIMESTAMP column in UPSERT statements on transactional tables. We can use the write pointer of the transaction as the current time, and do a checkpoint prior to executing a statement on a table with ROW_TIMESTAMP so that time moves forward when a new statement is executed. In the code above, we'd just set the scn to HConstants.LATEST_TIMESTAMP for transactional tables. It would also be good to have a test to make sure Tephra doesn't reset the min time range on the scan.
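
To illustrate the effect of using HConstants.LATEST_TIMESTAMP (which is Long.MAX_VALUE) as the scn, here is a minimal sketch of a time range intersection. It is not the actual ScanUtil.intersectTimeRange implementation; the class name, the signature, and the assumption that ranges are half-open [min, max) and capped at an upper bound are all hypothetical.

```java
/** Hypothetical stand-in for illustration; not Phoenix's ScanUtil. */
final class TimeRangeSketch {
    /**
     * Intersects two half-open ranges [minA, maxA) and [minB, maxB) and caps
     * the result at upperBound. Returns {min, max}, or null if nothing is left.
     */
    static long[] intersect(long minA, long maxA,
                            long minB, long maxB,
                            long upperBound) {
        long min = Math.max(minA, minB);
        long max = Math.min(Math.min(maxA, maxB), upperBound);
        // An empty intersection corresponds to returning an empty iterator.
        return min < max ? new long[] {min, max} : null;
    }
}
```

Under these assumptions, passing Long.MAX_VALUE as the upper bound makes the cap a no-op, so only the row timestamp range and the existing scan range determine the result.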


