Posted to dev@phoenix.apache.org by "Nick Dimiduk (JIRA)" <ji...@apache.org> on 2016/01/12 20:05:39 UTC

[jira] [Commented] (PHOENIX-2067) Sort order incorrect for variable length DESC columns

    [ https://issues.apache.org/jira/browse/PHOENIX-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094529#comment-15094529 ] 

Nick Dimiduk commented on PHOENIX-2067:
---------------------------------------

I think this patch broke PK constraints for NOT NULL, non-fixed-width VARCHAR columns.

{noformat}
            // If some non null pk values aren't set, then throw
            if (i < nColumns) {
                PColumn column = columns.get(i);
                if (column.getDataType().isFixedWidth() || !column.isNullable()) {
                    throw new ConstraintViolationException(name.getString() + "." + column.getName().getString() + " may not be null");
                }
            }
{noformat}

The constraint we're managing is that of non-null for PK columns. Fixed-width types meet this criterion implicitly because there's no available representation for NULL. I assume variable-length columns such as VARCHAR can represent null (though I don't know the encoding details off the top of my head), so for those we must look for the {{NOT NULL}} constraint added in the schema.

I think the if condition should be
{noformat}
if (!column.getDataType().isFixedWidth() || column.isNullable()) { throw... }
{noformat}
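
To make the two conditions easier to compare, here is a rough sketch that tabulates both for every combination of fixed-width and nullable. Only the two boolean expressions come from the snippets above; the class and method names are made up for illustration, and nothing here calls Phoenix code.

```java
// Sketch only: plain booleans stand in for column.getDataType().isFixedWidth()
// and column.isNullable(); the names below are hypothetical, not Phoenix APIs.
class PkNullCheckSketch {

    // Condition as quoted from the patch: throw for an unset PK column
    // that is fixed-width or not nullable.
    static boolean currentCheckThrows(boolean fixedWidth, boolean nullable) {
        return fixedWidth || !nullable;
    }

    // Condition as proposed in this comment.
    static boolean proposedCheckThrows(boolean fixedWidth, boolean nullable) {
        return !fixedWidth || nullable;
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        System.out.println("fixedWidth nullable current proposed");
        for (boolean fixedWidth : values) {
            for (boolean nullable : values) {
                System.out.printf("%-10b %-8b %-7b %b%n",
                        fixedWidth, nullable,
                        currentCheckThrows(fixedWidth, nullable),
                        proposedCheckThrows(fixedWidth, nullable));
            }
        }
    }
}
```

Running {{main}} prints one row per combination, which may help when deciding which cases each form of the check is meant to reject.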

Also, can this be folded into a single {{column.isNullable()}} -- shouldn't that method check the datatype on the caller's behalf? Or is there a scenario where we want to know what additional constraints the schema defined vs. what the datatype offers?

> Sort order incorrect for variable length DESC columns
> -----------------------------------------------------
>
>                 Key: PHOENIX-2067
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2067
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.4.0
>         Environment: HBase 0.98.6-cdh5.3.0
> jdk1.7.0_67 x64
> CentOS release 6.4 (2.6.32-358.el6.x86_64)
>            Reporter: Mykola Komarnytskyy
>            Assignee: James Taylor
>             Fix For: 4.5.0
>
>         Attachments: PHOENIX-2067-tests.patch, PHOENIX-2067-tests2.patch, PHOENIX-2067_4.4.1.patch, PHOENIX-2067_addendum.patch, PHOENIX-2067_array_addendum.patch, PHOENIX-2067_array_addendum_v2.patch, PHOENIX-2067_v1.patch, PHOENIX-2067_v2.patch, PHOENIX-2067_v3.patch
>
>
> Steps to reproduce:
> 1. Create a table: 
> CREATE TABLE mytable (id BIGINT not null PRIMARY KEY, timestamp BIGINT, log_message varchar) IMMUTABLE_ROWS=true, SALT_BUCKETS=16;
> 2. Create two indexes:
> CREATE INDEX mytable_index_search ON mytable(timestamp,id) INCLUDE (log_message) SALT_BUCKETS=16;
> CREATE INDEX mytable_index_search_desc ON mytable(timestamp DESC,id DESC) INCLUDE (log_message) SALT_BUCKETS=16;
> 3. Upsert values:
> UPSERT INTO mytable VALUES(1, 1434983826018, 'message1');
> UPSERT INTO mytable VALUES(2, 1434983826100, 'message2');
> UPSERT INTO mytable VALUES(3, 1434983826101, 'message3');
> UPSERT INTO mytable VALUES(4, 1434983826202, 'message4');
> 4. Sort DESC by timestamp:
> select timestamp,id,log_message from mytable ORDER BY timestamp DESC;
> Failure: data is sorted incorrectly. When two longs differ only in their last two digits (e.g. 1434983826155, 1434983826100) and one of them ends with '00', we receive an incorrect order.
> Sorting result:
> 1434983826202
> 1434983826100
> 1434983826101
> 1434983826018
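
For reference, the ordering that the query in step 4 ought to return can be checked outside Phoenix with a plain in-memory sort of the four upserted timestamps. This is a minimal sketch with hypothetical class and method names; it does not exercise Phoenix's key encoding or the indexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: sorts the reported timestamps in memory to show the
// descending order a correct ORDER BY timestamp DESC would produce.
class ExpectedDescOrderSketch {

    // Returns a new list sorted from largest to smallest; the input is not modified.
    static List<Long> sortDescending(List<Long> timestamps) {
        List<Long> sorted = new ArrayList<>(timestamps);
        sorted.sort(Collections.reverseOrder());
        return sorted;
    }

    public static void main(String[] args) {
        // Timestamps from the UPSERT statements in step 3.
        List<Long> timestamps = Arrays.asList(
                1434983826018L, 1434983826100L, 1434983826101L, 1434983826202L);
        for (Long t : sortDescending(timestamps)) {
            System.out.println(t);
        }
    }
}
```

With these inputs, 1434983826101 sorts ahead of 1434983826100, which is the opposite of the reported result.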


