You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@phoenix.apache.org by "Squires, Tom (ELS-LON)" <to...@elsevier.com> on 2016/08/23 17:52:26 UTC

Can't find some of our Phoenix primary keys in our HBase table when looking for row keys

Hi there,

If we scan our HBase table via hbase shell - limiting to just one result - we get:

hbase(main):003:0* scan 'SCHEMA.DOCUMENTS', { LIMIT => 1 }
ROW                                                           COLUMN+CELL
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ADDED, timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:CONFIRMED, timestamp=1471470328985, value=\x81
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:DELETIONPENDING, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ID, timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:IMPORTER_ID, timestamp=1471470328985, value=\x80\x00\x00\x18
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ISAUTHOR, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ISREAD, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ISSTARRED, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:MODIFIED, timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:MONTH, timestamp=1471470328985, value=\x80\x00\x00\x09
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ONLYREFERENCE, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:PROFILE_ID, timestamp=1471470328985, value=\xC4`\x09NH
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:PUBLISHED_IN, timestamp=1471470328985, value=A journal.
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:TITLE, timestamp=1471470328985, value=A title.
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:TYPE_ID, timestamp=1471470328985, value=\x80\x00\x00\x01
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:UNIQUE_ID, timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:_0, timestamp=1471470328985, value=x
1 row(s) in 0.2570 seconds

We're able to get this row key from hbase shell directly:

hbase(main):006:0> get 'SCHEMA.DOCUMENTS', "\x00000000cd-ba0c-39cb-9490-39ca6448d6e4"
COLUMN                                                        CELL
0:ADDED                                                       timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
0:CONFIRMED                                                   timestamp=1471470328985, value=\x81
0:DELETIONPENDING                                             timestamp=1471470328985, value=\x80
0:ID                                                          timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
0:IMPORTER_ID                                                 timestamp=1471470328985, value=\x80\x00\x00\x18
0:ISAUTHOR                                                    timestamp=1471470328985, value=\x80
0:ISREAD                                                      timestamp=1471470328985, value=\x80
0:ISSTARRED                                                   timestamp=1471470328985, value=\x80
0:MODIFIED                                                    timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
0:MONTH                                                       timestamp=1471470328985, value=\x80\x00\x00\x09
0:ONLYREFERENCE                                               timestamp=1471470328985, value=\x80
0:PROFILE_ID                                                  timestamp=1471470328985, value=\xC4`\x09NH
0:PUBLISHED_IN                                                timestamp=1471470328985, value=A journal.
0:TITLE                                                       timestamp=1471470328985, value=A title.
0:TYPE_ID                                                     timestamp=1471470328985, value=\x80\x00\x00\x01
0:UNIQUE_ID                                                   timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
0:YEAR                                                        timestamp=1471470328985, value=\x80\x00\x07\xDD
0:_0                                                          timestamp=1471470328985, value=x
18 row(s) in 0.0750 seconds

We are also able to look for this row key in Phoenix via a primary key lookup (same HBase cluster), and we find it successfully:

0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS where uuid = '000000cd-ba0c-39cb-9490-39ca6448d6e4';
+---------------------------------------+
|                 UUID                  |
+---------------------------------------+
| 000000cd-ba0c-39cb-9490-39ca6448d6e4  |
+---------------------------------------+
1 row selected (0.085 seconds)

However, if we pick a random row from Phoenix:

0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS limit 1;
+---------------------------------------+
|                 UUID                  |
+---------------------------------------+
| 0061b1a3-4e08-3b24-9fe3-a5d8c2e35455  |
+---------------------------------------+
1 row selected (0.064 seconds)

And look it up in HBase:

hbase(main):005:0> get 'SCHEMA.DOCUMENTS', "\x000061b1a3-4e08-3b24-9fe3-a5d8c2e35455"
COLUMN                                                         CELL
0 row(s) in 0.0040 seconds

We don't get anything back. We are connected to the HBase master for all of the above queries. Are we doing something silly?  Is there something we're missing?

Thanks,
Tom



________________________________

Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales.

Re: Can't find some of our Phoenix primary keys in our HBase table when looking for row keys

Posted by "Squires, Tom (ELS-LON)" <to...@elsevier.com>.
Thanks Gabriel, you're right: we are using 10 salt buckets in our table definition. If I scan the HBase table in reverse, I see that our row keys are prefixed with \x09:


hbase(main):015:0> scan 'SCHEMA.DOCUMENTS', { LIMIT => 1, REVERSED => TRUE }
ROW                                                            COLUMN+CELL
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:ADDED, timestamp=1471477212093, value=\x80\x00\x01L\xD1\x88\x9E\xF0
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:CONFIRMED, timestamp=1471477212093, value=\x81
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:DELETIONPENDING, timestamp=1471477212093, value=\x81
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:ID, timestamp=1471477212093, value=\x80\x00\x00\x01\xB6y\x97e
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:ISAUTHOR, timestamp=1471477212093, value=\x80
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:ISREAD, timestamp=1471477212093, value=\x80
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:ISSTARRED, timestamp=1471477212093, value=\x80
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:MODIFIED, timestamp=1471477212093, value=\x80\x00\x01L\xD2\xA1ux\x00\x00\x00\x00
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:ONLYREFERENCE, timestamp=1471477212093, value=\x80
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:PROFILE_ID, timestamp=1471477212093, value=\xC4\x1624
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:TITLE, timestamp=1471477212093, value=Another title.
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:TYPE_ID, timestamp=1471477212093, value=\x80\x00\x00\x01
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:UNIQUE_ID, timestamp=1471477212093, value=0ba2e206-6705-439a-947e-064ea670a214
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:YEAR, timestamp=1471477212093, value=\x80\x00\x07\xDF
 \x09fffffeb0-1298-3fab-a60a-a2895dd590d2                      column=0:_0, timestamp=1471477212093, value=x
1 row(s) in 0.0330 seconds


Hopefully this thread helps others in future.


Regards,

Tom

________________________________
From: Gabriel Reid <ga...@gmail.com>
Sent: 23 August 2016 20:02
To: user@phoenix.apache.org
Subject: Re: Can't find some of our Phoenix primary keys in our HBase table when looking for row keys

Hi Tom,

What's the primary key definition of your table? Does it have salted row keys?

In the first example (the one that works) I see a leading byte on the
row key, which makes me think that you're using salting. In the second
example (the one that isn't working) I see the leading "\x00" being
added, but my guess (certainly if you're using salted row keys) is
that this isn't the correct salt byte for that row key.

If you are using salted row keys (or a composite primary key), then
the first byte of the row key that you use to look up the record in
HBase will have to be correctly filled in to take this into account.

- Gabriel

On Tue, Aug 23, 2016 at 7:52 PM, Squires, Tom (ELS-LON)
<to...@elsevier.com> wrote:
> Hi there,
>
> If we scan our HBase table via hbase shell - limiting to just one result -
> we get:
>
> hbase(main):003:0* scan 'SCHEMA.DOCUMENTS', { LIMIT => 1 }
> ROW                                                           COLUMN+CELL
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ADDED, timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:CONFIRMED, timestamp=1471470328985, value=\x81
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:DELETIONPENDING, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ID,
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:IMPORTER_ID, timestamp=1471470328985, value=\x80\x00\x00\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISAUTHOR, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISREAD, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISSTARRED, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MODIFIED, timestamp=1471470328985,
> value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MONTH, timestamp=1471470328985, value=\x80\x00\x00\x09
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ONLYREFERENCE, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PROFILE_ID, timestamp=1471470328985, value=\xC4`\x09NH
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PUBLISHED_IN, timestamp=1471470328985, value=A journal.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TITLE, timestamp=1471470328985, value=A title.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TYPE_ID, timestamp=1471470328985, value=\x80\x00\x00\x01
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:UNIQUE_ID, timestamp=1471470328985,
> value=fdc82767-106f-47bc-91b5-0df5d90fba72
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:_0,
> timestamp=1471470328985, value=x
> 1 row(s) in 0.2570 seconds
>
> We're able to get this row key from hbase shell directly:
>
> hbase(main):006:0> get 'SCHEMA.DOCUMENTS',
> "\x00000000cd-ba0c-39cb-9490-39ca6448d6e4"
> COLUMN                                                        CELL
> 0:ADDED
> timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> 0:CONFIRMED
> timestamp=1471470328985, value=\x81
> 0:DELETIONPENDING
> timestamp=1471470328985, value=\x80
> 0:ID
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> 0:IMPORTER_ID
> timestamp=1471470328985, value=\x80\x00\x00\x18
> 0:ISAUTHOR
> timestamp=1471470328985, value=\x80
> 0:ISREAD
> timestamp=1471470328985, value=\x80
> 0:ISSTARRED
> timestamp=1471470328985, value=\x80
> 0:MODIFIED
> timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> 0:MONTH
> timestamp=1471470328985, value=\x80\x00\x00\x09
> 0:ONLYREFERENCE
> timestamp=1471470328985, value=\x80
> 0:PROFILE_ID
> timestamp=1471470328985, value=\xC4`\x09NH
> 0:PUBLISHED_IN
> timestamp=1471470328985, value=A journal.
> 0:TITLE
> timestamp=1471470328985, value=A title.
> 0:TYPE_ID
> timestamp=1471470328985, value=\x80\x00\x00\x01
> 0:UNIQUE_ID
> timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
> 0:YEAR
> timestamp=1471470328985, value=\x80\x00\x07\xDD
> 0:_0
> timestamp=1471470328985, value=x
> 18 row(s) in 0.0750 seconds
>
> We are also able to look for this row key in Phoenix via a primary key
> lookup (same HBase cluster), and we find it successfully:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS where uuid
> = '000000cd-ba0c-39cb-9490-39ca6448d6e4';
> +---------------------------------------+
> |                 UUID                  |
> +---------------------------------------+
> | 000000cd-ba0c-39cb-9490-39ca6448d6e4  |
> +---------------------------------------+
> 1 row selected (0.085 seconds)
>
> However, if we pick a random row from Phoenix:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS limit 1;
> +---------------------------------------+
> |                 UUID                  |
> +---------------------------------------+
> | 0061b1a3-4e08-3b24-9fe3-a5d8c2e35455  |
> +---------------------------------------+
> 1 row selected (0.064 seconds)
>
> And look it up in HBase:
>
> hbase(main):005:0> get 'SCHEMA.DOCUMENTS',
> "\x000061b1a3-4e08-3b24-9fe3-a5d8c2e35455"
> COLUMN                                                         CELL
> 0 row(s) in 0.0040 seconds
>
> We don't get anything back. We are connected to the HBase master for all of
> the above queries. Are we doing something silly?  Is there something we're
> missing?
>
> Thanks,
> Tom
>
>
>
> ________________________________
>
> Elsevier Limited. Registered Office: The Boulevard, Langford Lane,
> Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084,
> Registered in England and Wales.

________________________________

Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales.

Re: Can't find some of our Phoenix primary keys in our HBase table when looking for row keys

Posted by Gabriel Reid <ga...@gmail.com>.
Hi Tom,

What's the primary key definition of your table? Does it have salted row keys?

In the first example (the one that works) I see a leading byte on the
row key, which makes me think that you're using salting. In the second
example (the one that isn't working) I see the leading "\x00" being
added, but my guess (certainly if you're using salted row keys) is
that this isn't the correct salt byte for that row key.

If you are using salted row keys (or a composite primary key), then
the first byte of the row key that you use to look up the record in
HBase will have to be correctly filled in to take this into account.

- Gabriel

On Tue, Aug 23, 2016 at 7:52 PM, Squires, Tom (ELS-LON)
<to...@elsevier.com> wrote:
> Hi there,
>
> If we scan our HBase table via hbase shell - limiting to just one result -
> we get:
>
> hbase(main):003:0* scan 'SCHEMA.DOCUMENTS', { LIMIT => 1 }
> ROW                                                           COLUMN+CELL
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ADDED, timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:CONFIRMED, timestamp=1471470328985, value=\x81
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:DELETIONPENDING, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:ID,
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:IMPORTER_ID, timestamp=1471470328985, value=\x80\x00\x00\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISAUTHOR, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISREAD, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISSTARRED, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MODIFIED, timestamp=1471470328985,
> value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MONTH, timestamp=1471470328985, value=\x80\x00\x00\x09
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ONLYREFERENCE, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PROFILE_ID, timestamp=1471470328985, value=\xC4`\x09NH
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PUBLISHED_IN, timestamp=1471470328985, value=A journal.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TITLE, timestamp=1471470328985, value=A title.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TYPE_ID, timestamp=1471470328985, value=\x80\x00\x00\x01
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:UNIQUE_ID, timestamp=1471470328985,
> value=fdc82767-106f-47bc-91b5-0df5d90fba72
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4                      column=0:_0,
> timestamp=1471470328985, value=x
> 1 row(s) in 0.2570 seconds
>
> We're able to get this row key from hbase shell directly:
>
> hbase(main):006:0> get 'SCHEMA.DOCUMENTS',
> "\x00000000cd-ba0c-39cb-9490-39ca6448d6e4"
> COLUMN                                                        CELL
> 0:ADDED
> timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> 0:CONFIRMED
> timestamp=1471470328985, value=\x81
> 0:DELETIONPENDING
> timestamp=1471470328985, value=\x80
> 0:ID
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> 0:IMPORTER_ID
> timestamp=1471470328985, value=\x80\x00\x00\x18
> 0:ISAUTHOR
> timestamp=1471470328985, value=\x80
> 0:ISREAD
> timestamp=1471470328985, value=\x80
> 0:ISSTARRED
> timestamp=1471470328985, value=\x80
> 0:MODIFIED
> timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> 0:MONTH
> timestamp=1471470328985, value=\x80\x00\x00\x09
> 0:ONLYREFERENCE
> timestamp=1471470328985, value=\x80
> 0:PROFILE_ID
> timestamp=1471470328985, value=\xC4`\x09NH
> 0:PUBLISHED_IN
> timestamp=1471470328985, value=A journal.
> 0:TITLE
> timestamp=1471470328985, value=A title.
> 0:TYPE_ID
> timestamp=1471470328985, value=\x80\x00\x00\x01
> 0:UNIQUE_ID
> timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
> 0:YEAR
> timestamp=1471470328985, value=\x80\x00\x07\xDD
> 0:_0
> timestamp=1471470328985, value=x
> 18 row(s) in 0.0750 seconds
>
> We are also able to look for this row key in Phoenix via a primary key
> lookup (same HBase cluster), and we find it successfully:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS where uuid
> = '000000cd-ba0c-39cb-9490-39ca6448d6e4';
> +---------------------------------------+
> |                 UUID                  |
> +---------------------------------------+
> | 000000cd-ba0c-39cb-9490-39ca6448d6e4  |
> +---------------------------------------+
> 1 row selected (0.085 seconds)
>
> However, if we pick a random row from Phoenix:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS limit 1;
> +---------------------------------------+
> |                 UUID                  |
> +---------------------------------------+
> | 0061b1a3-4e08-3b24-9fe3-a5d8c2e35455  |
> +---------------------------------------+
> 1 row selected (0.064 seconds)
>
> And look it up in HBase:
>
> hbase(main):005:0> get 'SCHEMA.DOCUMENTS',
> "\x000061b1a3-4e08-3b24-9fe3-a5d8c2e35455"
> COLUMN                                                         CELL
> 0 row(s) in 0.0040 seconds
>
> We don't get anything back. We are connected to the HBase master for all of
> the above queries. Are we doing something silly?  Is there something we're
> missing?
>
> Thanks,
> Tom
>
>
>
> ________________________________
>
> Elsevier Limited. Registered Office: The Boulevard, Langford Lane,
> Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084,
> Registered in England and Wales.