You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@phoenix.apache.org by "Squires, Tom (ELS-LON)" <to...@elsevier.com> on 2016/08/23 17:52:26 UTC
Can't find some of our Phoenix primary keys in our HBase table when
looking for row keys
Hi there,
If we scan our HBase table via hbase shell - limiting to just one result - we get:
hbase(main):003:0* scan 'SCHEMA.DOCUMENTS', { LIMIT => 1 }
ROW COLUMN+CELL
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ADDED, timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:CONFIRMED, timestamp=1471470328985, value=\x81
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:DELETIONPENDING, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ID, timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:IMPORTER_ID, timestamp=1471470328985, value=\x80\x00\x00\x18
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ISAUTHOR, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ISREAD, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ISSTARRED, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:MODIFIED, timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:MONTH, timestamp=1471470328985, value=\x80\x00\x00\x09
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ONLYREFERENCE, timestamp=1471470328985, value=\x80
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:PROFILE_ID, timestamp=1471470328985, value=\xC4`\x09NH
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:PUBLISHED_IN, timestamp=1471470328985, value=A journal.
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:TITLE, timestamp=1471470328985, value=A title.
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:TYPE_ID, timestamp=1471470328985, value=\x80\x00\x00\x01
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:UNIQUE_ID, timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
\x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:_0, timestamp=1471470328985, value=x
1 row(s) in 0.2570 seconds
We're able to get this row key from hbase shell directly:
hbase(main):006:0> get 'SCHEMA.DOCUMENTS', "\x00000000cd-ba0c-39cb-9490-39ca6448d6e4"
COLUMN CELL
0:ADDED timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
0:CONFIRMED timestamp=1471470328985, value=\x81
0:DELETIONPENDING timestamp=1471470328985, value=\x80
0:ID timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
0:IMPORTER_ID timestamp=1471470328985, value=\x80\x00\x00\x18
0:ISAUTHOR timestamp=1471470328985, value=\x80
0:ISREAD timestamp=1471470328985, value=\x80
0:ISSTARRED timestamp=1471470328985, value=\x80
0:MODIFIED timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
0:MONTH timestamp=1471470328985, value=\x80\x00\x00\x09
0:ONLYREFERENCE timestamp=1471470328985, value=\x80
0:PROFILE_ID timestamp=1471470328985, value=\xC4`\x09NH
0:PUBLISHED_IN timestamp=1471470328985, value=A journal.
0:TITLE timestamp=1471470328985, value=A title.
0:TYPE_ID timestamp=1471470328985, value=\x80\x00\x00\x01
0:UNIQUE_ID timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
0:YEAR timestamp=1471470328985, value=\x80\x00\x07\xDD
0:_0 timestamp=1471470328985, value=x
18 row(s) in 0.0750 seconds
We are also able to look for this row key in Phoenix via a primary key lookup (same HBase cluster), and we find it successfully:
0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS where uuid = '000000cd-ba0c-39cb-9490-39ca6448d6e4';
+---------------------------------------+
| UUID |
+---------------------------------------+
| 000000cd-ba0c-39cb-9490-39ca6448d6e4 |
+---------------------------------------+
1 row selected (0.085 seconds)
However, if we pick a random row from Phoenix:
0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS limit 1;
+---------------------------------------+
| UUID |
+---------------------------------------+
| 0061b1a3-4e08-3b24-9fe3-a5d8c2e35455 |
+---------------------------------------+
1 row selected (0.064 seconds)
And look it up in HBase:
hbase(main):005:0> get 'SCHEMA.DOCUMENTS', "\x000061b1a3-4e08-3b24-9fe3-a5d8c2e35455"
COLUMN CELL
0 row(s) in 0.0040 seconds
We don't get anything back. We are connected to the HBase master for all of the above queries. Are we doing something silly? Is there something we're missing?
Thanks,
Tom
________________________________
Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales.
Re: Can't find some of our Phoenix primary keys in our HBase table
when looking for row keys
Posted by "Squires, Tom (ELS-LON)" <to...@elsevier.com>.
Thanks Gabriel, you're right: we are using 10 salt buckets in our table definition. If I scan the HBase table in reverse, I see that our row keys are prefixed with \x09:
hbase(main):015:0> scan 'SCHEMA.DOCUMENTS', { LIMIT => 1, REVERSED => TRUE }
ROW COLUMN+CELL
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:ADDED, timestamp=1471477212093, value=\x80\x00\x01L\xD1\x88\x9E\xF0
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:CONFIRMED, timestamp=1471477212093, value=\x81
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:DELETIONPENDING, timestamp=1471477212093, value=\x81
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:ID, timestamp=1471477212093, value=\x80\x00\x00\x01\xB6y\x97e
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:ISAUTHOR, timestamp=1471477212093, value=\x80
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:ISREAD, timestamp=1471477212093, value=\x80
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:ISSTARRED, timestamp=1471477212093, value=\x80
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:MODIFIED, timestamp=1471477212093, value=\x80\x00\x01L\xD2\xA1ux\x00\x00\x00\x00
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:ONLYREFERENCE, timestamp=1471477212093, value=\x80
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:PROFILE_ID, timestamp=1471477212093, value=\xC4\x1624
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:TITLE, timestamp=1471477212093, value=Another title.
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:TYPE_ID, timestamp=1471477212093, value=\x80\x00\x00\x01
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:UNIQUE_ID, timestamp=1471477212093, value=0ba2e206-6705-439a-947e-064ea670a214
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:YEAR, timestamp=1471477212093, value=\x80\x00\x07\xDF
\x09fffffeb0-1298-3fab-a60a-a2895dd590d2 column=0:_0, timestamp=1471477212093, value=x
1 row(s) in 0.0330 seconds
Hopefully this thread helps others in future.
Regards,
Tom
________________________________
From: Gabriel Reid <ga...@gmail.com>
Sent: 23 August 2016 20:02
To: user@phoenix.apache.org
Subject: Re: Can't find some of our Phoenix primary keys in our HBase table when looking for row keys
Hi Tom,
What's the primary key definition of your table? Does it have salted row keys?
In the first example (the one that works) I see a leading byte on the
row key, which makes me think that you're using salting. In the second
example (the one that isn't working) I see the leading "\x00" being
added, but my guess (certainly if you're using salted row keys) is
that this isn't the correct salt byte for that row key.
If you are using salted row keys (or a composite primary key), then
the first byte of the row key that you use to look up the record in
HBase will have to be correctly filled in to take this into account.
- Gabriel
On Tue, Aug 23, 2016 at 7:52 PM, Squires, Tom (ELS-LON)
<to...@elsevier.com> wrote:
> Hi there,
>
> If we scan our HBase table via hbase shell - limiting to just one result -
> we get:
>
> hbase(main):003:0* scan 'SCHEMA.DOCUMENTS', { LIMIT => 1 }
> ROW COLUMN+CELL
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ADDED, timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:CONFIRMED, timestamp=1471470328985, value=\x81
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:DELETIONPENDING, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ID,
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:IMPORTER_ID, timestamp=1471470328985, value=\x80\x00\x00\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISAUTHOR, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISREAD, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISSTARRED, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MODIFIED, timestamp=1471470328985,
> value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MONTH, timestamp=1471470328985, value=\x80\x00\x00\x09
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ONLYREFERENCE, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PROFILE_ID, timestamp=1471470328985, value=\xC4`\x09NH
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PUBLISHED_IN, timestamp=1471470328985, value=A journal.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TITLE, timestamp=1471470328985, value=A title.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TYPE_ID, timestamp=1471470328985, value=\x80\x00\x00\x01
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:UNIQUE_ID, timestamp=1471470328985,
> value=fdc82767-106f-47bc-91b5-0df5d90fba72
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:_0,
> timestamp=1471470328985, value=x
> 1 row(s) in 0.2570 seconds
>
> We're able to get this row key from hbase shell directly:
>
> hbase(main):006:0> get 'SCHEMA.DOCUMENTS',
> "\x00000000cd-ba0c-39cb-9490-39ca6448d6e4"
> COLUMN CELL
> 0:ADDED
> timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> 0:CONFIRMED
> timestamp=1471470328985, value=\x81
> 0:DELETIONPENDING
> timestamp=1471470328985, value=\x80
> 0:ID
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> 0:IMPORTER_ID
> timestamp=1471470328985, value=\x80\x00\x00\x18
> 0:ISAUTHOR
> timestamp=1471470328985, value=\x80
> 0:ISREAD
> timestamp=1471470328985, value=\x80
> 0:ISSTARRED
> timestamp=1471470328985, value=\x80
> 0:MODIFIED
> timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> 0:MONTH
> timestamp=1471470328985, value=\x80\x00\x00\x09
> 0:ONLYREFERENCE
> timestamp=1471470328985, value=\x80
> 0:PROFILE_ID
> timestamp=1471470328985, value=\xC4`\x09NH
> 0:PUBLISHED_IN
> timestamp=1471470328985, value=A journal.
> 0:TITLE
> timestamp=1471470328985, value=A title.
> 0:TYPE_ID
> timestamp=1471470328985, value=\x80\x00\x00\x01
> 0:UNIQUE_ID
> timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
> 0:YEAR
> timestamp=1471470328985, value=\x80\x00\x07\xDD
> 0:_0
> timestamp=1471470328985, value=x
> 18 row(s) in 0.0750 seconds
>
> We are also able to look for this row key in Phoenix via a primary key
> lookup (same HBase cluster), and we find it successfully:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS where uuid
> = '000000cd-ba0c-39cb-9490-39ca6448d6e4';
> +---------------------------------------+
> | UUID |
> +---------------------------------------+
> | 000000cd-ba0c-39cb-9490-39ca6448d6e4 |
> +---------------------------------------+
> 1 row selected (0.085 seconds)
>
> However, if we pick a random row from Phoenix:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS limit 1;
> +---------------------------------------+
> | UUID |
> +---------------------------------------+
> | 0061b1a3-4e08-3b24-9fe3-a5d8c2e35455 |
> +---------------------------------------+
> 1 row selected (0.064 seconds)
>
> And look it up in HBase:
>
> hbase(main):005:0> get 'SCHEMA.DOCUMENTS',
> "\x000061b1a3-4e08-3b24-9fe3-a5d8c2e35455"
> COLUMN CELL
> 0 row(s) in 0.0040 seconds
>
> We don't get anything back. We are connected to the HBase master for all of
> the above queries. Are we doing something silly? Is there something we're
> missing?
>
> Thanks,
> Tom
>
>
>
> ________________________________
>
> Elsevier Limited. Registered Office: The Boulevard, Langford Lane,
> Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084,
> Registered in England and Wales.
________________________________
Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales.
Re: Can't find some of our Phoenix primary keys in our HBase table
when looking for row keys
Posted by Gabriel Reid <ga...@gmail.com>.
Hi Tom,
What's the primary key definition of your table? Does it have salted row keys?
In the first example (the one that works) I see a leading byte on the
row key, which makes me think that you're using salting. In the second
example (the one that isn't working) I see the leading "\x00" being
added, but my guess (certainly if you're using salted row keys) is
that this isn't the correct salt byte for that row key.
If you are using salted row keys (or a composite primary key), then
the first byte of the row key that you use to look up the record in
HBase will have to be correctly filled in to take this into account.
- Gabriel
On Tue, Aug 23, 2016 at 7:52 PM, Squires, Tom (ELS-LON)
<to...@elsevier.com> wrote:
> Hi there,
>
> If we scan our HBase table via hbase shell - limiting to just one result -
> we get:
>
> hbase(main):003:0* scan 'SCHEMA.DOCUMENTS', { LIMIT => 1 }
> ROW COLUMN+CELL
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ADDED, timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:CONFIRMED, timestamp=1471470328985, value=\x81
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:DELETIONPENDING, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:ID,
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:IMPORTER_ID, timestamp=1471470328985, value=\x80\x00\x00\x18
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISAUTHOR, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISREAD, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ISSTARRED, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MODIFIED, timestamp=1471470328985,
> value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:MONTH, timestamp=1471470328985, value=\x80\x00\x00\x09
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:ONLYREFERENCE, timestamp=1471470328985, value=\x80
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PROFILE_ID, timestamp=1471470328985, value=\xC4`\x09NH
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:PUBLISHED_IN, timestamp=1471470328985, value=A journal.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TITLE, timestamp=1471470328985, value=A title.
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:TYPE_ID, timestamp=1471470328985, value=\x80\x00\x00\x01
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4
> column=0:UNIQUE_ID, timestamp=1471470328985,
> value=fdc82767-106f-47bc-91b5-0df5d90fba72
> \x00000000cd-ba0c-39cb-9490-39ca6448d6e4 column=0:_0,
> timestamp=1471470328985, value=x
> 1 row(s) in 0.2570 seconds
>
> We're able to get this row key from hbase shell directly:
>
> hbase(main):006:0> get 'SCHEMA.DOCUMENTS',
> "\x00000000cd-ba0c-39cb-9490-39ca6448d6e4"
> COLUMN CELL
> 0:ADDED
> timestamp=1471470328985, value=\x80\x00\x01GXU\xBA\x18
> 0:CONFIRMED
> timestamp=1471470328985, value=\x81
> 0:DELETIONPENDING
> timestamp=1471470328985, value=\x80
> 0:ID
> timestamp=1471470328985, value=\x80\x00\x00\x01\x85\xDE\xE4\xD9
> 0:IMPORTER_ID
> timestamp=1471470328985, value=\x80\x00\x00\x18
> 0:ISAUTHOR
> timestamp=1471470328985, value=\x80
> 0:ISREAD
> timestamp=1471470328985, value=\x80
> 0:ISSTARRED
> timestamp=1471470328985, value=\x80
> 0:MODIFIED
> timestamp=1471470328985, value=\x80\x00\x01GXW-0\x00\x00\x00\x00
> 0:MONTH
> timestamp=1471470328985, value=\x80\x00\x00\x09
> 0:ONLYREFERENCE
> timestamp=1471470328985, value=\x80
> 0:PROFILE_ID
> timestamp=1471470328985, value=\xC4`\x09NH
> 0:PUBLISHED_IN
> timestamp=1471470328985, value=A journal.
> 0:TITLE
> timestamp=1471470328985, value=A title.
> 0:TYPE_ID
> timestamp=1471470328985, value=\x80\x00\x00\x01
> 0:UNIQUE_ID
> timestamp=1471470328985, value=fdc82767-106f-47bc-91b5-0df5d90fba72
> 0:YEAR
> timestamp=1471470328985, value=\x80\x00\x07\xDD
> 0:_0
> timestamp=1471470328985, value=x
> 18 row(s) in 0.0750 seconds
>
> We are also able to look for this row key in Phoenix via a primary key
> lookup (same HBase cluster), and we find it successfully:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS where uuid
> = '000000cd-ba0c-39cb-9490-39ca6448d6e4';
> +---------------------------------------+
> | UUID |
> +---------------------------------------+
> | 000000cd-ba0c-39cb-9490-39ca6448d6e4 |
> +---------------------------------------+
> 1 row selected (0.085 seconds)
>
> However, if we pick a random row from Phoenix:
>
> 0: jdbc:phoenix:localhost:2181> select uuid from SCHEMA.DOCUMENTS limit 1;
> +---------------------------------------+
> | UUID |
> +---------------------------------------+
> | 0061b1a3-4e08-3b24-9fe3-a5d8c2e35455 |
> +---------------------------------------+
> 1 row selected (0.064 seconds)
>
> And look it up in HBase:
>
> hbase(main):005:0> get 'SCHEMA.DOCUMENTS',
> "\x000061b1a3-4e08-3b24-9fe3-a5d8c2e35455"
> COLUMN CELL
> 0 row(s) in 0.0040 seconds
>
> We don't get anything back. We are connected to the HBase master for all of
> the above queries. Are we doing something silly? Is there something we're
> missing?
>
> Thanks,
> Tom
>
>
>
> ________________________________
>
> Elsevier Limited. Registered Office: The Boulevard, Langford Lane,
> Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084,
> Registered in England and Wales.