
Posted to user@hbase.apache.org by ri...@laposte.net on 2013/04/29 17:03:05 UTC

Read access pattern

Hi,

I have a rowkey defined by:
        getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n", (Long.MAX_VALUE - changeDate.getTime()));

How could I get the previous and next row for a given rowkey?
For instance, I have the following ordered keys:

00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
>00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807

If I choose the rowkey 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the correct scan to get the previous and next keys?
The result would be:
00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
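
To make the expected result concrete, here is a minimal in-memory sketch, assuming a java.util.TreeSet as a stand-in for the sorted table (HBase sorts row keys as bytes; for these ASCII keys, String ordering gives the same order). The class name NeighbourLookup and its helpers are hypothetical and not part of any HBase API.

```java
import java.util.TreeSet;

// In-memory stand-in for the sorted row keys of the question (not HBase code).
class NeighbourLookup {
    static final String HASH = "00003db1b6c1e7e7d2ece41ff2184f76";

    // The five sample keys from the question, kept in sorted order.
    static TreeSet<String> sampleKeys() {
        TreeSet<String> keys = new TreeSet<>();
        keys.add(HASH + "*9223370673172227807");
        keys.add(HASH + "*9223370674468022807");
        keys.add(HASH + "*9223370674468862807");
        keys.add(HASH + "*9223370674984237807");
        keys.add(HASH + "*9223370674987271807");
        return keys;
    }

    // Recovers changeDate.getTime() from the reversed timestamp after the '*'.
    static long changeTimeMillis(String rowKey) {
        long reversed = Long.parseLong(rowKey.substring(rowKey.indexOf('*') + 1));
        return Long.MAX_VALUE - reversed;
    }

    public static void main(String[] args) {
        TreeSet<String> keys = sampleKeys();
        String current = HASH + "*9223370674468862807";
        System.out.println("previous: " + keys.lower(current));   // ...*9223370674468022807
        System.out.println("next:     " + keys.higher(current));  // ...*9223370674984237807
    }
}
```

Because the key stores Long.MAX_VALUE minus the change time, the "previous" key in sort order belongs to a more recent change and the "next" key to an older one.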

Thank you!
R.

A free email service, guaranteed for life, with extra services: interested?
Create my mailbox at www.laposte.net

RE: Read access pattern

Posted by ri...@laposte.net.

Yes, I see, but this is quite expensive because the table is huge.

-----Original Message-----
From: Jean-Marc Spaggiari [mailto:jean-marc@spaggiari.org]
Sent: Monday, 29 April 2013 20:04
To: user@hbase.apache.org; ricla@laposte.net
Subject: Re: Read access pattern

HBASE-4811 is what you should be looking for, but it's not even close to being implemented yet...

One option would be to have 2 tables, one sorted in the reverse order of the other. Scanning forward in each will give you the key just after the current one, which in the end gives you both the key before and the key after...


Re: Read access pattern

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.

HBASE-4811 is what you should be looking for, but it's not even close
to being implemented yet...

One option would be to have 2 tables, one sorted in the reverse order of
the other. Scanning forward in each will give you the key just after the
current one, which in the end gives you both the key before and the key
after...
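
The two-table idea can be sketched in memory as follows, assuming two TreeSets stand in for the two tables: one keyed by the reversed timestamp (as in the question) and one keyed by the plain timestamp. The class TwoTableNeighbours, the '*' separator and the zero-padded 19-digit format are assumptions made for illustration only.

```java
import java.util.TreeSet;

// Two sorted sets play the role of the two tables; a forward lookup in each
// yields one neighbour of the current entry.
class TwoTableNeighbours {
    final TreeSet<String> reversed = new TreeSet<>(); // key: hash*(Long.MAX_VALUE - ts)
    final TreeSet<String> forward = new TreeSet<>();  // key: hash*ts

    static String pad(long v) {
        return String.format("%019d", v); // fixed width so string order matches numeric order
    }

    // Every write goes to both tables (this is the cost of the approach).
    void put(String hash, long ts) {
        reversed.add(hash + "*" + pad(Long.MAX_VALUE - ts));
        forward.add(hash + "*" + pad(ts));
    }

    // Forward step in the reversed table: the next older change, or null.
    Long olderNeighbour(String hash, long ts) {
        String k = reversed.higher(hash + "*" + pad(Long.MAX_VALUE - ts));
        if (k == null || !k.startsWith(hash + "*")) {
            return null; // ran past the rows of this object
        }
        return Long.MAX_VALUE - Long.parseLong(k.substring(hash.length() + 1));
    }

    // Forward step in the plain table: the next newer change, or null.
    Long newerNeighbour(String hash, long ts) {
        String k = forward.higher(hash + "*" + pad(ts));
        if (k == null || !k.startsWith(hash + "*")) {
            return null;
        }
        return Long.parseLong(k.substring(hash.length() + 1));
    }
}
```

Both lookups only ever move forward, which is the point of the suggestion; the price is writing every row twice.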


RE: Read access pattern

Posted by ri...@laposte.net.

1. Change the schema
If I understand correctly, in this scenario I lose the ordering (changeDate desc). Moreover, in my case, I could have 100k rows per objectId, meaning I would have to iterate over a long list, but I understand the logic.

If I only look at the 24 hours before the original column hour, it might be simpler to just play with the current rowkey pattern: hash(objectid)+Long.MAX_VALUE-changeDate.getTime()-1000*3600*24, and iterate to find the last item before reaching the current target. Or maybe also use a filter on the region server to achieve this?

2.
Wow, I don't feel up to going this way :)

-----Original Message-----
From: Asaf Mesika [mailto:asaf.mesika@gmail.com]
Sent: Tuesday, 30 April 2013 07:50
To: user@hbase.apache.org; ricla@laposte.net
Subject: Re: Read access pattern

A couple of rough implementation thoughts:

1. Change the schema
Move the timestamps inside the row. The rowkey is hash(objectid), and the column qualifier is Long.MAX_VALUE - changeDate.getTime(). You can even store it using Bytes.toBytes(ts) to save space: it will always be 8 bytes, instead of the longer string form.

This will let you "view" all the timestamps related to a single objectid in one place. The problem with placing the timestamp in the rowkey is that it's all over the place, spread across regions, so it's harder to get a valid "who is before whom" answer (indexing) without paying a penalty on insertion for keeping it up to date.

I have two ideas: one makes reads expensive and the other makes writes expensive.

Expensive read:
When you write, you write two columns for that row: one named i_[Rounded-to-the-hour-timestamp] with a value of 1 (a dummy value), indicating that you have timestamps within this hour, and the other is your original column, named ts_[timestamp].
You can implement a Filter which, upon arriving at the required row, first reads all the "hour" timestamps so it can find out where to jump in the ts_[timestamp] columns. Upon arriving at the hour timestamp matching the one you are looking for, you know which hour came before it, so you can jump to it (using the hint method in the Filter interface). The read is expensive since you need to read all i_[Rounded-to-the-hour-timestamp] columns in the worst case. Maybe you can relax it by saying you only look at the 24 hours before the original column hour, reducing the worst case to 24 reads.
The write is cheap, the read is not.

Expensive write:
You can keep a column named i, which maintains an encoded version of an index of the hours, so when you read, you find the correct preceding hour with an O(log n) search through it and then jump to the ts_[timestamp] column.
The write will be expensive, since you need to read-modify-write this column on each timestamp you write. The read is fairly cheap.

2.
I thought I had another option using RegionObserver and EndpointCoprocessor, but the biggest problem is that the predecessor timestamp may be on another region server. The first idea is more implementable :)


Re: Read access pattern

Posted by Asaf Mesika <as...@gmail.com>.

A couple of rough implementation thoughts:

1. Change the schema
Move the timestamps inside the row. The rowkey is hash(objectid), and the
column qualifier is Long.MAX_VALUE - changeDate.getTime(). You can even
store it using Bytes.toBytes(ts) to save space: it will always be 8 bytes,
instead of the longer string form.
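
As a rough illustration of the 8-byte qualifier, the sketch below uses java.nio.ByteBuffer as a stand-in for Bytes.toBytes(long), on the assumption that both produce the same 8-byte big-endian form. The class ReversedTsQualifier and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Encodes the reversed timestamp as a fixed 8-byte, big-endian column qualifier.
class ReversedTsQualifier {
    static byte[] encode(long changeTimeMs) {
        // ByteBuffer is big-endian by default.
        return ByteBuffer.allocate(Long.BYTES).putLong(Long.MAX_VALUE - changeTimeMs).array();
    }

    static long decode(byte[] qualifier) {
        return Long.MAX_VALUE - ByteBuffer.wrap(qualifier).getLong();
    }

    // Unsigned lexicographic byte comparison (requires Java 9+).
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```

For non-negative values the byte order matches the numeric order, so within a row the most recent change still sorts first, as it did with the string key.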

This will let you "view" all the timestamps related to a single objectid
in one place. The problem with placing the timestamp in the rowkey is that
it's all over the place, spread across regions, so it's harder to get a
valid "who is before whom" answer (indexing) without paying a penalty on
insertion for keeping it up to date.

I have two ideas: one makes reads expensive and the other makes writes
expensive.

Expensive read:
When you write, you write two columns for that row: one named
i_[Rounded-to-the-hour-timestamp] with a value of 1 (a dummy value),
indicating that you have timestamps within this hour, and the other is
your original column, named ts_[timestamp].
You can implement a Filter which, upon arriving at the required row, first
reads all the "hour" timestamps so it can find out where to jump in the
ts_[timestamp] columns. Upon arriving at the hour timestamp matching the
one you are looking for, you know which hour came before it, so you can
jump to it (using the hint method in the Filter interface). The read is
expensive since you need to read all i_[Rounded-to-the-hour-timestamp]
columns in the worst case. Maybe you can relax it by saying you only look
at the 24 hours before the original column hour, reducing the worst case
to 24 reads.
The write is cheap, the read is not.
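
The lookup logic of the "expensive read" idea can be sketched in memory like this. It is not a Filter implementation: two TreeSets merely stand in for the i_[hour] and ts_[timestamp] columns of one row, plain (not reversed) timestamps are assumed, and the class HourMarkerIndex is hypothetical.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Models one row: hour markers plus the individual timestamps.
class HourMarkerIndex {
    static final long HOUR_MS = 3_600_000L;

    final TreeSet<Long> hourMarkers = new TreeSet<>(); // stands in for the i_[hour] columns
    final TreeSet<Long> timestamps = new TreeSet<>();  // stands in for the ts_[timestamp] columns

    static long hourOf(long ts) {
        return ts - (ts % HOUR_MS); // round down to the hour
    }

    // Cheap write: add the timestamp and (idempotently) its hour marker.
    void write(long ts) {
        hourMarkers.add(hourOf(ts));
        timestamps.add(ts);
    }

    // Returns the timestamp written just before ts, or null if there is none.
    Long predecessor(long ts) {
        long hour = hourOf(ts);
        // 1) Look inside the same hour first.
        NavigableSet<Long> sameHour = timestamps.subSet(hour, true, ts, false);
        if (!sameHour.isEmpty()) {
            return sameHour.last();
        }
        // 2) Otherwise use the markers to find the closest earlier hour with data...
        Long prevHour = hourMarkers.lower(hour);
        if (prevHour == null) {
            return null;
        }
        // 3) ...and jump straight to the last timestamp of that hour.
        return timestamps.subSet(prevHour, true, prevHour + HOUR_MS, false).last();
    }
}
```

In a real Filter, step 2 is where the cost lies, since the marker columns have to be read before the jump can be made.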

Expensive write:
You can keep a column named i, which maintains an encoded version of an
index of the hours, so when you read, you find the correct preceding hour
with an O(log n) search through it and then jump to the ts_[timestamp]
column.
The write will be expensive, since you need to read-modify-write this
column on each timestamp you write. The read is fairly cheap.
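
A minimal sketch of such an index, assuming the column i decodes to a sorted long[] of hour values (the actual encoding is left open above). The class HourIndexColumn is hypothetical; it only shows the read-modify-write on insert and the binary search on read.

```java
import java.util.Arrays;

// A sorted array of hours stands in for the decoded content of column "i".
class HourIndexColumn {
    // Read-modify-write: returns a sorted copy with `hour` inserted if absent.
    static long[] insert(long[] index, long hour) {
        int pos = Arrays.binarySearch(index, hour);
        if (pos >= 0) {
            return index; // already indexed, nothing to rewrite
        }
        int at = -pos - 1; // insertion point reported by binarySearch
        long[] out = new long[index.length + 1];
        System.arraycopy(index, 0, out, 0, at);
        out[at] = hour;
        System.arraycopy(index, at, out, at + 1, index.length - at);
        return out;
    }

    // Largest indexed hour strictly before `hour`, or -1 if there is none.
    static long hourBefore(long[] index, long hour) {
        int pos = Arrays.binarySearch(index, hour);
        int at = pos >= 0 ? pos : -pos - 1; // position of the first element >= hour
        return at == 0 ? -1 : index[at - 1];
    }
}
```

Each insert copies the whole array, which mirrors the expensive write; the lookup is a single binary search.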

2.
I thought I had another option using RegionObserver and
EndpointCoprocessor, but the biggest problem is that the predecessor
timestamp may be on another region server. The first idea is more
implementable :)


Re: Read access pattern

Posted by ri...@laposte.net.

Thanks for the quick answer.

> For the next key, I think you can simply use your current key as your
> scanner first key. You will then find the one which is just after.
> Then you will have to verify the MD5 hash to make sure it's still for
> the same object.
Right, this is basically easy.

> First, if you know that you are storing data about every 10 seconds,
> set the startRow with something like
> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> (Long.MAX_VALUE - (changeDate.getTime() + 60000))) then read the few
> lines you will have until you find your current line, and keep the
> last one.

Actually, it is impossible to know the time range within which there will be a next entry.

>
> Else, if you don't know, you will have to start with
> scan.setStartRow(Bytes.toBytes(getMD5AsHex(Bytes.toBytes(myObjectId)))); but you
> might have to skip MANY lines before finding the right one. So I don't
> really recommend that.

Ouch, obviously not very efficient. I assume that holds even with a filter?
> Message of 29/04/13 18:18
> From: "Jean-Marc Spaggiari"
> To: user@hbase.apache.org
> Cc:
> Subject: Re: Read access pattern

Re: Read access pattern

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.

Hum.

For the next key, I think you can simply use your current key as your
scanner first key. You will then find the one which is just after.
Then you will have to verify the MD5 hash to make sure it's still for
the same object.

scan.setStartRow(Bytes.toBytes(getMD5AsHex(Bytes.toBytes(myObjectId)) +
String.format("%19d\n", (Long.MAX_VALUE - changeDate.getTime()))));
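
The "next key" logic can be sketched without a cluster, assuming a TreeSet stands in for the table and an inclusive tailSet stands in for a scan that starts at the current key. The class NextKeyScan is hypothetical; the 32-character prefix is the length of an MD5 hash in hex.

```java
import java.util.TreeSet;

// Forward "scan" from the current key, then a check that the hash prefix still matches.
class NextKeyScan {
    static final int HASH_LEN = 32; // hex length of an MD5 hash

    // Returns the key right after `current` for the same object, or null.
    static String next(TreeSet<String> table, String current) {
        for (String key : table.tailSet(current, true)) { // start row is inclusive
            if (key.equals(current)) {
                continue; // the first row returned is the current one itself
            }
            // Verify the MD5 prefix: a different prefix means a different object.
            boolean sameObject = key.regionMatches(0, current, 0, HASH_LEN);
            return sameObject ? key : null;
        }
        return null; // the current key was the last row of the table
    }
}
```

Only one row past the current key is ever inspected, which is why this direction is the easy one.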

If you want to find the one just before, quickly, I see 2 options.

First, if you know that you are storing data about every 10 seconds,
set the startRow with something like
getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
(Long.MAX_VALUE - (changeDate.getTime() + 60000))) then read the few
lines you will have until you find your current line, and keep the
last one.
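
A sketch of this bounded scan, assuming an in-memory TreeSet as the table, a '*' separator and a zero-padded 19-digit reversed timestamp (all illustrative choices, as is the class PreviousKeyScan). Because the key stores Long.MAX_VALUE minus the timestamp, a start row that sorts ahead of the current key corresponds to a later timestamp, so the sketch adds the window to the change time.

```java
import java.util.TreeSet;

// Scans a small window of rows that sort before the current key and keeps the last one.
class PreviousKeyScan {
    static String key(String hash, long ts) {
        return hash + "*" + String.format("%019d", Long.MAX_VALUE - ts);
    }

    // Returns the key just before (hash, ts) if it lies within windowMs, else null.
    static String previousWithin(TreeSet<String> table, String hash, long ts, long windowMs) {
        String start = key(hash, ts + windowMs); // sorts before the current key
        String current = key(hash, ts);
        String last = null;
        // Read the few rows between the start row and the current row.
        for (String k : table.subSet(start, true, current, false)) {
            last = k; // the last row seen before the current one is the answer
        }
        return last;
    }
}
```

If the window is too small the method returns null, which is the weakness raised elsewhere in the thread: the gap to the neighbouring entry is not known in advance.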

Else, if you don't know, you will have to start with
scan.setStartRow(Bytes.toBytes(getMD5AsHex(Bytes.toBytes(myObjectId)))); but you
might have to skip MANY lines before finding the right one. So I don't
really recommend that.

JM

2013/4/29 Shahab Yunus <sh...@gmail.com>:
> I think you cannot simply use the scanner to do a range scan here, as your
> keys are not monotonically increasing. You need to apply logic to
> decode/reverse the mechanism that you used to hash your keys at the
> time of writing. You might want to check out the SemaText library, which
> does distributed scans and seems to handle the scenarios that you want to
> implement.
> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/

Re: Read access pattern

Posted by James Taylor <jt...@salesforce.com>.

bq. The downside that I see is the bucket_number that we have to
maintain both at the time of reading/writing and update in case of
cluster restructuring.

I agree that this maintenance can be painful. However, Phoenix 
(https://github.com/forcedotcom/phoenix) now supports salting, 
automating this maintenance.  If you want to salt your table, just add a 
SALT_BUCKETS = <n> property at the end of your DDL statement, where <n> 
is the total number of buckets (up to a max of 256).  For example:

CREATE TABLE t (date_time DATE NOT NULL, event_id CHAR(15) NOT NULL
     CONSTRAINT pk PRIMARY KEY (date_time, event_id))
     SALT_BUCKETS=10;

This will add one byte at the beginning of your row key, whose value is
formed by hashing the row key and taking the result modulo 10. This is
done automatically for every upsert, and queries are automatically
distributed and their results combined as expected.
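
To illustrate what a salt byte of this kind looks like, here is a minimal sketch. It is not Phoenix's implementation: Arrays.hashCode is only a stand-in for whatever hash Phoenix uses, and the class SaltSketch is hypothetical.

```java
import java.util.Arrays;

// Prepends one bucket byte, derived deterministically from the row key itself.
class SaltSketch {
    static byte[] salted(byte[] rowKey, int buckets) {
        // Stand-in hash; floorMod keeps the bucket in [0, buckets).
        int bucket = Math.floorMod(Arrays.hashCode(rowKey), buckets);
        byte[] out = new byte[rowKey.length + 1];
        out[0] = (byte) bucket;
        System.arraycopy(rowKey, 0, out, 1, rowKey.length);
        return out;
    }
}
```

Since the byte is computed from the key rather than drawn at random, a reader can recompute it for a point lookup, while a range scan has to fan out over all the buckets and merge the results.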

Thanks,

James
@JamesPlusPlus
http://phoenix-hbase.blogspot.com/

On 04/30/2013 09:17 AM, Shahab Yunus wrote:
> Well those are *some* words :) Anyway, can you explain in a bit more detail
> why you feel so strongly about this design/approach? The salting here is
> not the only option mentioned, and static hashing can be used as well. Plus,
> even in the case of salting, wouldn't the distributed scan take care of it? The
> downside that I see is the bucket_number that we have to maintain both at
> the time of reading/writing and update in case of cluster restructuring.
>
> Thanks,
> Shahab
>

Re: Read access pattern

Posted by Naidu MS <sa...@gmail.com>.

Hi, I have two questions regarding HDFS and the jps utility.

I am new to Hadoop and started learning it this past week.

1. Whenever I run start-all.sh and then jps in the console, it shows the
processes that were started:

naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
22283 NameNode
23516 TaskTracker
26711 Jps
22541 DataNode
23255 JobTracker
22813 SecondaryNameNode
Could not synchronize with target

But along with the list of started processes, the jps output always shows
"Could not synchronize with target". What does "Could not synchronize with
target" mean? Can someone explain why this is happening?


2. Is it possible to format the namenode multiple times? When I enter the
namenode -format command, it does not format the namenode and shows the
following output:

naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.

13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = naidu/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y
Format aborted in /home/naidu/dfs/namenode
13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1

************************************************************/

Can someone help me understand this? Why is it not possible to format the
namenode multiple times?


On Wed, May 1, 2013 at 10:42 AM, lars hofhansl <la...@apache.org> wrote:

> I do not want to be rude or anything... But how often do we need to have
> this discussion?
>
> When you salt your rowkeys with, say, 10 salt values, then for each read you
> need to fork off 10 read requests, and each of them touches only 1/10th of
> the table (which works nicely with HBase's prefix scans).
>
> Obviously, if you only need point gets you wouldn't salt; that would be
> stupid. If you mostly do range scans, then salting is quite nice.
>
> Saying that salting is bad because it does not work for point gets is
> like saying that bulldozers are bad because you cannot use them on race
> tracks. :)
>
>
> -- Lars

Re: Read access pattern

Posted by Shahab Yunus <sh...@gmail.com>.

I see what you are saying, Michael, but I think the following is a blanket
assumption:
bq Think of it this way... the operation was a success but the patient
died. eq

This is not always the case. Yes, if your use case/system is such that it
will have lots of users trying to access it, then perhaps N users kicking off
N concurrent/distributed reads is not efficient. But what if you have a batch
use case where these distributed scans might actually help? The point being,
rather than shooting down the idea as a whole, we can perhaps qualify it
with the areas where it might be useful and the areas where it can have an
adverse effect.

Regards,
Shahab




On Wed, May 1, 2013 at 10:14 AM, Michael Segel <mi...@hotmail.com> wrote:

> Unfortunately, as this idea keeps popping up, you are going to have this
> discussion.
>
> 1) As you admit... salting is bad when your primary access vector is
> get()s.
> 2) Range scans. Instead of 1 range scan, you now have N, where N is the
> number of salt values. In this case 10.
> You wouldn't think of this as bad; however, what happens when you have a
> system with a lot of users and lots of queries, each of which now has to
> scan N times the number of records per scan? Excessive overhead. Even
> though the scans happen in parallel, you are still tying up a finite
> amount of resources.
>
> So you have to go back and ask the initial question... why?
> Can you change your key?
> What is the problem you're trying to solve?
>
> The point is that just because you can do it doesn't make it a good idea.
>
> Think of it this way... the operation was a success but the patient died.
>

Re: Read access pattern

Posted by Michael Segel <mi...@hotmail.com>.

Unfortunately, as this idea keeps popping up, you are going to have this discussion.

1) As you admit... salting is bad when your primary access vector is get()s.
2) Range scans. Instead of 1 range scan, you now have N, where N is the number of salt values. In this case 10.
You wouldn't think of this as bad; however, what happens when you have a system with a lot of users and lots of queries, each of which now has to scan N times the number of records per scan? Excessive overhead. Even though the scans happen in parallel, you are still tying up a finite amount of resources.

So you have to go back and ask the initial question... why?
Can you change your key?
What is the problem you're trying to solve?

The point is that just because you can do it doesn't make it a good idea.

Think of it this way... the operation was a success but the patient died.  


On May 1, 2013, at 12:12 AM, lars hofhansl <la...@apache.org> wrote:

> I do not want to be rude or anything... But how often do we need to have this discussion?
>
> When you salt your rowkeys with, say, 10 salt values, then for each read you need to fork off 10 read requests, and each of them touches only 1/10th of the table (which works nicely with HBase's prefix scans).
>
> Obviously, if you only need point gets you wouldn't salt; that would be stupid. If you mostly do range scans, then salting is quite nice.
>
> Saying that salting is bad because it does not work for point gets is like saying that bulldozers are bad because you cannot use them on race tracks. :)
> 
> 
> -- Lars

Re: Read access pattern

Posted by lars hofhansl <la...@apache.org>.

I do not want to be rude or anything... But how often do we need to have this discussion?

When you salt your rowkeys with, say, 10 salt values, then for each read you need to fork off 10 read requests, and each of them touches only 1/10th of the table (which works nicely with HBase's prefix scans).

Obviously, if you only need point gets you wouldn't salt; that would be stupid. If you mostly do range scans, then salting is quite nice.

Saying that salting is bad because it does not work for point gets is like saying that bulldozers are bad because you cannot use them on race tracks. :)
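
As an illustration of "forking off" one read per salt value, the following plain-Java sketch turns a single logical key range into one start/stop pair per bucket. It deliberately avoids the HBase client API; the class and method names are hypothetical, and a one-byte salt prefix in front of the logical key is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the HBase client API.
class SaltedRangeSketch {
    // Prepend a one-byte bucket number to a logical key (assumed key layout).
    static byte[] withSalt(int bucket, byte[] key) {
        byte[] out = new byte[key.length + 1];
        out[0] = (byte) bucket;
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }

    // One {start, stop} pair per bucket; each pair would back its own scan.
    static List<byte[][]> bucketRanges(byte[] start, byte[] stop, int buckets) {
        List<byte[][]> ranges = new ArrayList<>();
        for (int b = 0; b < buckets; b++) {
            ranges.add(new byte[][] { withSalt(b, start), withSalt(b, stop) });
        }
        return ranges;
    }

    public static void main(String[] args) {
        byte[] start = "event-0100".getBytes(StandardCharsets.UTF_8);
        byte[] stop = "event-0200".getBytes(StandardCharsets.UTF_8);
        for (byte[][] r : bucketRanges(start, stop, 10)) {
            System.out.println("bucket " + r[0][0] + ": "
                + new String(r[0], 1, r[0].length - 1, StandardCharsets.UTF_8) + " .. "
                + new String(r[1], 1, r[1].length - 1, StandardCharsets.UTF_8));
        }
    }
}
```

Each pair covers only the rows in its own bucket, so the ten scans together read the same rows a single unsalted scan would. The client still has to merge the ten result streams if it needs them in global key order.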

-- Lars

________________________________
 From: Michael Segel <mi...@hotmail.com>
To: user@hbase.apache.org 
Sent: Tuesday, April 30, 2013 10:06 AM
Subject: Re: Read access pattern

Sure.

By definition, the salt number is a random seed that is not associated with the underlying record. 
A simple example is a round robin counter (mod the counter by 10 yielding [0..9] )

So you get a record, prepend your salt and you write it out to HBase. The salt will push the data out to a different region.

But what happens when you want to read the data? 

So on a full table scan... no biggie, it's the same.

But suppose I want to do a partial table scan. Now I have to do multiple partial scans because I don't know the salt.
Or if I want to do a simple get(), I now have to do N get()s, where N is the number of salt values allowed. In my example that's 10.

And that's the problem. 

You are better off hashing the record's key, taking the first couple of bytes of the hash as a prefix, and then writing the record out.
When you want the record, take the key, hash it using the same process, and you have 1 get().

You're still screwed up on doing a range scan, but you can't have everything.

THIS IS WHY I AND MANY CARDIOLOGISTS SAY NO TO SALT. The only difference is that they are talking about excess sodium chloride in your diet. I'm talking about using a salt aka 'random seed'.

Does that make sense? 


Re: Read access pattern

Posted by Michael Segel <mi...@hotmail.com>.

Sure.

By definition, the salt number is a random seed that is not associated with the underlying record. 
A simple example is a round robin counter (mod the counter by 10 yielding [0..9] )

So you get a record, prepend your salt and you write it out to HBase. The salt will push the data out to a different region.

But what happens when you want to read the data? 

So on a full table scan... no biggie, it's the same.

But suppose I want to do a partial table scan. Now I have to do multiple partial scans because I don't know the salt.
Or if I want to do a simple get(), I now have to do N get()s, where N is the number of salt values allowed. In my example that's 10.

And that's the problem. 
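
The read-side cost described above can be sketched in a few lines of plain Java. This is an illustration under assumptions, not code from any library: the class name, the "salt|key" string layout, and the in-memory counter are all hypothetical. It shows why a salt that is unrelated to the record forces the reader to try every possible salt value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a round-robin ("random") salt.
class RoundRobinSaltSketch {
    static final int SALTS = 10;
    static final AtomicLong counter = new AtomicLong();

    // Write path: the salt comes from a counter, not from the key.
    static String saltedForWrite(String key) {
        long salt = counter.getAndIncrement() % SALTS;
        return salt + "|" + key;
    }

    // Read path: the salt is unknown, so every candidate has to be tried.
    static List<String> candidatesForGet(String key) {
        List<String> candidates = new ArrayList<>();
        for (int s = 0; s < SALTS; s++) {
            candidates.add(s + "|" + key);
        }
        return candidates;
    }

    public static void main(String[] args) {
        String stored = saltedForWrite("user42");
        List<String> tries = candidatesForGet("user42");
        System.out.println("stored as " + stored + ", reader must try " + tries.size() + " keys");
    }
}
```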

You are better off hashing the record's key, taking the first couple of bytes of the hash as a prefix, and then writing the record out.
When you want the record, take the key, hash it using the same process, and you have 1 get().
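
A minimal sketch of that hash-prefix approach, using only the JDK: the prefix is the first two bytes of the MD5 digest of the key. The class name, the two-byte prefix length, and the key layout are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: prefix a key with the first bytes of its own MD5 digest.
class HashPrefixSketch {
    static final int PREFIX_BYTES = 2; // assumed prefix length

    static byte[] prefixed(byte[] key) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5").digest(key);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
        byte[] out = new byte[PREFIX_BYTES + key.length];
        System.arraycopy(digest, 0, out, 0, PREFIX_BYTES);
        System.arraycopy(key, 0, out, PREFIX_BYTES, key.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] key = "user42".getBytes(StandardCharsets.UTF_8);
        byte[] stored = prefixed(key);
        // The reader recomputes the same bytes from the key alone: one get().
        System.out.println(java.util.Arrays.equals(stored, prefixed(key)));
    }
}
```

The writer and the reader run the same function, so no bucket number has to be remembered or guessed. The trade-off named in the message still applies: adjacent logical keys land in unrelated places, so range scans over the original key order are lost.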

You're still screwed up on doing a range scan, but you can't have everything.

THIS IS WHY I AND MANY CARDIOLOGISTS SAY NO TO SALT. The only difference is that they are talking about excess sodium chloride in your diet. I'm talking about using a salt aka 'random seed'.

Does that make sense? 

On Apr 30, 2013, at 11:17 AM, Shahab Yunus <sh...@gmail.com> wrote:

> Well those are *some* words :) Anyway, can you explain in a bit more detail
> why you feel so strongly about this design/approach? Salting is not the only
> option mentioned here; static hashing can be used as well. Plus, even in the
> case of salting, wouldn't the distributed scan take care of it? The downside
> that I see is the bucket_number that we have to maintain at both read and
> write time, and update in case of cluster restructuring.
> 
> Thanks,
> Shahab

Re: Read access pattern

Posted by Shahab Yunus <sh...@gmail.com>.

Well those are *some* words :) Anyway, can you explain in a bit more detail
why you feel so strongly about this design/approach? Salting is not the only
option mentioned here; static hashing can be used as well. Plus, even in the
case of salting, wouldn't the distributed scan take care of it? The downside
that I see is the bucket_number that we have to maintain at both read and
write time, and update in case of cluster restructuring.

Thanks,
Shahab


On Tue, Apr 30, 2013 at 11:57 AM, Michael Segel <mi...@hotmail.com> wrote:

> Geez that's a bad article.
> Never salt.
>
> And yes there's a difference between using a salt and using the first 2-4
> bytes from your MD5 hash.
>
> (Hint: Salts are random. Your hash isn't. )
>
> Sorry to be-itch but it's a bad idea and it shouldn't be propagated.
>
> On Apr 29, 2013, at 10:17 AM, Shahab Yunus <sh...@gmail.com> wrote:
>
> > I think you cannot use the scanner simply to do a range scan here, as your
> > keys are not monotonically increasing. You need to apply logic to
> > decode/reverse the mechanism that you used to hash your keys at the
> > time of writing. You might want to check out the SemaText library, which
> > does distributed scans and seems to handle the scenarios that you want to
> > implement.
> >
> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
> >
> >
> > On Mon, Apr 29, 2013 at 11:03 AM, <ri...@laposte.net> wrote:
> >
> >> Hi,
> >>
> >> I have a rowkey defined by :
> >>        getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> >> (Long.MAX_VALUE - changeDate.getTime()));
> >>
> >> How could I get the previous and next row for a given rowkey ?
> >> For instance, I have the following ordered keys :
> >>
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
> >>
> >> If I choose the rowkey :
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
> >> correct scan to get the previous and next key ?
> >> Result would be :
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> >>
> >> Thank you !
> >> R.
> >>
>
>

Re: Read access pattern

Posted by Michael Segel <mi...@hotmail.com>.

Geez that's a bad article. 
Never salt. 

And yes there's a difference between using a salt and using the first 2-4 bytes from your MD5 hash. 

(Hint: Salts are random. Your hash isn't. )

Sorry to be-itch but it's a bad idea and it shouldn't be propagated.
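
To illustrate the distinction drawn above, here is a minimal sketch, not taken from the thread or from any HBase API. The class name, the bucket count of 16, and the "NN-key" layout are all hypothetical. It shows that a prefix derived from a hash can be recomputed by the reader, whereas a random salt forces the reader to try every bucket.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

public class PrefixSketch {
    // Hypothetical number of buckets; not a value from the thread.
    static final int BUCKETS = 16;

    // Deterministic prefix: the bucket is derived from the key itself
    // (first MD5 byte modulo BUCKETS), so a reader can recompute it.
    static String hashPrefixed(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            int bucket = (digest[0] & 0xff) % BUCKETS;
            return String.format("%02d-%s", bucket, key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Random salt: the writer picked an arbitrary bucket, so a reader
    // looking for one key has to consider every possible prefix.
    static List<String> saltedCandidates(String key) {
        List<String> candidates = new ArrayList<>();
        for (int bucket = 0; bucket < BUCKETS; bucket++) {
            candidates.add(String.format("%02d-%s", bucket, key));
        }
        return candidates;
    }

    public static void main(String[] args) {
        // One lookup with a hash prefix, BUCKETS lookups with a salt.
        System.out.println(hashPrefixed("row-1"));
        System.out.println(saltedCandidates("row-1").size());
    }
}
```

With the hash prefix a single get is enough to fetch a known key; with the salt the same read fans out to one lookup per bucket.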

On Apr 29, 2013, at 10:17 AM, Shahab Yunus <sh...@gmail.com> wrote:

> I think you cannot simply use the scanner to do a range scan here, as your
> keys are not monotonically increasing. You need to apply logic to
> decode/reverse the mechanism that you used to hash your keys at the
> time of writing. You might want to check out the SemaText library, which
> does distributed scans and seems to handle the scenarios that you want to
> implement.
> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
> 
> 
> On Mon, Apr 29, 2013 at 11:03 AM, <ri...@laposte.net> wrote:
> 
>> Hi,
>> 
>> I have a rowkey defined by :
>>        getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
>> (Long.MAX_VALUE - changeDate.getTime()));
>> 
>> How could I get the previous and next row for a given rowkey ?
>> For instance, I have the following ordered keys :
>> 
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
>>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
>> 
>> If I choose the rowkey :
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
>> correct scan to get the previous and next key ?
>> Result would be :
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
>> 
>> Thank you !
>> R.
>> 

Re: Read access pattern

Posted by Shahab Yunus <sh...@gmail.com>.

I think you cannot simply use the scanner to do a range scan here, as your
keys are not monotonically increasing. You need to apply logic to
decode/reverse the mechanism that you used to hash your keys at the
time of writing. You might want to check out the SemaText library, which
does distributed scans and seems to handle the scenarios that you want to
implement.
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
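
For reference, a rough sketch of how the key layout from the question sorts. It uses a java.util.TreeSet as a stand-in for the table's sorted row order, so no HBase API is involved; the class and method names are hypothetical, and the "*" separator and zero-padded "%019d" format are assumptions based on the sample keys. Keys of different objects are scattered by the MD5 prefix, but within one prefix the rows are ordered by the reversed timestamp, so the neighbours of a key are its immediate predecessor and successor sharing that prefix.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.TreeSet;

public class RowKeyNeighbours {
    // Hex-encoded MD5 of the object id (stand-in for getMD5AsHex in the question).
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(s.getBytes(StandardCharsets.UTF_8))) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // MD5 prefix + "*" + reversed timestamp, zero-padded to 19 digits (assumed layout).
    static String rowKey(String objectId, long changeMillis) {
        return md5Hex(objectId) + "*" + String.format("%019d", Long.MAX_VALUE - changeMillis);
    }

    // Keep a neighbour only if it shares the 32-character MD5 prefix of the key.
    static String sameObjectOrNull(String candidate, String key) {
        if (candidate == null) {
            return null;
        }
        return candidate.regionMatches(0, key, 0, 32) ? candidate : null;
    }

    // Row sorted immediately before the key (a more recent change of the same object).
    static String previous(TreeSet<String> rows, String key) {
        return sameObjectOrNull(rows.lower(key), key);
    }

    // Row sorted immediately after the key (an older change of the same object).
    static String next(TreeSet<String> rows, String key) {
        return sameObjectOrNull(rows.higher(key), key);
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>();
        for (long t : new long[] {1000L, 2000L, 3000L}) {
            rows.add(rowKey("obj-1", t));
        }
        rows.add(rowKey("obj-2", 2000L));
        String current = rowKey("obj-1", 2000L);
        System.out.println(previous(rows, current)); // key of the change at t=3000
        System.out.println(next(rows, current));     // key of the change at t=1000
    }
}
```

On a real table the "next" lookup corresponds to a forward scan starting at the current key, while the "previous" lookup needs either a reverse scan or a second ordering of the data, as discussed elsewhere in this thread.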

On Mon, Apr 29, 2013 at 11:03 AM, <ri...@laposte.net> wrote:

> Hi,
>
> I have a rowkey defined by :
>         getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
> (Long.MAX_VALUE - changeDate.getTime()));
>
> How could I get the previous and next row for a given rowkey ?
> For instance, I have the following ordered keys :
>
> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> >00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
>
> If I choose the rowkey :
> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
> correct scan to get the previous and next key ?
> Result would be :
> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
>
> Thank you !
> R.
>