You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@hbase.apache.org by yu...@gmail.com on 2012/02/21 14:26:42 UTC

LIRS cache as an alternative to LRU cache

Hi,
Shall we experiment with low inter-reference recency set replacement policy to see if block cache becomes more effective ?

Cheers

Re: LIRS cache as an alternative to LRU cache

Posted by Li Pi <li...@idle.li>.

I thought about this over the summer when I was working on 4027.

Pretty much same idea as Nicholas here. I figured LIRs might be troublesome
to implement - and also thought that newer features, such as 4027 or the
reference counting patch, was a better use of time.

The larger the cache gets, the less the eviction algorithm matters. And we
seem to be trending towards the possibility of larger caches.

~Li

On Tue, Feb 21, 2012 at 9:01 AM, Nicolas Spiegelberg <ns...@fb.com>wrote:

> We had the author of LIRS come to Facebook last year to talk about his
> algorithm and general benefits.  At the time, we were looking at
> increasing block cache efficiency.  The general consensus was that it
> wasn't an exponential perf gain, so we could get bigger wins from
> cache-on-write intelligence, in-memory data compression techniques, and
> adding stats so we could understand how to tune the existing LRU
> algorithm.  I still think that these 3 goals are more important at the
> moment because LIRS would be a decent bit of code and only incremental
> gain.  It's probably something to revisit in a year or two.
>
> Nicolas
>
> On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com> wrote:
>
> >Hi,
> >Shall we experiment with low inter-reference recency set replacement
> >policy to see if block cache becomes more effective ?
> >
> >Cheers
> >
> >
>
>

Re: LIRS cache as an alternative to LRU cache

Posted by Ted Yu <yu...@gmail.com>.

That's correct, Nicolas.

We can make BlockCache pluggable when we find the next candidate which
exhibits definite benefit over current implementation.

Cheers

On Tue, Feb 21, 2012 at 10:53 AM, Nicolas Spiegelberg
<ns...@fb.com>wrote:

> In general, I agree about making isolated algorithms pluggable.  In this
> particular case, I think that Ted was trying to gather consensus on ways
> to increase cache efficiency with LIRS being the strawman.  I think it's
> good to bring up these discussions because it's really easy to add 10k
> lines to a system and really hard to figure out the correct next step to
> maximally evolve the system.
>
> On 2/21/12 1:22 PM, "Dhruba Borthakur" <dh...@gmail.com> wrote:
>
> >I think we should make the BlockCache pluggable for HBase. A simple
> >reflection-based enhancement to CacheConfig.instantiateBlockCache should
> >do
> >the trick, is it not? If people think that this is valuable, I can submit
> >a
> >patch.
> >
> >This will enable people to play with their own versions of the BlockCache
> >without making it part of core-HBase code.
> >
> >thanks,
> >dhruba
> >
> >On Tue, Feb 21, 2012 at 10:03 AM, Jean-Daniel Cryans
> ><jd...@apache.org>wrote:
> >
> >> If it was ARC (which uses both LRU and LFU) we'd have patenting issues
> >> with IBM, what we have is closer to a 2Q:
> >> http://www.vldb.org/conf/1994/P439.PDF
> >>
> >> J-D
> >>
> >> On Tue, Feb 21, 2012 at 9:19 AM, Nicolas Spiegelberg
> >> <ns...@fb.com> wrote:
> >> > Vlad,
> >> >
> >> > You're correct.  The existing algorithm is called an Adaptive
> >>Replacement
> >> > Cache.  However, note that this cache does need some proper tuning for
> >> > optimal efficiency.
> >> >
> >> > Nicolas
> >> >
> >> > On 2/21/12 12:09 PM, "Vladimir Rodionov" <vr...@carrieriq.com>
> >> wrote:
> >> >
> >> >>afaik, existing LruBlockCache is not exactly LRU cache
> >> >>It utilizes more advanced algorithm to avoid cache trashing during
> >>scan
> >> >>ops by
> >> >>dividing cache into three sub-caches (for newly added blocks, for
> >> >>promoted blocks and for in-memory blocks)
> >> >>
> >> >>
> >> >>Best regards,
> >> >>Vladimir Rodionov
> >> >>Principal Platform Engineer
> >> >>Carrier IQ, www.carrieriq.com
> >> >>e-mail: vrodionov@carrieriq.com
> >> >>
> >> >>________________________________________
> >> >>From: Nicolas Spiegelberg [nspiegelberg@fb.com]
> >> >>Sent: Tuesday, February 21, 2012 9:01 AM
> >> >>To: dev@hbase.apache.org
> >> >>Subject: Re: LIRS cache as an alternative to LRU cache
> >> >>
> >> >>We had the author of LIRS come to Facebook last year to talk about his
> >> >>algorithm and general benefits.  At the time, we were looking at
> >> >>increasing block cache efficiency.  The general consensus was that it
> >> >>wasn't an exponential perf gain, so we could get bigger wins from
> >> >>cache-on-write intelligence, in-memory data compression techniques,
> >>and
> >> >>adding stats so we could understand how to tune the existing LRU
> >> >>algorithm.  I still think that these 3 goals are more important at the
> >> >>moment because LIRS would be a decent bit of code and only incremental
> >> >>gain.  It's probably something to revisit in a year or two.
> >> >>
> >> >>Nicolas
> >> >>
> >> >>On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com>
> wrote:
> >> >>
> >> >>>Hi,
> >> >>>Shall we experiment with low inter-reference recency set replacement
> >> >>>policy to see if block cache becomes more effective ?
> >> >>>
> >> >>>Cheers
> >> >>>
> >> >>>
> >> >>
> >> >>
> >> >>Confidentiality Notice:  The information contained in this message,
> >> >>including any attachments hereto, may be confidential and is intended
> >>to
> >> >>be read only by the individual or entity to whom this message is
> >> >>addressed. If the reader of this message is not the intended
> >>recipient or
> >> >>an agent or designee of the intended recipient, please note that any
> >> >>review, use, disclosure or distribution of this message or its
> >> >>attachments, in any form, is strictly prohibited.  If you have
> >>received
> >> >>this message in error, please immediately notify the sender and/or
> >> >>Notifications@carrieriq.com and delete or destroy any copy of this
> >> >>message and its attachments.
> >> >
> >>
> >
> >
> >
> >--
> >Subscribe to my posts at http://www.facebook.com/dhruba
>
>

Re: LIRS cache as an alternative to LRU cache

Posted by Nicolas Spiegelberg <ns...@fb.com>.

In general, I agree about making isolated algorithms pluggable.  In this
particular case, I think that Ted was trying to gather consensus on ways
to increase cache efficiency with LIRS being the strawman.  I think it's
good to bring up these discussions because it's really easy to add 10k
lines to a system and really hard to figure out the correct next step to
maximally evolve the system.

On 2/21/12 1:22 PM, "Dhruba Borthakur" <dh...@gmail.com> wrote:

>I think we should make the BlockCache pluggable for HBase. A simple
>reflection-based enhancement to CacheConfig.instantiateBlockCache should
>do
>the trick, is it not? If people think that this is valuable, I can submit
>a
>patch.
>
>This will enable people to play with their own versions of the BlockCache
>without making it part of core-HBase code.
>
>thanks,
>dhruba
>
>On Tue, Feb 21, 2012 at 10:03 AM, Jean-Daniel Cryans
><jd...@apache.org>wrote:
>
>> If it was ARC (which uses both LRU and LFU) we'd have patenting issues
>> with IBM, what we have is closer to a 2Q:
>> http://www.vldb.org/conf/1994/P439.PDF
>>
>> J-D
>>
>> On Tue, Feb 21, 2012 at 9:19 AM, Nicolas Spiegelberg
>> <ns...@fb.com> wrote:
>> > Vlad,
>> >
>> > You're correct.  The existing algorithm is called an Adaptive
>>Replacement
>> > Cache.  However, note that this cache does need some proper tuning for
>> > optimal efficiency.
>> >
>> > Nicolas
>> >
>> > On 2/21/12 12:09 PM, "Vladimir Rodionov" <vr...@carrieriq.com>
>> wrote:
>> >
>> >>afaik, existing LruBlockCache is not exactly LRU cache
>> >>It utilizes more advanced algorithm to avoid cache trashing during
>>scan
>> >>ops by
>> >>dividing cache into three sub-caches (for newly added blocks, for
>> >>promoted blocks and for in-memory blocks)
>> >>
>> >>
>> >>Best regards,
>> >>Vladimir Rodionov
>> >>Principal Platform Engineer
>> >>Carrier IQ, www.carrieriq.com
>> >>e-mail: vrodionov@carrieriq.com
>> >>
>> >>________________________________________
>> >>From: Nicolas Spiegelberg [nspiegelberg@fb.com]
>> >>Sent: Tuesday, February 21, 2012 9:01 AM
>> >>To: dev@hbase.apache.org
>> >>Subject: Re: LIRS cache as an alternative to LRU cache
>> >>
>> >>We had the author of LIRS come to Facebook last year to talk about his
>> >>algorithm and general benefits.  At the time, we were looking at
>> >>increasing block cache efficiency.  The general consensus was that it
>> >>wasn't an exponential perf gain, so we could get bigger wins from
>> >>cache-on-write intelligence, in-memory data compression techniques,
>>and
>> >>adding stats so we could understand how to tune the existing LRU
>> >>algorithm.  I still think that these 3 goals are more important at the
>> >>moment because LIRS would be a decent bit of code and only incremental
>> >>gain.  It's probably something to revisit in a year or two.
>> >>
>> >>Nicolas
>> >>
>> >>On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com> wrote:
>> >>
>> >>>Hi,
>> >>>Shall we experiment with low inter-reference recency set replacement
>> >>>policy to see if block cache becomes more effective ?
>> >>>
>> >>>Cheers
>> >>>
>> >>>
>> >>
>> >>
>> >>Confidentiality Notice:  The information contained in this message,
>> >>including any attachments hereto, may be confidential and is intended
>>to
>> >>be read only by the individual or entity to whom this message is
>> >>addressed. If the reader of this message is not the intended
>>recipient or
>> >>an agent or designee of the intended recipient, please note that any
>> >>review, use, disclosure or distribution of this message or its
>> >>attachments, in any form, is strictly prohibited.  If you have
>>received
>> >>this message in error, please immediately notify the sender and/or
>> >>Notifications@carrieriq.com and delete or destroy any copy of this
>> >>message and its attachments.
>> >
>>
>
>
>
>-- 
>Subscribe to my posts at http://www.facebook.com/dhruba

Re: LIRS cache as an alternative to LRU cache

Posted by Dhruba Borthakur <dh...@gmail.com>.

I think we should make the BlockCache pluggable for HBase. A simple
reflection-based enhancement to CacheConfig.instantiateBlockCache should do
the trick, is it not? If people think that this is valuable, I can submit a
patch.

This will enable people to play with their own versions of the BlockCache
without making it part of core-HBase code.

thanks,
dhruba

On Tue, Feb 21, 2012 at 10:03 AM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> If it was ARC (which uses both LRU and LFU) we'd have patenting issues
> with IBM, what we have is closer to a 2Q:
> http://www.vldb.org/conf/1994/P439.PDF
>
> J-D
>
> On Tue, Feb 21, 2012 at 9:19 AM, Nicolas Spiegelberg
> <ns...@fb.com> wrote:
> > Vlad,
> >
> > You're correct.  The existing algorithm is called an Adaptive Replacement
> > Cache.  However, note that this cache does need some proper tuning for
> > optimal efficiency.
> >
> > Nicolas
> >
> > On 2/21/12 12:09 PM, "Vladimir Rodionov" <vr...@carrieriq.com>
> wrote:
> >
> >>afaik, existing LruBlockCache is not exactly LRU cache
> >>It utilizes more advanced algorithm to avoid cache trashing during scan
> >>ops by
> >>dividing cache into three sub-caches (for newly added blocks, for
> >>promoted blocks and for in-memory blocks)
> >>
> >>
> >>Best regards,
> >>Vladimir Rodionov
> >>Principal Platform Engineer
> >>Carrier IQ, www.carrieriq.com
> >>e-mail: vrodionov@carrieriq.com
> >>
> >>________________________________________
> >>From: Nicolas Spiegelberg [nspiegelberg@fb.com]
> >>Sent: Tuesday, February 21, 2012 9:01 AM
> >>To: dev@hbase.apache.org
> >>Subject: Re: LIRS cache as an alternative to LRU cache
> >>
> >>We had the author of LIRS come to Facebook last year to talk about his
> >>algorithm and general benefits.  At the time, we were looking at
> >>increasing block cache efficiency.  The general consensus was that it
> >>wasn't an exponential perf gain, so we could get bigger wins from
> >>cache-on-write intelligence, in-memory data compression techniques, and
> >>adding stats so we could understand how to tune the existing LRU
> >>algorithm.  I still think that these 3 goals are more important at the
> >>moment because LIRS would be a decent bit of code and only incremental
> >>gain.  It's probably something to revisit in a year or two.
> >>
> >>Nicolas
> >>
> >>On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com> wrote:
> >>
> >>>Hi,
> >>>Shall we experiment with low inter-reference recency set replacement
> >>>policy to see if block cache becomes more effective ?
> >>>
> >>>Cheers
> >>>
> >>>
> >>
> >>
> >>Confidentiality Notice:  The information contained in this message,
> >>including any attachments hereto, may be confidential and is intended to
> >>be read only by the individual or entity to whom this message is
> >>addressed. If the reader of this message is not the intended recipient or
> >>an agent or designee of the intended recipient, please note that any
> >>review, use, disclosure or distribution of this message or its
> >>attachments, in any form, is strictly prohibited.  If you have received
> >>this message in error, please immediately notify the sender and/or
> >>Notifications@carrieriq.com and delete or destroy any copy of this
> >>message and its attachments.
> >
>



-- 
Subscribe to my posts at http://www.facebook.com/dhruba

Re: LIRS cache as an alternative to LRU cache

Posted by Jean-Daniel Cryans <jd...@apache.org>.

If it was ARC (which uses both LRU and LFU) we'd have patenting issues
with IBM, what we have is closer to a 2Q:
http://www.vldb.org/conf/1994/P439.PDF

J-D

On Tue, Feb 21, 2012 at 9:19 AM, Nicolas Spiegelberg
<ns...@fb.com> wrote:
> Vlad,
>
> You're correct.  The existing algorithm is called an Adaptive Replacement
> Cache.  However, note that this cache does need some proper tuning for
> optimal efficiency.
>
> Nicolas
>
> On 2/21/12 12:09 PM, "Vladimir Rodionov" <vr...@carrieriq.com> wrote:
>
>>afaik, existing LruBlockCache is not exactly LRU cache
>>It utilizes more advanced algorithm to avoid cache trashing during scan
>>ops by
>>dividing cache into three sub-caches (for newly added blocks, for
>>promoted blocks and for in-memory blocks)
>>
>>
>>Best regards,
>>Vladimir Rodionov
>>Principal Platform Engineer
>>Carrier IQ, www.carrieriq.com
>>e-mail: vrodionov@carrieriq.com
>>
>>________________________________________
>>From: Nicolas Spiegelberg [nspiegelberg@fb.com]
>>Sent: Tuesday, February 21, 2012 9:01 AM
>>To: dev@hbase.apache.org
>>Subject: Re: LIRS cache as an alternative to LRU cache
>>
>>We had the author of LIRS come to Facebook last year to talk about his
>>algorithm and general benefits.  At the time, we were looking at
>>increasing block cache efficiency.  The general consensus was that it
>>wasn't an exponential perf gain, so we could get bigger wins from
>>cache-on-write intelligence, in-memory data compression techniques, and
>>adding stats so we could understand how to tune the existing LRU
>>algorithm.  I still think that these 3 goals are more important at the
>>moment because LIRS would be a decent bit of code and only incremental
>>gain.  It's probably something to revisit in a year or two.
>>
>>Nicolas
>>
>>On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com> wrote:
>>
>>>Hi,
>>>Shall we experiment with low inter-reference recency set replacement
>>>policy to see if block cache becomes more effective ?
>>>
>>>Cheers
>>>
>>>
>>
>>
>>Confidentiality Notice:  The information contained in this message,
>>including any attachments hereto, may be confidential and is intended to
>>be read only by the individual or entity to whom this message is
>>addressed. If the reader of this message is not the intended recipient or
>>an agent or designee of the intended recipient, please note that any
>>review, use, disclosure or distribution of this message or its
>>attachments, in any form, is strictly prohibited.  If you have received
>>this message in error, please immediately notify the sender and/or
>>Notifications@carrieriq.com and delete or destroy any copy of this
>>message and its attachments.
>

Re: LIRS cache as an alternative to LRU cache

Posted by Nicolas Spiegelberg <ns...@fb.com>.

Vlad,

You're correct.  The existing algorithm is called an Adaptive Replacement
Cache.  However, note that this cache does need some proper tuning for
optimal efficiency.

Nicolas

On 2/21/12 12:09 PM, "Vladimir Rodionov" <vr...@carrieriq.com> wrote:

>afaik, existing LruBlockCache is not exactly LRU cache
>It utilizes more advanced algorithm to avoid cache trashing during scan
>ops by
>dividing cache into three sub-caches (for newly added blocks, for
>promoted blocks and for in-memory blocks)
>
>
>Best regards,
>Vladimir Rodionov
>Principal Platform Engineer
>Carrier IQ, www.carrieriq.com
>e-mail: vrodionov@carrieriq.com
>
>________________________________________
>From: Nicolas Spiegelberg [nspiegelberg@fb.com]
>Sent: Tuesday, February 21, 2012 9:01 AM
>To: dev@hbase.apache.org
>Subject: Re: LIRS cache as an alternative to LRU cache
>
>We had the author of LIRS come to Facebook last year to talk about his
>algorithm and general benefits.  At the time, we were looking at
>increasing block cache efficiency.  The general consensus was that it
>wasn't an exponential perf gain, so we could get bigger wins from
>cache-on-write intelligence, in-memory data compression techniques, and
>adding stats so we could understand how to tune the existing LRU
>algorithm.  I still think that these 3 goals are more important at the
>moment because LIRS would be a decent bit of code and only incremental
>gain.  It's probably something to revisit in a year or two.
>
>Nicolas
>
>On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com> wrote:
>
>>Hi,
>>Shall we experiment with low inter-reference recency set replacement
>>policy to see if block cache becomes more effective ?
>>
>>Cheers
>>
>>
>
>
>Confidentiality Notice:  The information contained in this message,
>including any attachments hereto, may be confidential and is intended to
>be read only by the individual or entity to whom this message is
>addressed. If the reader of this message is not the intended recipient or
>an agent or designee of the intended recipient, please note that any
>review, use, disclosure or distribution of this message or its
>attachments, in any form, is strictly prohibited.  If you have received
>this message in error, please immediately notify the sender and/or
>Notifications@carrieriq.com and delete or destroy any copy of this
>message and its attachments.

RE: LIRS cache as an alternative to LRU cache

Posted by Vladimir Rodionov <vr...@carrieriq.com>.

afaik, existing LruBlockCache is not exactly LRU cache
It utilizes more advanced algorithm to avoid cache trashing during scan ops by
dividing cache into three sub-caches (for newly added blocks, for promoted blocks and for in-memory blocks)

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Nicolas Spiegelberg [nspiegelberg@fb.com]
Sent: Tuesday, February 21, 2012 9:01 AM
To: dev@hbase.apache.org
Subject: Re: LIRS cache as an alternative to LRU cache

We had the author of LIRS come to Facebook last year to talk about his
algorithm and general benefits.  At the time, we were looking at
increasing block cache efficiency.  The general consensus was that it
wasn't an exponential perf gain, so we could get bigger wins from
cache-on-write intelligence, in-memory data compression techniques, and
adding stats so we could understand how to tune the existing LRU
algorithm.  I still think that these 3 goals are more important at the
moment because LIRS would be a decent bit of code and only incremental
gain.  It's probably something to revisit in a year or two.

Nicolas

On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com> wrote:

>Hi,
>Shall we experiment with low inter-reference recency set replacement
>policy to see if block cache becomes more effective ?
>
>Cheers
>
>

Confidentiality Notice:  The information contained in this message, including any attachments hereto, may be confidential and is intended to be read only by the individual or entity to whom this message is addressed. If the reader of this message is not the intended recipient or an agent or designee of the intended recipient, please note that any review, use, disclosure or distribution of this message or its attachments, in any form, is strictly prohibited.  If you have received this message in error, please immediately notify the sender and/or Notifications@carrieriq.com and delete or destroy any copy of this message and its attachments.

Re: LIRS cache as an alternative to LRU cache

Posted by Nicolas Spiegelberg <ns...@fb.com>.

We had the author of LIRS come to Facebook last year to talk about his
algorithm and general benefits.  At the time, we were looking at
increasing block cache efficiency.  The general consensus was that it
wasn't an exponential perf gain, so we could get bigger wins from
cache-on-write intelligence, in-memory data compression techniques, and
adding stats so we could understand how to tune the existing LRU
algorithm.  I still think that these 3 goals are more important at the
moment because LIRS would be a decent bit of code and only incremental
gain.  It's probably something to revisit in a year or two.

Nicolas

On 2/21/12 8:26 AM, "yuzhihong@gmail.com" <yu...@gmail.com> wrote:

>Hi,
>Shall we experiment with low inter-reference recency set replacement
>policy to see if block cache becomes more effective ?
>
>Cheers
>
>