Posted to dev@carbondata.apache.org by Mohammad shahid khan <mo...@gmail.com> on 2016/10/26 12:36:20 UTC

B-Tree LRU cache (New Feature)

Hi All,
Please find the problem and proposed solution.

*B-Tree LRU Cache:*

*Problem:*

CarbonData maintains two levels of B-Tree cache, one at the driver level
and another at the executor level. Currently CarbonData has a mechanism to
invalidate the segment and block caches for invalid table segments, but
there is no eviction policy for unused cached objects. So once the
available memory is fully utilized, the system will not be able to process
any new requests.

*Solution:*

The caches maintained at the driver level and at the executors are likely
to hold objects that are not currently in use. Therefore the system should
provide the following mechanism.

1.       Set a maximum memory limit up to which objects can be held in
memory.

2.       When the configured memory limit is reached, identify the cached
objects that are not currently in use so that the required memory can be
freed without impacting running processes.

3.       Eviction should continue only until the required memory has been
freed.
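
The three points above can be sketched roughly as follows. This is only an
illustration under assumptions, not CarbonData's implementation: the class
name MemoryBoundedLruCache, its methods, and the idea of tracking sizes in
bytes per key are all hypothetical. Entries are kept in access order,
entries with a non-zero access count are never evicted, and eviction stops
as soon as enough memory has been freed.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a size-bounded LRU cache; not CarbonData's actual API. */
final class MemoryBoundedLruCache {
  /** Bookkeeping for one cached object: its footprint and its current users. */
  private static final class CacheEntry {
    final long sizeInBytes;
    int accessCount;
    CacheEntry(long sizeInBytes) { this.sizeInBytes = sizeInBytes; }
  }

  private final long maxSizeInBytes;   // point 1: configured memory limit
  private long currentSizeInBytes;
  // accessOrder=true iterates from least recently used to most recently used.
  private final LinkedHashMap<String, CacheEntry> entries =
      new LinkedHashMap<>(16, 0.75f, true);

  MemoryBoundedLruCache(long maxSizeInBytes) {
    this.maxSizeInBytes = maxSizeInBytes;
  }

  /** Adds an entry, evicting unused LRU entries if needed; false if it cannot fit. */
  synchronized boolean put(String key, long sizeInBytes) {
    if (entries.containsKey(key)) return true;
    long required = currentSizeInBytes + sizeInBytes - maxSizeInBytes;
    if (required > 0 && !evict(required)) return false;
    entries.put(key, new CacheEntry(sizeInBytes));
    currentSizeInBytes += sizeInBytes;
    return true;
  }

  /** Marks the entry as in use; false if the key is not cached. */
  synchronized boolean acquire(String key) {
    CacheEntry e = entries.get(key);   // get() also marks the key most recently used
    if (e == null) return false;
    e.accessCount++;
    return true;
  }

  /** Marks one use of the entry as finished. */
  synchronized void release(String key) {
    CacheEntry e = entries.get(key);
    if (e != null && e.accessCount > 0) e.accessCount--;
  }

  synchronized boolean contains(String key) { return entries.containsKey(key); }

  synchronized long currentSizeInBytes() { return currentSizeInBytes; }

  /** Points 2 and 3: evict only unused entries, and only until enough is freed. */
  private boolean evict(long requiredBytes) {
    long evictable = 0;
    for (CacheEntry e : entries.values()) {
      if (e.accessCount == 0) evictable += e.sizeInBytes;
    }
    if (evictable < requiredBytes) return false;   // evict nothing if it would not help

    long freed = 0;
    Iterator<Map.Entry<String, CacheEntry>> it = entries.entrySet().iterator();
    while (freed < requiredBytes && it.hasNext()) {
      CacheEntry e = it.next().getValue();
      if (e.accessCount == 0) {
        freed += e.sizeInBytes;
        it.remove();
      }
    }
    currentSizeInBytes -= freed;
    return true;
  }
}
```

In this sketch a put() that cannot be satisfied returns false instead of
evicting objects that are in use, which matches the requirement that
eviction must not impact running processes.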

For details please refer to attachments.


Regards.

Shahid

Re: B-Tree LRU cache (New Feature)

Posted by Venkata Gollamudi <g....@gmail.com>.
Hi Shahid,

This solution, an LRU cache for the B-Tree, is required to avoid running
out of memory when the store holds a large number of tables and not all of
them are frequently used.

Please raise an issue to track this feature.

Regards,
Ramana

On Wed, Nov 23, 2016 at 6:30 PM, mohdshahidkhan <
mohdshahidkhan1987@gmail.com> wrote:

> Please find the design document for the B-Tree LRU cache:
> https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/B-Tree-LRU-cache-New-Feature-tp2366p3130.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive at Nabble.com.
>

Re: B-Tree LRU cache (New Feature)

Posted by mohdshahidkhan <mo...@gmail.com>.
Hi Sujith,
I agree with you that after compaction there is no use in keeping the
segment and block caches. We should have a mechanism to invalidate the
compacted segments' cache at the driver and the block-level cache at the
executors.




Re: B-Tree LRU cache (New Feature)

Posted by jarray888 <ja...@163.com>.
+1




Re: B-Tree LRU cache (New Feature)

Posted by Venkata Gollamudi <g....@gmail.com>.
Hi Shahid,
Introduce a CacheClient that owns the increment and decrement of the
access count as objects are taken into use and released. Otherwise access
count handling becomes complicated as we add more features to the system.
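
One possible reading of this suggestion is sketched below. Only the name
CacheClient comes from this message; RefCountedCache and all method names
are hypothetical and are not taken from the CarbonData code base. The
client remembers every key it acquired and releases them all in close(),
so individual features never touch the counters directly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache that tracks how many users each key currently has. */
final class RefCountedCache {
  private final Map<String, Integer> accessCounts = new HashMap<>();

  synchronized void increment(String key) {
    accessCounts.merge(key, 1, Integer::sum);
  }

  synchronized void decrement(String key) {
    // Returning null removes the counter once the last user is gone.
    accessCounts.computeIfPresent(key, (k, count) -> count > 1 ? count - 1 : null);
  }

  synchronized int accessCount(String key) {
    return accessCounts.getOrDefault(key, 0);
  }
}

/** Sketch of a CacheClient owning the access-count bookkeeping for one query or task. */
final class CacheClient implements AutoCloseable {
  private final RefCountedCache cache;
  private final List<String> acquiredKeys = new ArrayList<>();

  CacheClient(RefCountedCache cache) {
    this.cache = cache;
  }

  /** Marks the key as in use and remembers it so close() can release it. */
  void acquire(String key) {
    cache.increment(key);
    acquiredKeys.add(key);
  }

  /** Releases every key this client acquired, exactly once. */
  @Override
  public void close() {
    for (String key : acquiredKeys) {
      cache.decrement(key);
    }
    acquiredKeys.clear();
  }
}
```

Used with try-with-resources, for example
try (CacheClient client = new CacheClient(cache)) { client.acquire(key); },
the counts are decremented even if the query fails midway.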
Regards,
Ramana

On Sun, Dec 4, 2016, 10:31 PM manish gupta <to...@gmail.com>
wrote:

> Hi Sujith,
>
> I agree with your point. In the query model we can always send the
> executors a list of invalid segments that need to be cleared from the
> cache. But there are a few cases where clearing the B-tree cache cannot
> be ensured, such as:
> 1. A table is dropped.
> 2. The clean table DML command is executed.
>
> In these cases we cannot ensure that invalid objects are cleared from the
> cache on all the executors; only removal from the driver can be ensured.
> To handle these cases, each executor should have its own mechanism to
> identify invalid segment/block/dictionary cache entries.
>
> Regards
> Manish Gupta

Re: B-Tree LRU cache (New Feature)

Posted by manish gupta <to...@gmail.com>.
Hi Sujith,

I agree with your point. In the query model we can always send the
executors a list of invalid segments that need to be cleared from the
cache. But there are a few cases where clearing the B-tree cache cannot be
ensured, such as:
1. A table is dropped.
2. The clean table DML command is executed.

In these cases we cannot ensure that invalid objects are cleared from the
cache on all the executors; only removal from the driver can be ensured.
To handle these cases, each executor should have its own mechanism to
identify invalid segment/block/dictionary cache entries.
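
The first part of this, clearing a list of invalid segments that arrives
with the query, might look like the following on the executor side. It is a
sketch under assumptions: ExecutorBlockCache, invalidateSegments, and the
"table/segmentId/blockName" key layout are hypothetical names chosen for
illustration. It does not address the dropped-table and clean cases listed
above, where no query may ever reach the executor.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical executor-side block cache keyed by "table/segmentId/blockName". */
final class ExecutorBlockCache {
  private final Map<String, Object> blocks = new ConcurrentHashMap<>();

  void put(String table, String segmentId, String blockName, Object btree) {
    blocks.put(table + "/" + segmentId + "/" + blockName, btree);
  }

  /** Removes all cached blocks of the given segments; returns how many were removed. */
  int invalidateSegments(String table, List<String> invalidSegmentIds) {
    int removed = 0;
    for (String segmentId : invalidSegmentIds) {
      // The trailing "/" keeps segment "1" from matching segment "10".
      String prefix = table + "/" + segmentId + "/";
      Iterator<String> keys = blocks.keySet().iterator();
      while (keys.hasNext()) {
        if (keys.next().startsWith(prefix)) {
          keys.remove();
          removed++;
        }
      }
    }
    return removed;
  }

  int size() {
    return blocks.size();
  }
}
```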

Regards
Manish Gupta

On Sun, Dec 4, 2016 at 10:14 PM, sujith chacko <su...@gmail.com>
wrote:

> Hi Shahid,
>
>    It's a well-explained document; I just need a few clarifications.
>
> a) Once compaction is done, the segments and their blocks are
> invalidated. The LRU cache's scope is to evict unused or least recently
> used objects from memory, but after compaction the segment itself becomes
> invalid. So is it really required to hold such objects in the LRU cache
> and wait for eviction until the cache memory is full?
>
> Thanks,
> Sujith

Re: B-Tree LRU cache (New Feature)

Posted by sujith chacko <su...@gmail.com>.
Hi Shahid,

   It's a well-explained document; I just need a few clarifications.

a) Once compaction is done, the segments and their blocks are invalidated.
The LRU cache's scope is to evict unused or least recently used objects
from memory, but after compaction the segment itself becomes invalid. So is
it really required to hold such objects in the LRU cache and wait for
eviction until the cache memory is full?

Thanks,
Sujith

On Wed, Nov 23, 2016 at 6:30 PM, mohdshahidkhan <
mohdshahidkhan1987@gmail.com> wrote:

> Please find the design document for the B-Tree LRU cache:
> https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing

Re: B-Tree LRU cache (New Feature)

Posted by mohdshahidkhan <mo...@gmail.com>.
Please find the design document for the B-Tree LRU cache:
https://drive.google.com/file/d/0B8sQb--59vO7bWxVeWs1ajBiMG8/view?usp=sharing


