Posted to user@hive.apache.org by Uli Bethke <ul...@sonra.io> on 2014/08/01 12:43:10 UTC

Hive: Centralized HDFS Caching

Hi.

In Hive, can I make use of the centralized cache management introduced in
Hadoop 2.3
(http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html)?
If it is not implemented yet, is it on the roadmap?

My use case: I want to pin a frequently queried fact table into memory.

Impala already supports this, as per the Cloudera documentation:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_perf_hdfs_caching.html

Thanks
uli

Re: Hive: Centralized HDFS Caching

Posted by Uli Bethke <ul...@sonra.io>.
I am already using Tez as the execution engine, and I used hdfs cacheadmin to
pin a file to memory. However, querying that file through Hive still goes to
disk.

Any ideas?
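
For reference, the pinning step described above can be sketched roughly as
follows. This is only an illustration, not the exact commands that were run:
the pool name "hive_pool" and the warehouse path are hypothetical placeholders.
The sketch builds the three hdfs cacheadmin invocations (create a pool, add a
cache directive for the path, list the directive with statistics) and runs
them only if an hdfs binary is found on the PATH.

```python
import shutil
import subprocess


def cacheadmin_commands(pool, path, replication=1):
    """Build the `hdfs cacheadmin` invocations that pin `path` into `pool`.

    `pool` and `path` are caller-supplied; the values used below are
    hypothetical examples, not taken from the original post.
    """
    return [
        # Create the cache pool that will own the directive.
        ["hdfs", "cacheadmin", "-addPool", pool],
        # Ask the NameNode to cache the blocks under `path`.
        ["hdfs", "cacheadmin", "-addDirective",
         "-path", path, "-pool", pool, "-replication", str(replication)],
        # Show the directive with statistics for that path.
        ["hdfs", "cacheadmin", "-listDirectives", "-stats", "-path", path],
    ]


if __name__ == "__main__":
    # Hypothetical pool name and table location.
    commands = cacheadmin_commands("hive_pool", "/user/hive/warehouse/fact_sales")
    for argv in commands:
        print(" ".join(argv))
        # Only execute when a Hadoop client is actually installed.
        if shutil.which("hdfs"):
            subprocess.run(argv, check=False)
```

The -stats listing reports how many bytes are needed and how many are
actually cached for the directive. If the cached figure stays at zero, the
DataNodes have probably not locked the blocks in memory (for example because
dfs.datanode.max.locked.memory is unset or too small), which is a separate
problem from Hive not taking advantage of cached blocks.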


> On 01 August 2014 at 11:46 Nitin Pawar <ni...@gmail.com> wrote:
> 
>  Please take a look at Hive with Tez as the execution engine on Hadoop 2.3.
> 
>  It may help you compare it with what you want to achieve.


------------------------------
Uli Bethke
Sonra. Unleash the Value of your Data.
Web: http://www.sonra.io
Skype: uli.bethke


Re: Hive: Centralized HDFS Caching

Posted by Nitin Pawar <ni...@gmail.com>.
Please take a look at Hive with Tez as the execution engine on Hadoop 2.3.

It may help you compare it with what you want to achieve.





-- 
Nitin Pawar