Posted to user@ignite.apache.org by barham <ba...@gmail.com> on 2016/04/29 18:35:11 UTC

HDFS Caching

I'm running the Ignite 1.5.0 Hadoop Accelerator on top of CDH 5.  I'm
trying to write my own secondary file system; as a first step, I created
one that simply funnels all of the calls down to
IgniteHadoopIgfsSecondaryFileSystem and logs every time one of my methods
is called.  I'm using the default configuration provided in the Hadoop
Accelerator binary distribution, except that I added my secondary file
system to the configuration.
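
For reference, the wiring described above might look roughly like the sketch below in the Spring XML config shipped with the Hadoop Accelerator. The wrapper bean class (com.example.LoggingSecondaryFileSystem), its constructor argument, and the HDFS URI are hypothetical placeholders; FileSystemConfiguration and IgniteHadoopIgfsSecondaryFileSystem are the real Ignite 1.5 classes, but exact constructor/property shapes may vary by version.

```xml
<!-- Sketch only: a custom secondary file system that logs calls and
     delegates to the stock HDFS-backed implementation. -->
<bean class="org.apache.ignite.configuration.FileSystemConfiguration">
    <property name="name" value="igfs"/>
    <property name="secondaryFileSystem">
        <!-- Hypothetical wrapper class; not part of Ignite. -->
        <bean class="com.example.LoggingSecondaryFileSystem">
            <constructor-arg>
                <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
                    <!-- Placeholder HDFS URI. -->
                    <constructor-arg value="hdfs://namenode:8020"/>
                </bean>
            </constructor-arg>
        </bean>
    </property>
</bean>
```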

Every time I run hadoop fs -cat <filename> from the command line, or
ignite.fileSystem("igfs").open(<filename>) from inside a Java app, the log
statement in my secondary file system's open method is printed, even if I
read the same file over and over.  To me, that means my files aren't being
cached inside Ignite (which is the whole reason I'm looking into Ignite).  I
feel like I must be missing something obvious.  I even tried creating a tiny
(10-byte) ASCII text file and reading that, in case my original files were
too large to cache.

Thanks for any help.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: HDFS Caching

Posted by Ivan Veselovsky <iv...@gridgain.com>.
Hi Barry,
Yes, the permission security model is not currently supported in IGFS. The
permission mode is just propagated into IGFS and can be read as part of the
file's metadata, but it does not restrict access to the file: anybody who
can access the cache can access the file data. You can create a
feature-request ticket in our Jira and discuss the implementation on the
dev list ( dev@ignite.incubator.apache.org ).



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695p4756.html

Re: HDFS Caching

Posted by barham <ba...@gmail.com>.
Thanks.  Stepping through the code some more, I found that it is in fact
reading from the cache; the IgfsSecondaryFileSystemPositionedReadable that
is returned from open is not actually being used.

I do have a follow-up question, though.  I have a file in HDFS that is
chmod-ed 600 and owned by user1.  When I run Ignite on top of HDFS and
user1 reads the file, it gets into the cache, and now user2 is able to read
the file (bad for me).  If I restart Ignite and user2 tries to read the
file before it is cached, I receive "Failed to read data due to secondary
file system exception: Permission denied", which is what I expected.  One,
is there any built-in way to prevent this (a per-user cache? permissions on
the cache?)?  Two, is there any way to create a custom IgfsMetaManager?  In
my own secondary FS implementation, it might be useful to have control over
how the id is created, so that I can have files in my file system that
share the same path but are actually different per user (and thus can't be
cached by file path alone).
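
To make the keying idea concrete, here is a small plain-Java sketch (not Ignite API; the class and Key type are made up for illustration): if cached blocks are keyed by path alone, two users share one entry, while a composite (user, path) key would force user2 to miss the cache and fall through to the secondary FS, where the permission check can fail as expected.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustration only: contrasts a path-only cache key with a (user, path) key.
public class UserScopedCacheSketch {
    // Hypothetical composite key combining user and path.
    static final class Key {
        final String user;
        final String path;

        Key(String user, String path) {
            this.user = user;
            this.path = path;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key))
                return false;
            Key k = (Key)o;
            return user.equals(k.user) && path.equals(k.path);
        }

        @Override public int hashCode() {
            return Objects.hash(user, path);
        }
    }

    public static void main(String[] args) {
        Map<String, String> byPath = new HashMap<>();     // path-only keys
        Map<Key, String> byUserAndPath = new HashMap<>(); // (user, path) keys

        // user1 reads /data/secret, and the data lands in both caches.
        byPath.put("/data/secret", "secret-bytes");
        byUserAndPath.put(new Key("user1", "/data/secret"), "secret-bytes");

        // Path-only cache: user2's lookup hits user1's cached data.
        System.out.println(byPath.containsKey("/data/secret"));                          // true

        // (user, path) cache: user2 misses and must go to the secondary FS,
        // where HDFS can deny the read on the chmod-600 file.
        System.out.println(byUserAndPath.containsKey(new Key("user2", "/data/secret"))); // false
    }
}
```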

-Barry



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695p4735.html

Re: HDFS Caching

Posted by vkulichenko <va...@gmail.com>.
Hi,

As far as I know, IGFS can call into the secondary file system not only to
read from it, but also for integrity purposes (e.g., to check whether the
file in the secondary FS was updated directly, bypassing IGFS). In any
case, the data itself will be read from memory if it is there. I would try
creating several larger files and measuring whether adding IGFS improves
read performance.

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695p4713.html