Posted to hdfs-user@hadoop.apache.org by hwpstorage <hw...@gmail.com> on 2014/04/02 22:56:51 UTC

Re: problem with HDFS caching in Hadoop 2.3

Thanks for the reply.
I used the cat command this time; the result is not great.

In my test, file hadoop003.log is cached while hadoop010.log is not cached.

-bash-4.1$  /hadoop/hadoop-2.3.0/bin/hadoop fs -ls
-rw-r--r--   3 hdfs hadoop  209715206 2014-03-06 18:14 hadoop003.log
-rw-r--r--   3 hdfs hadoop  209715272 2014-03-07 14:37 hadoop010.log

-bash-4.1$ hdfs cacheadmin -listDirectives -stats -path hadoop003.log
Found 1 entry
 ID POOL      REPL EXPIRY  PATH                       BYTES_NEEDED  BYTES_CACHED  FILES_NEEDED  FILES_CACHED
  5 wptest1      3 never   /user/hdfs/hadoop003.log      629145618     629145618             1             1
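
(A quick sanity check of those numbers, assuming BYTES_NEEDED counts every replica: 3 replicas x 209715206 bytes = 629145618 bytes, and BYTES_CACHED matches, so all three replicas of hadoop003.log are reported as cached.)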

First run:
-bash-4.1$ time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop003.log > /tmp/aa
real    0m4.881s
user    0m4.805s
sys     0m1.468s

-bash-4.1$ time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop010.log > /tmp/aa
real    0m6.479s
user    0m4.777s
sys     0m1.312s

Second run:
-bash-4.1$ time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop003.log > /tmp/aa
real    0m4.751s
user    0m4.685s
sys     0m1.313s

-bash-4.1$ time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat hadoop010.log > /tmp/aa
real    0m4.916s
user    0m4.779s
sys     0m1.378s
I did not see much improvement from caching.

Please advise.

Thanks

On Tue, Mar 11, 2014 at 3:55 PM, Colin McCabe <cm...@alumni.cmu.edu> wrote:

>  On Fri, Mar 7, 2014 at 7:37 AM, hwpstorage <hw...@gmail.com> wrote:
> > Hello,
> >
> > It looks like the HDFS caching does not work well.
> > The cached log file is around 200MB. The Hadoop cluster has 3 nodes, each
> > with 4GB of memory.
> >
> > -bash-4.1$ hdfs cacheadmin -addPool wptest1
> > Successfully added cache pool wptest1.
> >
> > -bash-4.1$ /hadoop/hadoop-2.3.0/bin/hdfs cacheadmin -listPools
> > Found 1 result.
> > NAME     OWNER  GROUP  MODE            LIMIT  MAXTTL
> > wptest1  hdfs   hdfs   rwxr-xr-x   unlimited   never
> >
> > -bash-4.1$ hdfs cacheadmin -addDirective -path hadoop003.log -pool wptest1
> > Added cache directive 1
> >
> > -bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -tail hadoop003.log
> > real    0m2.796s
> > user    0m4.263s
> > sys     0m0.203s
> >
> > -bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -tail hadoop003.log
> > real    0m3.050s
> > user    0m4.176s
> > sys     0m0.192s
> >
> > It is weird that the cache status shows 0 bytes cached:
> > -bash-4.1$ /hadoop/hadoop-2.3.0/bin/hdfs cacheadmin -listDirectives -stats -path hadoop003.log -pool wptest1
> > Found 1 entry
> >  ID POOL      REPL EXPIRY  PATH                       BYTES_NEEDED  BYTES_CACHED  FILES_NEEDED  FILES_CACHED
> >   1 wptest1      1 never   /user/hdfs/hadoop003.log      209715206             0             1             0
>
> If you take a look at this output, you can see that nothing is actually
> cached.
>
> One way to figure out why is to look at the logs of the NameNode and
> DataNode.  Some of the relevant messages are logged at DEBUG or TRACE
> level, so you may need to turn up the log level.  The
> CacheReplicationMonitor and FsDatasetCache classes are good places to
> start.
>
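A minimal log4j.properties sketch for turning those loggers up. The class names are the ones Colin mentions, but the package paths below are my best guess for Hadoop 2.3 and may need adjusting against your build; add the lines to the log4j.properties used by the NameNode and DataNode and restart the daemons:

# NameNode side: cache directive scanning / caching decisions (assumed package path)
log4j.logger.org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor=TRACE
# DataNode side: mlock/mmap of cached block files (assumed package path)
log4j.logger.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache=DEBUG
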
> Also be sure to check that you have set dfs.datanode.max.locked.memory.
>
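For reference, a sketch of the hdfs-site.xml setting on each DataNode. The value is in bytes, and 1073741824 here is just an illustrative 1 GB; size it to what you can actually spare. As far as I know, the DataNode user's locked-memory ulimit ("ulimit -l") also has to be at least this large, or the DataNode cannot lock the cached blocks in memory:

<property>
  <!-- Illustrative value: 1 GB of memory per DataNode for HDFS caching -->
  <name>dfs.datanode.max.locked.memory</name>
  <value>1073741824</value>
</property>
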
> As Andrew commented, "hadoop tail" is not a good command to use for
> measuring performance, since you have a few seconds of Java startup
> time, followed by any HDFS setup time, followed by reading a single
> kilobyte of data.  If you want to use the shell, the simplest thing to
> do is to use cat and read a large file, so that those startup costs
> don't dominate the measurement.
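For instance, something like this (redirecting to /dev/null so writing the copy to local disk doesn't skew the timing; the path assumes the same test file as above):

time /hadoop/hadoop-2.3.0/bin/hadoop fs -cat /user/hdfs/hadoop003.log > /dev/null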
>
> best,
> Colin
>
>
> >
> > -bash-4.1$ file /hadoop/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0
> > /hadoop/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared
> > object, x86-64, version 1 (SYSV), dynamically linked, not stripped
> >
> > I also tried the word count example with the same file. The execution time
> > is always 40 seconds. (The MapReduce job without caching takes 42 seconds.)
> > Is there anything wrong?
> > Thanks a lot
>