Posted to user@hadoop.apache.org by "fgeueke ." <fg...@gmail.com> on 2013/09/11 22:17:42 UTC

Fwd: self-referencing jar file fills disk

Hi.  My apologies if I'm posting to the wrong list for this (perhaps
mapreduce-dev instead?).  I'm running into a sporadic issue with an MR job
I'm working on: my 'hadoop jar' command leaves behind a ./target
directory.  Repeated runs eventually produce what appears to be a
self-referencing jar that grows indefinitely.  I only seem to hit the
issue when I use my own WritableComparable subclass as the key type
emitted by my map tasks.

Commands I run:

javac -cp `hbase classpath` HbaseToHdfs{,Mapper,Reducer}.java \
    SamisTypeIRecord.java TSDBTimeseries.java
jar cvf HbaseToHdfs.jar HbaseToHdfs{,Mapper,Reducer}.class \
    SamisTypeIRecord.class TSDBTimeseries.class
hadoop jar HbaseToHdfs.jar HbaseToHdfs -libjars \
    /usr/lib/hbase/hbase.jar,HbaseToHdfs.jar IPDRRecords test

After running that last command a few times, interspersed with
'hadoop fs -rmr test' commands, I see this in ./target:

[fgeueke@lsn-syslog:~/ipdr_hadoop/hbaseToTsdb 04:06 PM]$ ls -lhR ./target
./target:
total 4.0K
drwxrwxr-x 2 fgeueke fgeueke 4.0K Sep 11 15:58 test-dir

./target/test-dir:
total 101M
-rw-rw-r-- 1 fgeueke fgeueke    0 Sep 11 15:58 hadoop-1303615468974398110
-rw-rw-r-- 1 fgeueke fgeueke 101M Sep 11 15:58 hadoop-1303615468974398110.jar
-rw-rw-r-- 1 fgeueke fgeueke    0 Sep 11 15:57 hadoop-4633794609113161296
-rw-rw-r-- 1 fgeueke fgeueke  94K Sep 11 15:57 hadoop-4633794609113161296.jar
-rw-rw-r-- 1 fgeueke fgeueke    0 Sep 11 15:57 hadoop-4810091450154527466
-rw-rw-r-- 1 fgeueke fgeueke  48K Sep 11 15:57 hadoop-4810091450154527466.jar
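
To check whether the large jar really does contain earlier copies of
itself, something like the following could list any .jar entries packed
inside an archive.  This is only a minimal sketch using java.util.zip;
the class name JarSelfCheck is made up for illustration and is not part
of my job, and I'd expect it to throw on a jar that is still being
written (an incomplete archive has no readable central directory).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of the job): reports jars nested inside a jar.
public class JarSelfCheck {

    /** Returns the names of all non-directory entries that end in ".jar". */
    static List<String> nestedJars(Path jar) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".jar")) {
                    hits.add(entry.getName());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as the path of a jar to inspect.
        for (String arg : args) {
            for (String name : nestedJars(Paths.get(arg))) {
                System.out.println(arg + " contains " + name);
            }
        }
    }
}
```

It would be run as, for example,
java JarSelfCheck target/test-dir/hadoop-4633794609113161296.jar
and any line mentioning target/test-dir/ would suggest the temporary jar
is being built from a directory that also holds ./target.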

The MR job never registers with the JobTracker; the output stops here
until I kill it with ^C:

13/09/11 15:58:05 WARN conf.Configuration: fs.checkpoint.period is deprecated. Instead, use dfs.namenode.checkpoint.period
13/09/11 15:58:05 WARN conf.Configuration: topology.node.switch.mapping.impl is deprecated. Instead, use net.topology.node.switch.mapping.impl
13/09/11 15:58:05 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
^C[fgeueke@lsn-syslog:~/ipdr_hadoop/hbaseToTsdb 03:58 PM]$

I couldn't find anything useful in /var/log/h*.

Any help you could provide would be greatly appreciated.  Thanks.

-Frank Geueke