Posted to common-user@hadoop.apache.org by "Ratner, Alan S (IS)" <Al...@ngc.com> on 2011/03/04 22:55:51 UTC

RE: EXT :Re: Problem running a Hadoop program with external libraries

Aaron,

   Thanks for the rapid responses.


*         "ulimit -u unlimited" is in .bashrc.

*         HADOOP_HEAPSIZE is set to 4000 MB in hadoop-env.sh.

*         mapred.child.ulimit is set to 2048000 in mapred-site.xml.

*         mapred.child.java.opts is set to -Xmx1536m in mapred-site.xml.

   I take it you are suggesting that I change the java.opts property to:

      <property>
        <name>mapred.child.java.opts</name>
        <value>-Xmx1536m -Djava.library.path=/path/to/native/libs</value>
      </property>
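   (Sanity-checking those numbers, on the assumption that mapred.child.ulimit is enforced as a virtual-memory cap in KB: 2048000 KB is roughly 2 GB of address space per child JVM. A -Xmx1536m heap is reserved up front, leaving only about 500 MB for the JVM's own overhead plus any mmap()ed native libraries - so the "failed to map segment from shared object: Cannot allocate memory" errors we saw could simply be tasks hitting that cap while loading the OpenCV chain.)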


Alan Ratner
Northrop Grumman Information Systems
Manager of Large-Scale Computing
9020 Junction Drive
Annapolis Junction, MD 20701
(410) 707-8605 (cell)

From: Aaron Kimball [mailto:akimball83@gmail.com]
Sent: Friday, March 04, 2011 4:30 PM
To: common-user@hadoop.apache.org
Cc: Ratner, Alan S (IS)
Subject: EXT :Re: Problem running a Hadoop program with external libraries

Actually, I just misread your email and missed the difference between your 2nd and 3rd attempts.

Are you enforcing min/max JVM heap sizes on your tasks? Are you enforcing a ulimit (either through your shell configuration, or through Hadoop itself)? I don't know where these "cannot allocate memory" errors are coming from. If they're from the OS, could it be because it needs to fork() and momentarily exceed the ulimit before loading the native libs?

- Aaron

On Fri, Mar 4, 2011 at 1:26 PM, Aaron Kimball <ak...@gmail.com> wrote:
I don't know if putting native-code .so files inside a jar works. A native-code .so is not "classloaded" in the same way .class files are.

So the correct .so files probably need to exist in some physical directory on the worker machines. You may want to double-check that the correct directory on the worker machines is identified in the JVM property 'java.library.path' (instead of, or in addition to, $LD_LIBRARY_PATH). This can be manipulated in the Hadoop configuration setting mapred.child.java.opts (include '-Djava.library.path=/path/to/native/libs' in the string there).
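One quick way to confirm what a task JVM actually received (a hypothetical snippet you could drop into your mapper's setup method; it uses only standard Java APIs, and the output lands in the task's stderr log on the tasktracker web UI):

    // Hypothetical sanity check: log the paths this task JVM actually sees.
    System.err.println("java.library.path = " + System.getProperty("java.library.path"));
    System.err.println("LD_LIBRARY_PATH = " + System.getenv("LD_LIBRARY_PATH"));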

Also, if you added your .so files to a directory that is already used by the tasktracker (like hadoop-0.21.0/lib/native/Linux-amd64-64/), you may need to restart the tasktracker instance for it to take effect. (This is true of .jar files in the $HADOOP_HOME/lib directory; I don't know if it is true for native libs as well.)
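If a restart does turn out to be needed, something like this on each worker should do it (assuming the stock scripts from the 0.21 tarball):

    bin/hadoop-daemon.sh stop tasktracker
    bin/hadoop-daemon.sh start tasktracker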

- Aaron

On Fri, Mar 4, 2011 at 12:53 PM, Ratner, Alan S (IS) <Al...@ngc.com> wrote:
We are having difficulties running a Hadoop program making calls to external libraries - but this occurs only when we run the program on our cluster and not from within Eclipse where we are apparently running in Hadoop's standalone mode.  This program invokes the Open Computer Vision libraries (OpenCV and JavaCV).  (I don't think there is a problem with our cluster - we've run many Hadoop jobs on it without difficulty.)

1.      I normally use Eclipse to create jar files for our Hadoop programs, but I inadvertently hit the "run as Java application" button and the program ran fine, reading the input file from the Eclipse workspace rather than HDFS and writing the output file to the same place.  Hadoop's output appears below.  (This occurred on the master Hadoop server.)

2.      I then "exported" from Eclipse a "runnable jar" which "extracted required libraries" into the generated jar - presumably producing a jar file that incorporated all the required library functions. (The plain jar file for this program is 17 kB while the runnable jar is 30MB.)  When I try to run this on my Hadoop cluster (including my master and slave servers) the program reports that it is unable to locate "libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory".  Now, in addition to this library being incorporated inside the runnable jar file it is also present on each of my servers at hadoop-0.21.0/lib/native/Linux-amd64-64/ where we have loaded the same libraries (to give Hadoop 2 shots at finding them).  These include:
     ...
     libopencv_highgui_pch_dephelp.a
     libopencv_highgui.so
     libopencv_highgui.so.2.2
     libopencv_highgui.so.2.2.0
     ...

     When I poke around inside the runnable jar I find javacv_linux-x86_64.jar which contains:
     com/googlecode/javacv/cpp/linux-x86_64/libjniopencv_highgui.so

3.      I then tried adding the following to mapred-site.xml, as suggested in HADOOP-2838, which is supposed to be included in Hadoop 0.21: https://issues.apache.org/jira/browse/HADOOP-2838
     <property>
       <name>mapred.child.env</name>
       <value>LD_LIBRARY_PATH=/home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64</value>
     </property>
     The log is included at the bottom of this email; Hadoop now complains about a different missing library, with an out-of-memory error.

Does anyone have any ideas as to what is going wrong here?  Any help would be appreciated.  Thanks.

Alan


BTW: Each of our servers has 4 hard drives and many of the errors below refer to the 3 drives (/media/hd2 or hd3 or hd4) reserved exclusively for HDFS and thus perhaps not a good place for Hadoop to be looking for a library file.  My slaves have 24 GB RAM, the jar file is 30 MB, and the sequence file being read is 400 KB - so I hope I am not running out of memory.


1.      RUNNING DIRECTLY FROM ECLIPSE IN HADOOP'S STANDALONE MODE - SUCCESS

>>>> Running Face Program
11/03/04 12:44:10 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/03/04 12:44:10 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/03/04 12:44:10 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/03/04 12:44:10 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
11/03/04 12:44:10 INFO mapred.FileInputFormat: Total input paths to process : 1
11/03/04 12:44:10 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11/03/04 12:44:10 INFO mapreduce.JobSubmitter: number of splits:1
11/03/04 12:44:10 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
11/03/04 12:44:10 WARN security.TokenCache: Overwriting existing token storage with # keys=0
11/03/04 12:44:10 INFO mapreduce.Job: Running job: job_local_0001
11/03/04 12:44:10 INFO mapred.LocalJobRunner: Waiting for map tasks
11/03/04 12:44:10 INFO mapred.LocalJobRunner: Starting task: attempt_local_0001_m_000000_0
11/03/04 12:44:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11/03/04 12:44:10 INFO compress.CodecPool: Got brand-new decompressor
11/03/04 12:44:10 INFO mapred.MapTask: numReduceTasks: 1
11/03/04 12:44:10 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
11/03/04 12:44:10 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
11/03/04 12:44:10 INFO mapred.MapTask: soft limit at 83886080
11/03/04 12:44:10 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
11/03/04 12:44:10 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
11/03/04 12:44:11 INFO mapreduce.Job:  map 0% reduce 0%
11/03/04 12:44:16 INFO mapred.LocalJobRunner: file:/home/ngc/eclipse_workspace/HadoopPrograms/Images2/JPGSequenceFile.001:0+411569 > map
11/03/04 12:44:17 INFO mapreduce.Job:  map 57% reduce 0%
11/03/04 12:44:18 INFO mapred.LocalJobRunner: file:/home/ngc/eclipse_workspace/HadoopPrograms/Images2/JPGSequenceFile.001:0+411569 > map
11/03/04 12:44:18 INFO mapred.MapTask: Starting flush of map output
11/03/04 12:44:18 INFO mapred.MapTask: Spilling map output
11/03/04 12:44:18 INFO mapred.MapTask: bufstart = 0; bufend = 1454; bufvoid = 104857600
11/03/04 12:44:18 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214324(104857296); length = 73/6553600
11/03/04 12:44:18 INFO mapred.MapTask: Finished spill 0
11/03/04 12:44:18 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
11/03/04 12:44:18 INFO mapred.LocalJobRunner: file:/home/ngc/eclipse_workspace/HadoopPrograms/Images2/JPGSequenceFile.001:0+411569 > sort
11/03/04 12:44:18 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
11/03/04 12:44:18 INFO mapred.LocalJobRunner: Finishing task: attempt_local_0001_m_000000_0
11/03/04 12:44:18 INFO mapred.LocalJobRunner: Map task executor complete.
11/03/04 12:44:18 INFO mapred.Merger: Merging 1 sorted segments
11/03/04 12:44:18 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1481 bytes
11/03/04 12:44:18 INFO mapred.LocalJobRunner:
11/03/04 12:44:18 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
11/03/04 12:44:18 INFO mapred.LocalJobRunner:
11/03/04 12:44:18 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
11/03/04 12:44:18 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to file:/home/ngc/eclipse_workspace/HadoopPrograms/FaceOutput
11/03/04 12:44:18 INFO mapred.LocalJobRunner: reduce > sort
11/03/04 12:44:18 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
11/03/04 12:44:18 INFO mapreduce.Job:  map 100% reduce 100%
11/03/04 12:44:18 INFO mapreduce.Job: Job complete: job_local_0001
11/03/04 12:44:18 INFO mapreduce.Job: Counters: 18
       FileInputFormatCounters
               BYTES_READ=411439
       FileSystemCounters
               FILE_BYTES_READ=825005
               FILE_BYTES_WRITTEN=127557
       Map-Reduce Framework
               Combine input records=0
               Combine output records=0
               Failed Shuffles=0
               GC time elapsed (ms)=88
               Map input records=20
               Map output bytes=1454
               Map output records=19
               Merged Map outputs=0
               Reduce input groups=19
               Reduce input records=19
               Reduce output records=19
               Reduce shuffle bytes=0
               Shuffled Maps =0
               Spilled Records=38
               SPLIT_RAW_BYTES=127
>>>> 0.036993828        img_9619.jpg 2 found at [ 35, 201, 37 ], [ 84, 41, 68 ],
...
>>>> 0.41283935 img_538.jpg 3 found at [ 265, 44, 80 ], [ 132, 32, 101 ], [ 210, 119, 228 ],
>>>> Job Finished in 8.679 seconds

2.      RUNNING THE SAME PROGRAM IN HADOOP'S DISTRIBUTED MODE - FAILURE

ngc@hadoop1:~$ cd hadoop-0.21.0/
ngc@hadoop1:~/hadoop-0.21.0$ bin/hadoop fs -ls Imag*
11/03/04 12:58:18 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/03/04 12:58:18 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Found 1 items
-rw-r--r--   2 ngc supergroup     411569 2011-03-02 13:21 /user/ngc/Images2/JPGSequenceFile.001
ngc@hadoop1:~/hadoop-0.21.0$ bin/hadoop jar ../eclipse/HannahFace.jar progs/HannahFace
>>>> Running Face Program
11/03/04 12:59:39 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/03/04 12:59:40 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
11/03/04 12:59:40 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/03/04 12:59:40 INFO mapred.FileInputFormat: Total input paths to process : 1
11/03/04 12:59:40 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11/03/04 12:59:40 INFO mapreduce.JobSubmitter: number of splits:100
11/03/04 12:59:40 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
11/03/04 12:59:41 INFO mapreduce.Job: Running job: job_201103021428_0051
11/03/04 12:59:42 INFO mapreduce.Job:  map 0% reduce 0%
11/03/04 12:59:49 INFO mapreduce.Job:  map 7% reduce 0%
11/03/04 12:59:51 INFO mapreduce.Job:  map 10% reduce 0%
11/03/04 13:00:04 INFO mapreduce.Job:  map 12% reduce 0%
11/03/04 13:00:05 INFO mapreduce.Job:  map 16% reduce 0%
11/03/04 13:00:06 INFO mapreduce.Job:  map 28% reduce 0%
11/03/04 13:00:07 INFO mapreduce.Job:  map 37% reduce 0%
11/03/04 13:00:07 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000016_0, Status : FAILED
Error: /tmp/hadoop-ngc/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000016_0/work/tmp/libjniopencv_highgui9051044227445275266.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:08 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000044_0, Status : FAILED
Error: /media/hd3/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000044_0/work/tmp/libjniopencv_highgui6446098204420446112.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:09 INFO mapreduce.Job:  map 47% reduce 0%
11/03/04 13:00:09 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000048_0, Status : FAILED
Error: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000048_0/work/tmp/libjniopencv_objdetect3671939282732993726.so: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000048_0/work/tmp/libjniopencv_objdetect3671939282732993726.so: failed to map segment from shared object: Cannot allocate memory
11/03/04 13:00:09 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000052_0, Status : FAILED
Error: /tmp/hadoop-ngc/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000052_0/work/tmp/libjniopencv_highgui1579426900682939358.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:10 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000001_0, Status : FAILED
Error: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000001_0/work/tmp/libjniopencv_objdetect3457632677367330581.so: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000001_0/work/tmp/libjniopencv_objdetect3457632677367330581.so: failed to map segment from shared object: Cannot allocate memory
attempt_201103021428_0051_m_000001_0: Java HotSpot(TM) 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGTERM to handler- the VM may need to be forcibly terminated
11/03/04 13:00:11 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000022_0, Status : FAILED
11/03/04 13:00:12 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000026_0, Status : FAILED
11/03/04 13:00:13 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000006_0, Status : FAILED
11/03/04 13:00:14 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000011_0, Status : FAILED
11/03/04 13:00:16 INFO mapreduce.Job:  map 48% reduce 0%
11/03/04 13:00:17 INFO mapreduce.Job:  map 57% reduce 0%
11/03/04 13:00:18 INFO mapreduce.Job:  map 59% reduce 0%
11/03/04 13:00:18 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000061_0, Status : FAILED
Error: /media/hd3/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000061_0/work/tmp/libjniopencv_highgui3743962684984755257.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:19 INFO mapreduce.Job:  map 68% reduce 0%
11/03/04 13:00:19 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000034_0, Status : FAILED
11/03/04 13:00:20 INFO mapreduce.Job:  map 68% reduce 15%
11/03/04 13:00:20 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000039_0, Status : FAILED
11/03/04 13:00:21 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000076_0, Status : FAILED
Error: /media/hd4/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000076_0/work/tmp/libjniopencv_highgui3438076786756619584.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:22 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000057_0, Status : FAILED
11/03/04 13:00:23 INFO mapreduce.Job:  map 68% reduce 23%
11/03/04 13:00:23 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000065_0, Status : FAILED
11/03/04 13:00:24 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000069_0, Status : FAILED
11/03/04 13:00:25 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000082_0, Status : FAILED
11/03/04 13:00:36 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000082_1, Status : FAILED
Error: /media/hd4/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000082_1/work/tmp/libjniopencv_highgui7180733690064994995.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:39 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000011_1, Status : FAILED
Error: /media/hd4/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000011_1/work/tmp/libjniopencv_highgui8978195121954363506.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:41 INFO mapreduce.Job:  map 73% reduce 23%
11/03/04 13:00:42 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000069_1, Status : FAILED
11/03/04 13:00:43 INFO mapreduce.Job:  map 73% reduce 24%
11/03/04 13:00:43 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000048_1, Status : FAILED
Error: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000048_1/work/tmp/libjniopencv_highgui7269142293373011624.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:44 INFO mapreduce.Job:  map 74% reduce 24%
11/03/04 13:00:46 INFO mapreduce.Job:  map 74% reduce 25%
11/03/04 13:00:46 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000001_1, Status : FAILED
11/03/04 13:00:47 INFO mapreduce.Job:  map 75% reduce 25%
11/03/04 13:00:48 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000052_1, Status : FAILED
11/03/04 13:00:49 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000044_1, Status : FAILED
11/03/04 13:00:49 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000087_0, Status : FAILED
11/03/04 13:00:49 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000011_2, Status : FAILED
Error: /tmp/hadoop-ngc/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000011_2/work/tmp/libjniopencv_highgui6941559715123481707.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:50 INFO mapreduce.Job:  map 79% reduce 25%
11/03/04 13:00:51 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000006_1, Status : FAILED
Error: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000006_1/work/tmp/libjniopencv_highgui72992299570368055.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:00:52 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000026_1, Status : FAILED
11/03/04 13:00:54 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000022_1, Status : FAILED
11/03/04 13:00:55 INFO mapreduce.Job:  map 79% reduce 26%
11/03/04 13:00:55 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000016_1, Status : FAILED
11/03/04 13:00:57 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000091_0, Status : FAILED
11/03/04 13:00:58 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000096_0, Status : FAILED
11/03/04 13:00:58 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000034_1, Status : FAILED
Error: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000034_1/work/tmp/libjniopencv_highgui3721618225395918920.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:01:01 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000065_1, Status : FAILED
11/03/04 13:01:03 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000057_1, Status : FAILED
11/03/04 13:01:04 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000076_1, Status : FAILED
11/03/04 13:01:06 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000039_1, Status : FAILED
11/03/04 13:01:07 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000061_1, Status : FAILED
11/03/04 13:01:09 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000069_2, Status : FAILED
Error: /media/hd3/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0051/attempt_201103021428_0051_m_000069_2/work/tmp/libjniopencv_highgui8910946496817753039.so: libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory
11/03/04 13:01:10 INFO mapreduce.Job: Task Id : attempt_201103021428_0051_m_000082_2, Status : FAILED
11/03/04 13:01:15 INFO mapreduce.Job: Job complete: job_201103021428_0051
11/03/04 13:01:15 INFO mapreduce.Job: Counters: 21
       FileInputFormatCounters
               BYTES_READ=0
       FileSystemCounters
               FILE_BYTES_WRITTEN=3040
               HDFS_BYTES_READ=1048281
       Job Counters
               Data-local map tasks=51
               Total time spent by all maps waiting after reserving slots (ms)=0
               Total time spent by all reduces waiting after reserving slots (ms)=0
               Failed map tasks=1
               Rack-local map tasks=86
               SLOTS_MILLIS_MAPS=1091359
               SLOTS_MILLIS_REDUCES=0
               Launched map tasks=137
               Launched reduce tasks=2
       Map-Reduce Framework
               Combine input records=0
               Failed Shuffles=0
               GC time elapsed (ms)=0
               Map input records=0
               Map output bytes=0
               Map output records=0
               Merged Map outputs=0
               Spilled Records=0
               SPLIT_RAW_BYTES=8960
Exception in thread "main" java.io.IOException: Job failed!
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:782)
       at progs.HannahFace.run(HannahFace.java:137)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69)
       at progs.HannahFace.main(HannahFace.java:162)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:597)
       at org.apache.hadoop.util.RunJar.main(RunJar.java:192)

3.      SAME COMMAND BUT AFTER I MODIFIED MAPRED-SITE.XML - FAILURE

ngc@hadoop1:~/hadoop-0.21.0$ bin/hadoop jar ../eclipse/HannahFace.jar progs/HannahFace
>>>> Running Face Program
11/03/04 15:07:11 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/03/04 15:07:11 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
11/03/04 15:07:11 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/03/04 15:07:12 INFO mapred.FileInputFormat: Total input paths to process : 1
11/03/04 15:07:12 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11/03/04 15:07:12 INFO mapreduce.JobSubmitter: number of splits:100
11/03/04 15:07:12 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
11/03/04 15:07:12 INFO mapreduce.Job: Running job: job_201103021428_0069
11/03/04 15:07:13 INFO mapreduce.Job:  map 0% reduce 0%
11/03/04 15:07:20 INFO mapreduce.Job:  map 6% reduce 0%
11/03/04 15:07:21 INFO mapreduce.Job:  map 10% reduce 0%
11/03/04 15:07:36 INFO mapreduce.Job:  map 18% reduce 0%
11/03/04 15:07:38 INFO mapreduce.Job:  map 28% reduce 0%
11/03/04 15:07:38 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000016_0, Status : FAILED
Error: /media/hd4/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0069/attempt_201103021428_0069_m_000016_0/work/tmp/libjniopencv_highgui4138482228584845301.so: libxcb.so.1: failed to map segment from shared object: Cannot allocate memory
11/03/04 15:07:39 INFO mapreduce.Job:  map 35% reduce 0%
11/03/04 15:07:40 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000001_0, Status : FAILED
Error: /media/hd3/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0069/attempt_201103021428_0069_m_000001_0/work/tmp/libjniopencv_highgui2385564746644347958.so: libXau.so.6: failed to map segment from shared object: Cannot allocate memory
11/03/04 15:07:42 INFO mapreduce.Job:  map 39% reduce 0%
11/03/04 15:07:42 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000022_0, Status : FAILED
11/03/04 15:07:44 INFO mapreduce.Job:  map 50% reduce 0%
11/03/04 15:07:44 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000026_0, Status : FAILED
11/03/04 15:07:45 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000011_0, Status : FAILED
11/03/04 15:07:46 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000034_0, Status : FAILED
11/03/04 15:07:47 INFO mapreduce.Job:  map 50% reduce 13%
11/03/04 15:07:47 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000039_0, Status : FAILED
11/03/04 15:07:48 INFO mapreduce.Job:  map 59% reduce 13%
11/03/04 15:07:48 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000082_0, Status : FAILED
Error: /media/hd2/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0069/attempt_201103021428_0069_m_000082_0/work/tmp/libjniopencv_highgui2586824844718343743.so: libxcb-render.so.0: failed to map segment from shared object: Cannot allocate memory
11/03/04 15:07:50 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000006_0, Status : FAILED
11/03/04 15:07:51 INFO mapreduce.Job:  map 67% reduce 13%
11/03/04 15:07:51 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000044_0, Status : FAILED
11/03/04 15:07:51 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000048_0, Status : FAILED
11/03/04 15:07:53 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000052_0, Status : FAILED
11/03/04 15:07:53 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000076_0, Status : FAILED
Error: /media/hd3/mapred/local/taskTracker/ngc/jobcache/job_201103021428_0069/attempt_201103021428_0069_m_000076_0/work/tmp/libjniopencv_highgui6607923753832414434.so: libfontconfig.so.1: failed to map segment from shared object: Cannot allocate memory
11/03/04 15:07:54 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000057_0, Status : FAILED
11/03/04 15:07:56 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000061_0, Status : FAILED
11/03/04 15:07:57 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000065_0, Status : FAILED
11/03/04 15:07:59 INFO mapreduce.Job: Task Id : attempt_201103021428_0069_m_000069_0, Status : FAILED
attempt_201103021428_0069_m_000069_0: Java HotSpot(TM) 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGTERM to handler- the VM may need to be forcibly terminated





Re: EXT :Re: Problem running a Hadoop program with external libraries

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Why don't you put your native libraries in HDFS and use the DistributedCache
to make them available to the tasks? For example:

Copy 'foo.so' to 'hdfs://localhost:8020/tmp/foo.so', then add it to the job's
distributed cache:

  DistributedCache.addCacheFile(
      new URI("hdfs://localhost:8020/tmp/foo.so#foo.so"), conf);
  DistributedCache.createSymlink(conf);

Note that the '#foo.so' fragment will create a symlink in the task's running
dir, and the task's running dir is on the library path of your task.
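Putting it together, a minimal sketch of the whole flow (the class name and
paths here are placeholders, not anything from your job; job submission
itself is elided):

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CacheNativeLib {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // 1. Copy the library into HDFS (bin/hadoop fs -put works too).
        FileSystem fs = FileSystem.get(conf);
        fs.copyFromLocalFile(new Path("/usr/local/lib/foo.so"),
                             new Path("/tmp/foo.so"));

        // 2. Register it; the '#foo.so' fragment names the symlink that
        //    will appear in each task's working directory.
        DistributedCache.addCacheFile(
            new URI("hdfs://localhost:8020/tmp/foo.so#foo.so"), conf);
        DistributedCache.createSymlink(conf);

        // ...configure and submit the job with this conf as usual.
      }
    }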

Alejandro


Re: EXT :Re: Problem running a Hadoop program with external libraries

Posted by Lance Norskog <go...@gmail.com>.
I have never heard of putting a native-code shared library in a Java jar, and I doubt that it works. But it's a cool idea!

A Unix binary loads shared libraries from the standard system paths plus any directories listed in the environment variable LD_LIBRARY_PATH. That variable has to include the directory with the OpenCV .so files when you start Java.
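
For example, a minimal sketch of that idea, assuming the library directory already mentioned in this thread (for Hadoop child tasks, the mapred.child.env property discussed above sets the same variable):

    # Sketch only: make the OpenCV/JavaCV .so directory visible to the dynamic
    # linker before the task JVM starts, e.g. via hadoop-env.sh on every node.
    export LD_LIBRARY_PATH=/home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH
    # Sanity check that the library's own dependencies resolve on this node:
    ldd /home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64/libopencv_highgui.so.2.2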

Lance

On Mar 4, 2011, at 2:13 PM, Brian Bockelman wrote:

> Hi,
> 
> Check your kernel's overcommit settings.  A strict overcommit policy can prevent the JVM from allocating memory even when there's free RAM.
> 
> Brian


Re: EXT :Re: Problem running a Hadoop program with external libraries

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hi,

Check your kernel's overcommit settings.  A strict overcommit policy can prevent the JVM from allocating memory even when there's free RAM.
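
For instance, a sketch of how to inspect and loosen the policy on a worker node (these sysctl knobs are standard Linux; which mode is right for this cluster is a judgment call):

    # 0 = heuristic overcommit (the usual default), 1 = always allow,
    # 2 = strict accounting against swap + overcommit_ratio% of RAM
    cat /proc/sys/vm/overcommit_memory
    cat /proc/sys/vm/overcommit_ratio
    # Under strict accounting, mmap() of a shared object's segments can fail
    # with "Cannot allocate memory" even though free RAM remains. Heuristic
    # mode usually avoids that:
    sudo sysctl -w vm.overcommit_memory=0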

Brian
