Posted to common-dev@hadoop.apache.org by Steve Gao <st...@yahoo.com> on 2009/08/27 21:42:13 UTC

Why "java.util.zip.ZipOutputStream" need to use /tmp?


The Hadoop version is 0.18.3. Recently we hit an "out of space" issue coming from "java.util.zip.ZipOutputStream".
We found that /tmp was full, and after cleaning /tmp the problem was solved.

However, why does Hadoop need to use /tmp? We had already pointed Hadoop's temp directory at a large local disk in hadoop-site.xml:

<property>
  <name>hadoop.tmp.dir</name>
  <value> ... some large local disk ... </value>
</property>


Could it be that java.util.zip.ZipOutputStream uses /tmp even though we configured hadoop.tmp.dir to point at a large local disk?

The error log is here FYI:

java.io.IOException: No space left on device
    at java.io.FileOutputStream.write(Native Method)
    at java.util.zip.ZipOutputStream.writeInt(ZipOutputStream.java:445)
    at java.util.zip.ZipOutputStream.writeEXT(ZipOutputStream.java:362)
    at java.util.zip.ZipOutputStream.closeEntry(ZipOutputStream.java:220)
    at java.util.zip.ZipOutputStream.finish(ZipOutputStream.java:301)
    at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:146)
    at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:321)
    at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:79)
    at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:628)
    at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:843)
    at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:110)
    at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Executing Hadoop job failure
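For anyone debugging the same failure: the zip classes simply write to whatever file the underlying FileOutputStream was opened on, and Java's own temp-file APIs consult the java.io.tmpdir system property, which defaults to /tmp on UNIX. A minimal sketch (illustrative only, not Hadoop code) to check where the JVM puts its temp files:

```java
import java.io.File;
import java.io.IOException;

public class TmpDirDemo {
    public static void main(String[] args) throws IOException {
        // The JVM's notion of the temp directory comes from the
        // java.io.tmpdir system property (typically /tmp on UNIX).
        System.out.println(System.getProperty("java.io.tmpdir"));

        // Files created via createTempFile land there unless a
        // target directory is passed explicitly.
        File f = File.createTempFile("demo", ".zip");
        System.out.println(f.getParent());
        f.delete();
    }
}
```

Running this with -Djava.io.tmpdir=/some/other/dir shows the temp file moving accordingly.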




      

Re: Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by "Oliver B. Fischer" <o....@swe-blog.net>.
Hello Steve,

I assume that java.io.FileOutputStream uses /tmp as its temp directory. As you
can see, the error occurs in a native method. As far as I know, /tmp is the
standard temp directory on UNIX systems and is used automatically by many
native library calls. Maybe you can set $TMPDIR
(http://en.wikipedia.org/wiki/TMPDIR) to point to another directory?

Best regards,

Oliver

Steve Gao schrieb:
> 
> The hadoop version is 0.18.3 . Recently we got "out of space" issue. It's from "java.util.zip.ZipOutputStream".
> We found that /tmp is full and after cleaning /tmp the problem is solved.
> 
> However why hadoop needs to use /tmp? We had already configured hadoop tmp to a local disk in: hadoop-site.xml
> 
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value> ... some large local disk ... </value>
> </property>
> 
> 
> Could it because java.util.zip.ZipOutputStream uses /tmp even if we configured hadoop.tmp.dir to a large local disk?
> 


-- 
Oliver B. Fischer, Schönhauser Allee 64, 10437 Berlin
Tel. +49 30 44793251, Mobil: +49 178 7903538
Mail: o.b.fischer@swe-blog.net Blog: http://www.swe-blog.net




Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by James Cipar <jc...@andrew.cmu.edu>.
I would agree with removing it from the default build for now.

I only used thrift because that's what we were using for all of the
RPC at the time.  I'd rather that we just settle on one RPC to rule
them all, and I will change the code accordingly.


On Aug 28, 2009, at 3:04 PM, Steve Gao wrote:

> Thanks, Brian. Would you tell me what is the filename of the code  
> snippet?
>
> --- On Fri, 8/28/09, Brian Bockelman <bb...@cse.unl.edu> wrote:
>
> From: Brian Bockelman <bb...@cse.unl.edu>
> Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use / 
> tmp?
> To: common-user@hadoop.apache.org
> Date: Friday, August 28, 2009, 2:37 PM
>
> Actually, poking the code, it seems that the streaming package does  
> set this value:
>
>     String tmp = jobConf_.get("stream.tmpdir"); //, "/tmp/$ 
> {user.name}/"
>
> Try setting stream.tmpdir to a different directory maybe?
>
> Brian
>
> On Aug 28, 2009, at 1:31 PM, Steve Gao wrote:
>
>> Thanks lot, Brian. It seems to be a design flaw of hadoop that it  
>> can not manage (or pass in) the temp of "java.util.zip". Can we  
>> create a jira ticket for this?
>>
>> --- On Fri, 8/28/09, Brian Bockelman <bb...@cse.unl.edu> wrote:
>>
>> From: Brian Bockelman <bb...@cse.unl.edu>
>> Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to  
>> use /tmp?
>> To:
>> Cc: common-user@hadoop.apache.org
>> Date: Friday, August 28, 2009, 2:27 PM
>>
>> Hey Steve,
>>
>> Correct, java.util.zip.* does not necessarily respect hadoop  
>> settings.
>>
>> Try setting TMPDIR in the environment to your large local disk  
>> space.  It might respect that, if Java decides to act like a unix  
>> utility.
>>
>> http://en.wikipedia.org/wiki/TMPDIR
>>
>> Brian
>>
>> On Aug 28, 2009, at 1:19 PM, Steve Gao wrote:
>>
>>> would someone give us a hint? Thanks.
>>> Why "java.util.zip.ZipOutputStream" need to use /tmp?
>>>
>>> The hadoop version is 0.18.3 . Recently we got "out of space"  
>>> issue. It's from "java.util.zip.ZipOutputStream".
>>> We found that /tmp is full and after cleaning /tmp the problem is  
>>> solved.
>>>
>>> However why hadoop needs to use /tmp? We had already configured  
>>> hadoop tmp to a local disk in: hadoop-site.xml
>>>
>>> <property>
>>>     <name>hadoop.tmp.dir</name>
>>>     <value> ... some large local disk ... </value>
>>> </property>
>>>
>>>
>>> Could it because java.util.zip.ZipOutputStream uses /tmp even if  
>>> we configured hadoop.tmp.dir to a large local disk?
>>>


Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by James Cipar <jc...@andrew.cmu.edu>.
Sorry that last one, I replied to the wrong message.





Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
I saw this in:

>> org.apache.hadoop.streaming.StreamJob.packageJobJar

Brian

On Aug 28, 2009, at 2:04 PM, Steve Gao wrote:

> Thanks, Brian. Would you tell me what is the filename of the code  
> snippet?
>


Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by Steve Gao <st...@yahoo.com>.
Thanks, Brian. Would you tell me the filename of the code snippet?

--- On Fri, 8/28/09, Brian Bockelman <bb...@cse.unl.edu> wrote:

From: Brian Bockelman <bb...@cse.unl.edu>
Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
To: common-user@hadoop.apache.org
Date: Friday, August 28, 2009, 2:37 PM

Actually, poking the code, it seems that the streaming package does set this value:

    String tmp = jobConf_.get("stream.tmpdir"); //, "/tmp/${user.name}/"

Try setting stream.tmpdir to a different directory maybe?

Brian


Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Actually, poking the code, it seems that the streaming package does
set this value:

     String tmp = jobConf_.get("stream.tmpdir"); //, "/tmp/${user.name}/"

Try setting stream.tmpdir to a different directory maybe?

Brian
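
For later readers, the suggestion above might look like this on the command line. This is a hedged sketch: the jar name and paths are placeholders, and the 0.18-era streaming syntax used -jobconf (newer releases take -D instead):

```shell
# Point the streaming package's temp setting at a larger local
# disk when submitting the job (jar name and paths are examples).
hadoop jar hadoop-streaming.jar \
    -jobconf stream.tmpdir=/data/large-disk/tmp \
    -input /input -output /output \
    -mapper /bin/cat -reducer /bin/cat
```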

On Aug 28, 2009, at 1:31 PM, Steve Gao wrote:

> Thanks lot, Brian. It seems to be a design flaw of hadoop that it  
> can not manage (or pass in) the temp of "java.util.zip". Can we  
> create a jira ticket for this?
>
> --- On Fri, 8/28/09, Brian Bockelman <bb...@cse.unl.edu> wrote:
>
> From: Brian Bockelman <bb...@cse.unl.edu>
> Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use / 
> tmp?
> To:
> Cc: common-user@hadoop.apache.org
> Date: Friday, August 28, 2009, 2:27 PM
>
> Hey Steve,
>
> Correct, java.util.zip.* does not necessarily respect hadoop settings.
>
> Try setting TMPDIR in the environment to your large local disk  
> space.  It might respect that, if Java decides to act like a unix  
> utility.
>
> http://en.wikipedia.org/wiki/TMPDIR
>
> Brian
>
> On Aug 28, 2009, at 1:19 PM, Steve Gao wrote:
>
>> would someone give us a hint? Thanks.
>> Why "java.util.zip.ZipOutputStream" need to use /tmp?
>>
>> The hadoop version is 0.18.3 . Recently we got "out of space"  
>> issue. It's from "java.util.zip.ZipOutputStream".
>> We found that /tmp is full and after cleaning /tmp the problem is  
>> solved.
>>
>> However why hadoop needs to use /tmp? We had already configured  
>> hadoop tmp to a local disk in: hadoop-site.xml
>>
>> <property>
>>    <name>hadoop.tmp.dir</name>
>>    <value> ... some large local disk ... </value>
>> </property>
>>
>>
>> Could it because java.util.zip.ZipOutputStream uses /tmp even if we  
>> configured hadoop.tmp.dir to a large local disk?
>>


Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by Steve Gao <st...@yahoo.com>.
Thanks a lot, Brian. It seems to be a design flaw in Hadoop that it cannot manage (or pass in) the temp directory used by "java.util.zip". Can we create a JIRA ticket for this?

--- On Fri, 8/28/09, Brian Bockelman <bb...@cse.unl.edu> wrote:

From: Brian Bockelman <bb...@cse.unl.edu>
Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
To: 
Cc: common-user@hadoop.apache.org
Date: Friday, August 28, 2009, 2:27 PM

Hey Steve,

Correct, java.util.zip.* does not necessarily respect hadoop settings.

Try setting TMPDIR in the environment to your large local disk space.  It might respect that, if Java decides to act like a unix utility.

http://en.wikipedia.org/wiki/TMPDIR

Brian

On Aug 28, 2009, at 1:19 PM, Steve Gao wrote:

> would someone give us a hint? Thanks.
> Why "java.util.zip.ZipOutputStream" need to use /tmp?
> 
> The hadoop version is 0.18.3 . Recently we got "out of space" issue. It's from "java.util.zip.ZipOutputStream".
> We found that /tmp is full and after cleaning /tmp the problem is solved.
> 
> However why hadoop needs to use /tmp? We had already configured hadoop tmp to a local disk in: hadoop-site.xml
> 
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value> ... some large local disk ... </value>
> </property>
> 
> 
> Could it because java.util.zip.ZipOutputStream uses /tmp even if we configured hadoop.tmp.dir to a large local disk?
> 

Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Steve,

Correct, java.util.zip.* does not necessarily respect hadoop settings.

Try setting TMPDIR in the environment to your large local disk space.   
It might respect that, if Java decides to act like a unix utility.

http://en.wikipedia.org/wiki/TMPDIR

Brian
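
A caveat on the advice above: on Linux the JVM generally ignores TMPDIR and defaults java.io.tmpdir to /tmp unless told otherwise, so it may be safer to set the system property explicitly as well. A sketch, with an example directory:

```shell
# TMPDIR helps native tools, but the JVM's java.io.tmpdir must be
# overridden separately (directory is an example).
export TMPDIR=/data/large-disk/tmp
export HADOOP_OPTS="-Djava.io.tmpdir=/data/large-disk/tmp"
```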

On Aug 28, 2009, at 1:19 PM, Steve Gao wrote:

> would someone give us a hint? Thanks.
> Why does "java.util.zip.ZipOutputStream" need to use /tmp?
>
> The hadoop version is 0.18.3. Recently we hit an "out of space" issue
> coming from "java.util.zip.ZipOutputStream". We found that /tmp was
> full, and after cleaning /tmp the problem was solved.
>
> But why does hadoop need to use /tmp at all? We had already pointed
> hadoop's temp directory at a local disk in hadoop-site.xml:
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value> ... some large local disk ... </value>
> </property>
>
> Could it be that java.util.zip.ZipOutputStream uses /tmp even though we
> configured hadoop.tmp.dir to a large local disk?
>
> The error log is here FYI:
>
> java.io.IOException: No space left on device
> at java.io.FileOutputStream.write(Native Method)
> at java.util.zip.ZipOutputStream.writeInt(ZipOutputStream.java:445)
> at java.util.zip.ZipOutputStream.writeEXT(ZipOutputStream.java:362)
> at java.util.zip.ZipOutputStream.closeEntry(ZipOutputStream.java:220)
> at java.util.zip.ZipOutputStream.finish(ZipOutputStream.java:301)
> at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:146)
> at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:321)
> at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:79)
> at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:628)
> at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:843)
> at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:110)
> at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
> at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
> Executing Hadoop job failure


Who are the gurus in Hive and/or Hbase?

Posted by Gopal Gandhi <go...@yahoo.com>.
We are inviting gurus or major contributors to Hive and/or HBase (or anything related to Hadoop) to give us presentations about the products. Could you name a few names? The gurus must be in the Bay Area.
Thanks.


      


[Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?

Posted by Steve Gao <st...@yahoo.com>.
would someone give us a hint? Thanks.
Why does "java.util.zip.ZipOutputStream" need to use /tmp?

The hadoop version is 0.18.3. Recently we hit an "out of space" issue coming from "java.util.zip.ZipOutputStream". We found that /tmp was full, and after cleaning /tmp the problem was solved.

But why does hadoop need to use /tmp at all? We had already pointed hadoop's temp directory at a local disk in hadoop-site.xml:

<property>
  <name>hadoop.tmp.dir</name>
  <value> ... some large local disk ... </value>
</property>

Could it be that java.util.zip.ZipOutputStream uses /tmp even though we configured hadoop.tmp.dir to a large local disk?

The error log is here FYI:

java.io.IOException: No space left on device
at java.io.FileOutputStream.write(Native Method)
at java.util.zip.ZipOutputStream.writeInt(ZipOutputStream.java:445)
at java.util.zip.ZipOutputStream.writeEXT(ZipOutputStream.java:362)
at java.util.zip.ZipOutputStream.closeEntry(ZipOutputStream.java:220)
at java.util.zip.ZipOutputStream.finish(ZipOutputStream.java:301)
at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:146)
at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:321)
at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:79)
at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:628)
at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:843)
at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:110)
at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Executing Hadoop job failure




      


      
