Posted to user@hive.apache.org by Kiwon Lee <ki...@gmail.com> on 2012/08/28 16:23:05 UTC

Unexpected end of input stream

Hi

I have a lot of gzip-compressed files on HDFS.
An exception occurred on the TaskTracker while a MapReduce job was running.
If one of the files is invalid, how can I find out which one?


2012-08-28 09:17:56,320 INFO ExecMapper: ExecMapper: processed 0 rows: used memory = 125190136
2012-08-28 09:17:56,324 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-08-28 09:17:56,326 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:ubuntu (auth:SIMPLE) cause:java.io.IOException: java.io.EOFException: Unexpected end of input stream
2012-08-28 09:17:56,326 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: java.io.EOFException: Unexpected end of input stream
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:275)
        at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
        at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:210)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:195)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.io.EOFException: Unexpected end of input stream
        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:143)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
        at java.io.InputStream.read(InputStream.java:82)
        at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:160)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:38)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:273)
        ... 13 more


-- 

*Best Regards.** Ethan (Kiwon Lee)*
   kiwoni.lee@gmail.com

Re: Unexpected end of input stream

Posted by Jagat Singh <ja...@gmail.com>.
Hi,

I had the same error a few days back.

The difficulty is finding which gz file is corrupt. It may not be corrupt
as such, but somehow Hadoop says it is. If you created the file on Windows
and then transferred it to Hadoop, it can give this error. To see which
file is corrupt, run a select count query and watch the JobTracker for the
error; it will give the name of the gz file currently being processed, and
if it fails you can find and remove that file. You can then gzip it again
on a Linux machine and upload it, and it would work.
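
As a rough sketch of how the bad file could also be found outside of a job:
a truncated gzip file fails when it is decompressed to the end, so reading
each file fully and catching the error identifies it. The function names
below (gzip_stream_is_complete, find_bad_gzip_files) are made up for this
example, and it assumes the files are reachable as local paths or as a
byte stream; it is not part of Hive or Hadoop.

```python
import gzip
import zlib


def gzip_stream_is_complete(stream, chunk_size=1 << 20):
    """Return True if `stream` decompresses cleanly to the end of the gzip data."""
    try:
        with gzip.GzipFile(fileobj=stream, mode="rb") as gz:
            # Read and discard the output; only the absence of errors matters.
            while gz.read(chunk_size):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        # EOFError: data ends before the end-of-stream marker (truncated file).
        # OSError (includes gzip.BadGzipFile): bad header or CRC mismatch.
        # zlib.error: damaged deflate data.
        return False


def find_bad_gzip_files(paths):
    """Return the subset of local `paths` that fail the check above."""
    bad = []
    for path in paths:
        with open(path, "rb") as f:
            if not gzip_stream_is_complete(f):
                bad.append(path)
    return bad
```

For files that are already on HDFS, the output of "hadoop fs -cat <path>"
could be passed to gzip_stream_is_complete as a stream (for example through
a subprocess pipe); that part is an untested assumption and is left out of
the sketch.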

Thanks,


Re: Unexpected end of input stream

Posted by Bejoy KS <be...@yahoo.com>.
Hi Kiwon

You can get this information from the job details web page itself. Browse to your failed task, and there you can see which file/block it was processing when it failed with the error.

Regards
Bejoy KS


-----Original Message-----
From: Raihan Jamal <ja...@gmail.com>
Date: Tue, 28 Aug 2012 09:27:39 
To: <us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Re: Unexpected end of input stream

That basically means your data was not in the correct format when you moved
or copied it to HDFS. So one of the files is corrupted; you can find the
file name in your error logs.



*Raihan Jamal*





Re: Unexpected end of input stream

Posted by Raihan Jamal <ja...@gmail.com>.
That basically means your data was not in the correct format when you moved
or copied it to HDFS. So one of the files is corrupted; you can find the
file name in your error logs.



*Raihan Jamal*


