Posted to dev@pig.apache.org by "Richard Ding (JIRA)" <ji...@apache.org> on 2011/04/08 00:29:05 UTC

[jira] [Created] (PIG-1977) "Stream closed" error while reading Pig temp files (results of intermediate jobs)

"Stream closed" error while reading Pig temp files (results of intermediate jobs)
---------------------------------------------------------------------------------

                 Key: PIG-1977
                 URL: https://issues.apache.org/jira/browse/PIG-1977
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.8.0
            Reporter: Richard Ding
            Assignee: Richard Ding
             Fix For: 0.9.0, 0.8.0


In certain cases, when compression of temporary files is enabled, Pig scripts fail with the following exception:

{code}
java.io.IOException: Stream closed
	at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:189)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
	at java.io.DataInputStream.readByte(DataInputStream.java:248)
	at org.apache.hadoop.io.file.tfile.Utils.readVLong(Utils.java:196)
	at org.apache.hadoop.io.file.tfile.Utils.readVInt(Utils.java:168)
	at org.apache.hadoop.io.file.tfile.Chunk$ChunkDecoder.readLength(Chunk.java:103)
	at org.apache.hadoop.io.file.tfile.Chunk$ChunkDecoder.checkEOF(Chunk.java:124)
	at org.apache.hadoop.io.file.tfile.Chunk$ChunkDecoder.close(Chunk.java:190)
	at java.io.FilterInputStream.close(FilterInputStream.java:155)
	at org.apache.pig.impl.io.TFileRecordReader.nextKeyValue(TFileRecordReader.java:85)
	at org.apache.pig.impl.io.TFileStorage.getNext(TFileStorage.java:76)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:474)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:676)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:336)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:242)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:236)
{code}

The workaround is to turn off temp-file compression (pig.tmpfilecompression=false).
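The property named in the report can be set without code changes; a minimal sketch, assuming the standard Pig property mechanisms (pig.properties, -D on the command line, or the Grunt set command):

{code}
-- Inside a Pig script, via the Grunt 'set' command:
set pig.tmpfilecompression false;

-- Equivalently, in conf/pig.properties or on the command line:
--   pig.tmpfilecompression=false
--   pig -Dpig.tmpfilecompression=false myscript.pig
{code}

Note this trades the bug for larger intermediate files between MapReduce jobs, so it is a stopgap until the fix lands.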



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (PIG-1977) "Stream closed" error while reading Pig temp files (results of intermediate jobs)

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017734#comment-13017734 ] 

Thejas M Nair commented on PIG-1977:
------------------------------------

Looks good. +1


[jira] [Resolved] (PIG-1977) "Stream closed" error while reading Pig temp files (results of intermediate jobs)

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding resolved PIG-1977.
-------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

Unit tests pass. Patch committed to trunk and 0.8 branch.


[jira] [Updated] (PIG-1977) "Stream closed" error while reading Pig temp files (results of intermediate jobs)

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1977:
------------------------------

    Attachment: PIG-1977.patch

TFile stores records in a chunk-encoded format. After reading a record, the cursor must be moved to the end of the record; otherwise the reader is left mid-chunk and the next read fails.
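The general pattern behind the fix can be illustrated with plain streams. This is a hedged sketch, not the actual patch: the class ChunkSkipSketch, its readRecordPrefix helper, and the simple length-prefixed framing are all hypothetical stand-ins for TFile's chunk encoding, chosen only to show why a reader must consume the remainder of the current record before advancing.

```java
import java.io.*;

// Sketch: records are length-prefixed chunks inside one stream. A reader
// that consumes only part of a record must skip the leftover bytes before
// the next read, or the underlying cursor is stranded mid-chunk.
public class ChunkSkipSketch {

    // Reads up to prefixBytes of one length-prefixed record, then advances
    // the cursor past any unread remainder of that record.
    static byte[] readRecordPrefix(DataInputStream in, int prefixBytes)
            throws IOException {
        int len = in.readInt();                       // record length
        byte[] head = new byte[Math.min(prefixBytes, len)];
        in.readFully(head);                           // consume only the prefix
        long remaining = len - head.length;
        while (remaining > 0) {                       // move to end of record
            long skipped = in.skip(remaining);
            if (skipped <= 0) {                       // skip() may report 0
                in.readByte();                        // fall back to a 1-byte read
                skipped = 1;
            }
            remaining -= skipped;
        }
        return head;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(5); out.write("hello".getBytes());
        out.writeInt(5); out.write("world".getBytes());

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        // Partially read the first record; the skip keeps the second readable.
        System.out.println(new String(readRecordPrefix(in, 2)));
        System.out.println(new String(readRecordPrefix(in, 5)));
    }
}
```

Without the skip loop, the second readInt would land inside the bytes of "hello" and misinterpret record boundaries, which is analogous to the ChunkDecoder ending up closed or mispositioned in the trace above.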
