Posted to common-dev@hadoop.apache.org by "Nigel Daley (JIRA)" <ji...@apache.org> on 2007/07/16 23:02:04 UTC

[jira] Created: (HADOOP-1619) FSInputChecker attempts to seek past EOF

FSInputChecker attempts to seek past EOF
----------------------------------------

                 Key: HADOOP-1619
                 URL: https://issues.apache.org/jira/browse/HADOOP-1619
             Project: Hadoop
          Issue Type: Bug
          Components: fs
    Affects Versions: 0.14.0
            Reporter: Nigel Daley
            Priority: Blocker
             Fix For: 0.14.0


I'm not sure which class in the stack trace below is responsible for attempting to seek past the end of file. 

2007-07-16 20:31:40,598 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200707162028_0014_m_000000_0: java.io.IOException: Cannot seek after EOF
	at org.apache.hadoop.dfs.DFSClient$DFSInputStream.seek(DFSClient.java:1040)
	at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:37)
	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:188)
	at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:234)
	at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
	at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
	at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
	at org.apache.hadoop.fs.FSInputChecker.readFully(FSInputChecker.java:353)
	at org.apache.hadoop.fs.FSInputChecker.seek(FSInputChecker.java:331)
	at org.apache.hadoop.fs.FSInputChecker.skip(FSInputChecker.java:306)
	at java.io.FilterInputStream.skip(FilterInputStream.java:125)
	at java.io.FilterInputStream.skip(FilterInputStream.java:125)
	at com.yahoo.pig.impl.io.InputStreamPosition.skip(InputStreamPosition.java:55)
	at java.io.BufferedInputStream.skip(BufferedInputStream.java:349)
	at java.io.FilterInputStream.skip(FilterInputStream.java:125)
	at com.yahoo.pig.impl.builtin.RandomSampleLoader.getNext(RandomSampleLoader.java:34)
	at com.yahoo.pig.impl.mapreduceExec.PigInputFormat$PigRecordReader.next(PigInputFormat.java:169)
	at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:171)
	at com.yahoo.pig.impl.mapreduceExec.PigMapReduce.run(PigMapReduce.java:98)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1771)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513039 ] 

Raghu Angadi commented on HADOOP-1619:
--------------------------------------

Can someone familiar with PigInputFormat comment on whether it is actually trying to skip beyond the file length? It might be relying on the return value of skip() to identify EOF (nothing wrong with that). But with HADOOP-1470, skip() is implemented as a wrapper over seek(), which throws an EOFException when we try to seek past EOF.

If the skip in the Pig package is supposed to be valid, then we can check whether there is some other bug.
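
For illustration, a caller that relies on the return value of skip() to detect EOF might look like the sketch below. This is only an assumption about the general pattern: the class and method names (SkipUntilEof, skipUpTo) are hypothetical and are not taken from the Pig or Hadoop source, and a ByteArrayInputStream stands in for the real stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not Pig or Hadoop code.
class SkipUntilEof {
    // Tries to skip n bytes and returns how many were actually skipped.
    // A count smaller than n means EOF was reached; no exception is expected.
    static long skipUpTo(InputStream in, long n) throws IOException {
        long done = 0;
        while (done < n) {
            long k = in.skip(n - done);
            if (k > 0) {
                done += k;
            } else if (in.read() < 0) {
                break;      // skip() made no progress and read() confirms EOF
            } else {
                done++;     // read() consumed one byte instead
            }
        }
        return done;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(skipUpTo(in, 4));    // 4 bytes skipped
        System.out.println(skipUpTo(in, 100));  // only 6 bytes were left
    }
}
```

A caller written this way would break if skip() started throwing at EOF instead of returning a short count, which is the situation the stack trace suggests.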





[jira] Assigned: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley reassigned HADOOP-1619:
-----------------------------------

    Assignee: Hairong Kuang



[jira] Commented: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513294 ] 

Devaraj Das commented on HADOOP-1619:
-------------------------------------

Here is a similar exception I saw during a regular sort run on the current trunk (as of July 17, 2007). It happened while map output was being served from the local filesystem.

Map output lost, rescheduling: getMapOutput(task_200707171649_0001_m_000996_0,411) failed :
java.io.IOException: Illegal seek
	at sun.nio.ch.FileChannelImpl.position0(Native Method)
	at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:266)
	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.getPos(RawLocalFileSystem.java:96)
	at org.apache.hadoop.fs.BufferedFSInputStream.getPos(BufferedFSInputStream.java:48)
	at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:41)
	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:196)
	at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:234)
	at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
	at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
	at java.io.DataInputStream.read(DataInputStream.java:132)
	at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1980)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)



[jira] Commented: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514349 ] 

Raghu Angadi commented on HADOOP-1619:
--------------------------------------

+1.

For some reason I did not make skip() in DFSClient.java synchronized. Could you add that to this patch? Thanks.



[jira] Updated: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-1619:
----------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513048 ] 

Nigel Daley commented on HADOOP-1619:
-------------------------------------

Regardless of PigInputFormat, skip should not throw an EOFException. The InputStream.skip contract is that it returns the actual number of bytes skipped.
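
As a small illustration of that contract, the sketch below uses the standard java.io.ByteArrayInputStream rather than any Hadoop stream. The class name SkipContractDemo is made up for this example.

```java
import java.io.ByteArrayInputStream;

// Illustration of the java.io skip() contract on a 10-byte in-memory stream.
class SkipContractDemo {
    // Returns the values reported by three successive skip() calls.
    static long[] skipResults() {
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[10]);
        long first = in.skip(4);     // request fully satisfied
        long second = in.skip(100);  // fewer bytes remain than requested
        long third = in.skip(1);     // stream is already at EOF
        return new long[] {first, second, third};
    }

    public static void main(String[] args) {
        for (long n : skipResults()) {
            System.out.println(n);   // prints 4, then 6, then 0
        }
    }
}
```

None of the three calls throws; running past the end of the data only shows up as a smaller return value.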








[jira] Updated: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-1619:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Hairong!



[jira] Updated: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-1619:
----------------------------------

    Attachment: skip.patch

skip() in DFSClient is now synchronized.



[jira] Commented: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514393 ] 

Hadoop QA commented on HADOOP-1619:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12362273/skip.patch applied and successfully tested against trunk revision r558243.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/448/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/448/console



[jira] Updated: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-1619:
----------------------------------

    Attachment: skip.patch

Devaraj's error case is different from the reported error. Devaraj, please file a separate jira if you see your error again.

The attached patch fixes the reported error as follows:
1. FSInputChecker allows both seek & skip to go past EOF without throwing an EOFException;
2. Both ChecksumFileSystem and dfs disallow seek & skip past EOF; skip returns the actual number of bytes skipped. If the request would pass EOF, it returns fewer bytes than requested;
3. Add unit test cases to make sure skip returns the right value and does not throw an EOFException;
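
The behavior described in item 2 can be sketched roughly as below. This is not the contents of skip.patch: ClampedSkipStream is a hypothetical in-memory stand-in for a seekable file stream, and treating a seek to exactly the end of the data as legal is an assumption made for the example.

```java
import java.io.IOException;

// Hypothetical in-memory stand-in for a seekable stream; not the Hadoop classes.
class ClampedSkipStream {
    private final byte[] data;
    private long pos;

    ClampedSkipStream(byte[] data) {
        this.data = data;
    }

    synchronized long getPos() {
        return pos;
    }

    // Seeking beyond the end of the data is rejected.
    synchronized void seek(long target) throws IOException {
        if (target < 0 || target > data.length) {
            throw new IOException("Cannot seek after EOF");
        }
        pos = target;
    }

    // skip() is built on seek() but clamps the distance to the bytes that
    // remain, so it reports a short count instead of throwing at EOF.
    synchronized long skip(long n) throws IOException {
        if (n <= 0) {
            return 0;
        }
        long remaining = data.length - pos;
        long skipped = Math.min(n, remaining);
        seek(pos + skipped);
        return skipped;
    }
}
```

With a 10-byte array, skip(4) returns 4, a following skip(100) returns 6, and a further skip(1) returns 0, while seek(11) still throws. Both methods are synchronized in this sketch so that a skip and a concurrent seek on the same stream cannot interleave.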

> FSInputChecker attempts to seek past EOF
> ----------------------------------------
>
>                 Key: HADOOP-1619
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1619
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.14.0
>            Reporter: Nigel Daley
>            Assignee: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.14.0
>
>         Attachments: skip.patch
>
>



[jira] Commented: (HADOOP-1619) FSInputChecker attempts to seek past EOF

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513779 ] 

Raghu Angadi commented on HADOOP-1619:
--------------------------------------


The above two traces are very different. The second one is not expected: it is just doing a sequential read.

The first might or might not be expected with the current FSInputStream. I could not get hold of the Pig code that causes it (where is RandomSampleLoader.java?).

skip can be corrected to be in line with the standard contract of skip(), but that won't fix the second trace. Devaraj, please let us know if you see this again.
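
For reference, the standard contract of java.io.InputStream.skip() lets a stream skip fewer bytes than requested, so callers are expected to check the return value. A minimal sketch of a caller that honours this follows; the helper name SkipHelper.skipUpTo is hypothetical and is not part of Hadoop or Pig.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only.
class SkipHelper {
    // Try to skip n bytes, tolerating short skips, and stop quietly at EOF.
    // Returns the number of bytes actually skipped.
    static long skipUpTo(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                break;        // skip() made no progress and read() confirms EOF
            } else {
                remaining--;  // read() consumed one byte on our behalf
            }
        }
        return n - remaining;
    }
}
```

A caller written this way sees a short count at end of file instead of an exception, which is the behaviour the skip() contract leads it to expect.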


