Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2009/05/22 03:07:45 UTC

[jira] Created: (HADOOP-5892) Implement seek for HftpFileSystem

Implement seek for HftpFileSystem
---------------------------------

                 Key: HADOOP-5892
                 URL: https://issues.apache.org/jira/browse/HADOOP-5892
             Project: Hadoop Core
          Issue Type: Improvement
          Components: fs
            Reporter: dhruba borthakur
            Assignee: dhruba borthakur


Support seek in the HftpFileSystem. This is useful for applications that need to access data from a Hadoop cluster running a different version of Hadoop.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-5892) Implement seek for HftpFileSystem

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-5892:
-------------------------------------

    Attachment: hftpSeek.txt

The seek call reads and discards the unused bytes from the input stream. This can be made more efficient by reading in larger chunks rather than one byte at a time.
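
As a rough illustration of the chunked approach mentioned above (this is not the code in hftpSeek.txt; the class name ChunkedSkip, the method skipFully and the buffer size are assumptions made for this sketch), a forward-only skip could discard bytes through a reusable buffer:

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ChunkedSkip {
    /**
     * Discards exactly n bytes from the stream, reading up to bufSize bytes
     * per call instead of one byte at a time. Hypothetical helper.
     */
    public static void skipFully(InputStream in, long n, int bufSize) throws IOException {
        byte[] buf = new byte[bufSize];
        long remaining = n;
        while (remaining > 0) {
            int want = (int) Math.min(buf.length, remaining);
            int got = in.read(buf, 0, want);
            if (got < 0) {
                throw new EOFException("stream ended with " + remaining + " bytes left to skip");
            }
            remaining -= got;
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the HTTP response body.
        byte[] data = new byte[10000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 100);
        }
        InputStream in = new ByteArrayInputStream(data);
        skipFully(in, 7000, 4096);
        System.out.println("next byte after skip: " + in.read());
    }
}
```

This still transfers every skipped byte over the connection, so it reduces per-call overhead but not the amount of data read.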

> Implement seek for HftpFileSystem
> ---------------------------------
>
>                 Key: HADOOP-5892
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5892
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hftpSeek.txt
>
>
> Support seek in the HftpFileSystem. This is useful for applications that need to access data from a Hadoop cluster running a different version of Hadoop.



[jira] Commented: (HADOOP-5892) Implement seek for HftpFileSystem

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713263#action_12713263 ] 

Doug Cutting commented on HADOOP-5892:
--------------------------------------

This isn't really implementing seek(), but rather skip(), since it only goes forward.  Even then, it will be prohibitively slow for large files.

It would be better to implement a real seek.  This can be done by having the seek method close the existing connection and open a new connection that specifies a start position as a URL parameter.  Then change FileDataServlet to pass this parameter along, and StreamFile to perform a seek before reading data.
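
The client side of this could be sketched roughly as follows. This is only an illustration under stated assumptions: the class ReopeningSeekStream, the Opener interface and the query parameter name "start" are hypothetical, and the in-memory opener stands in for a real HTTP connection. The FileDataServlet and StreamFile changes are not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ReopeningSeekStream extends InputStream {
    /** Hypothetical hook: opens a fresh stream whose first byte is at the given file offset. */
    public interface Opener {
        InputStream openAt(long offset) throws IOException;
    }

    private final Opener opener;
    private InputStream in;
    private long pos;

    public ReopeningSeekStream(Opener opener) throws IOException {
        this.opener = opener;
        this.in = opener.openAt(0);
    }

    /** Seeks forward or backward by closing the current stream and reopening at target. */
    public void seek(long target) throws IOException {
        if (target < 0) {
            throw new IOException("negative seek offset: " + target);
        }
        if (target == pos) {
            return;
        }
        in.close();
        in = opener.openAt(target);
        pos = target;
    }

    public long getPos() {
        return pos;
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            pos += n;
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Appends the start offset as a query parameter; the name "start" is an assumption. */
    public static String urlWithStart(String fileUrl, long start) {
        return fileUrl + (fileUrl.contains("?") ? "&" : "?") + "start=" + start;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "0123456789".getBytes(StandardCharsets.US_ASCII);
        // In-memory stand-in for opening a new HTTP connection at an offset.
        Opener opener = offset -> {
            int from = (int) Math.min(offset, file.length);
            return new ByteArrayInputStream(file, from, file.length - from);
        };
        ReopeningSeekStream s = new ReopeningSeekStream(opener);
        s.seek(7);
        System.out.println("after seek(7): " + (char) s.read());
        s.seek(2); // backward seek also works, unlike a skip-based approach
        System.out.println("after seek(2): " + (char) s.read());
        s.close();
    }
}
```

Reopening costs one new connection per seek, but no skipped bytes are transferred, and backward seeks become possible.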




[jira] Updated: (HADOOP-5892) Implement seek for HftpFileSystem

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-5892:
-------------------------------------

    Component/s:     (was: fs)
                 dfs
