Posted to common-dev@hadoop.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2008/09/16 15:00:44 UTC

[jira] Created: (HADOOP-4188) Remove Task's dependency on concrete file systems

Remove Task's dependency on concrete file systems
-------------------------------------------------

                 Key: HADOOP-4188
                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
             Project: Hadoop Core
          Issue Type: Sub-task
            Reporter: Tom White




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646526#action_12646526 ] 

Sharad Agarwal commented on HADOOP-4188:
----------------------------------------

The javadoc warning was due to HADOOP-4621.

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Updated: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-4188:
------------------------------------

     Component/s: mapred
    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed, Incompatible change])

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Commented: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633803#action_12633803 ] 

Owen O'Malley commented on HADOOP-4188:
---------------------------------------

Is it just the local file system, or is it something more specific? It is clearly OK for mapred to depend on core.

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>




[jira] Commented: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646263#action_12646263 ] 

Hadoop QA commented on HADOOP-4188:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12393617/4188_v2.patch
  against trunk revision 712615.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3568/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3568/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3568/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3568/console

This message is automatically generated.

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Commented: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645475#action_12645475 ] 

Sharad Agarwal commented on HADOOP-4188:
----------------------------------------

The Task.FileSystemCounter enum has the read/write counters for all concrete filesystems. So if we want Task to be completely agnostic of filesystems, the framework counters also need to be created dynamically somehow. Currently there is a static mapping between these in Task_FileSystemCounter.properties.
We would then need to associate counter names with filesystem URI schemes, which is not really possible with the current FileSystem#statisticsTable.
We can add a Map<String, Statistics> statsByUriScheme to FileSystem, as suggested by Doug, and use that. This map can be populated in the createFileSystem(URI uri, Configuration conf) call as:
statsByUriScheme.put(uri.getScheme(), fs.statistics);

Another very straightforward alternative, which looks good to me:
Just break the compile-time dependency on the concrete file system and use conf.getClassByName, since Task is aware of the concrete filesystems in Task.FileSystemCounter anyway. Making it fully agnostic would require refactoring Task.FileSystemCounter etc.
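
The first option (a statistics map keyed by URI scheme) might look roughly like the sketch below. It is a minimal, self-contained illustration only: SchemeStatistics and StatsRegistry are hypothetical stand-ins invented here, not the real FileSystem.Statistics class or anything taken from the attached patches, and the sketch assumes absolute URIs that carry a scheme.

```java
import java.net.URI;
import java.util.Collections;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for FileSystem.Statistics; not the real Hadoop class.
final class SchemeStatistics {
    private final AtomicLong bytesRead = new AtomicLong();
    private final AtomicLong bytesWritten = new AtomicLong();

    void incrementBytesRead(long n) { bytesRead.addAndGet(n); }
    void incrementBytesWritten(long n) { bytesWritten.addAndGet(n); }
    long getBytesRead() { return bytesRead.get(); }
    long getBytesWritten() { return bytesWritten.get(); }
}

// Hypothetical registry playing the role of the proposed statsByUriScheme map.
final class StatsRegistry {
    private static final Map<String, SchemeStatistics> STATS_BY_URI_SCHEME =
        new ConcurrentHashMap<>();

    private StatsRegistry() {}

    // Would be called at the point where a file system is created for a URI.
    // All instances with the same scheme share one statistics object.
    static SchemeStatistics forUri(URI uri) {
        String scheme = Objects.requireNonNull(
            uri.getScheme(), "URI must be absolute (have a scheme)");
        return STATS_BY_URI_SCHEME.computeIfAbsent(
            scheme, s -> new SchemeStatistics());
    }

    // Read-only view that a Task-like consumer could iterate over,
    // instead of naming concrete file system classes.
    static Map<String, SchemeStatistics> all() {
        return Collections.unmodifiableMap(STATS_BY_URI_SCHEME);
    }
}
```

With this shape, the consumer only ever sees scheme strings such as "hdfs" or "file", so it has no compile-time reference to any concrete file system.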

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>




[jira] Updated: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-4188:
----------------------------------

    Status: Open  (was: Patch Available)

I think that we should have a new constructor for FSInputStream (or use the long one that is already there) that takes a boolean for whether it needs to be checked, and use that in ChecksumFileSystem, so that you can't turn checksums on and off while it is being read.

Other than that, I think this patch looks good.
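
The constructor suggestion above amounts to fixing the verification flag at construction time. A minimal sketch of that idea follows, assuming a hypothetical CheckedStream class and a toy additive checksum; neither is the real FSInputStream or ChecksumFileSystem API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stream; not the real FSInputStream or ChecksumFileSystem API.
final class CheckedStream {
    private final byte[] data;
    private final int expectedSum;
    // Fixed at construction: there is deliberately no setter, so
    // verification cannot be switched on or off mid-read.
    private final boolean verifyChecksum;

    CheckedStream(byte[] data, int expectedSum, boolean verifyChecksum) {
        this.data = data.clone();
        this.expectedSum = expectedSum;
        this.verifyChecksum = verifyChecksum;
    }

    // Returns a copy of the data, verifying it first if the flag was set.
    byte[] readAll() {
        if (verifyChecksum && sum(data) != expectedSum) {
            throw new UncheckedIOException(new IOException("Checksum mismatch"));
        }
        return data.clone();
    }

    // Toy checksum (sum of unsigned byte values), used only for illustration.
    static int sum(byte[] bytes) {
        int total = 0;
        for (byte b : bytes) {
            total += b & 0xff;
        }
        return total;
    }
}
```

Because the flag is a final field, a caller who wants different behaviour has to open a new stream rather than mutate one that is already being read.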

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Updated: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated HADOOP-4188:
-----------------------------------

    Attachment: 4188_v2.patch

Changes from the last patch:
- Fixed statistics for LocalFileSystem. LocalFileSystem wraps RawLocalFileSystem, and statistics are updated only for RawLocalFileSystem, so LocalFileSystem#statistics should return the statistics object from RawLocalFileSystem. For this I have set the statistics in FilterFileSystem to the wrapped FileSystem's statistics.
- Deprecated FileSystem#getStatistics(Class<? extends FileSystem> cls)
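
The LocalFileSystem fix described above can be illustrated with a simplified model. This is only a sketch: ByteStats, SimpleFs, RawFs and FilterFs are hypothetical classes standing in for the roles of Statistics, FileSystem, RawLocalFileSystem and FilterFileSystem, and they do not reproduce the real Hadoop code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model; these are not the real Hadoop classes.
class ByteStats {
    final AtomicLong bytesRead = new AtomicLong();
}

abstract class SimpleFs {
    protected ByteStats statistics = new ByteStats();

    ByteStats getStatistics() { return statistics; }

    abstract int read(int nBytes);
}

// Plays the role of RawLocalFileSystem: the only place counters are updated.
class RawFs extends SimpleFs {
    @Override
    int read(int nBytes) {
        statistics.bytesRead.addAndGet(nBytes);
        return nBytes;
    }
}

// Plays the role of FilterFileSystem: adopts the wrapped instance's
// statistics object, so reads through the wrapper are visible on the wrapper.
class FilterFs extends SimpleFs {
    private final SimpleFs wrapped;

    FilterFs(SimpleFs wrapped) {
        this.wrapped = wrapped;
        this.statistics = wrapped.statistics; // share the object, do not copy
    }

    @Override
    int read(int nBytes) { return wrapped.read(nBytes); }
}
```

Without the assignment in the FilterFs constructor, the wrapper would report its own, never-updated statistics object, which is the symptom the patch note describes.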

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Commented: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643952#action_12643952 ] 

Tom White commented on HADOOP-4188:
-----------------------------------

Task depends on HDFS's DistributedFileSystem for updating statistics. It also has a list of file systems that it updates statistics for; it should get this list from FileSystem. See https://issues.apache.org/jira/browse/HADOOP-3750?focusedCommentId=12615367#action_12615367

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>




[jira] Updated: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated HADOOP-4188:
-----------------------------------

    Attachment: 4188_v1.patch

This patch (not tested yet):
- adds a statistics table to the FileSystem class, indexed by URI scheme
- removes Task's dependency on concrete file systems by taking the list from the FileSystem class
- dynamically creates the file system counters in Task, of the form <URIScheme>_BYTES_READ/<URIScheme>_BYTES_WRITE
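
The dynamic counter naming in the last bullet could be sketched as follows. FsCounterNames is a hypothetical helper invented for illustration, and upper-casing the scheme is an assumption made here; the patch description only gives the <URIScheme>_BYTES_READ/<URIScheme>_BYTES_WRITE pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; not taken from the attached patches.
final class FsCounterNames {
    private FsCounterNames() {}

    // Builds the read/write counter names for one URI scheme.
    // Upper-casing the scheme is an assumption of this sketch.
    static String[] forScheme(String uriScheme) {
        String prefix = uriScheme.toUpperCase(Locale.ROOT);
        return new String[] { prefix + "_BYTES_READ", prefix + "_BYTES_WRITE" };
    }

    // Builds counter names for every scheme known to the statistics table,
    // in iteration order, read counter first.
    static List<String> forSchemes(Iterable<String> schemes) {
        List<String> names = new ArrayList<>();
        for (String scheme : schemes) {
            for (String name : forScheme(scheme)) {
                names.add(name);
            }
        }
        return names;
    }
}
```

The point of deriving the names from the scheme is that adding a new file system requires no change to an enum or properties file on the Task side.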

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>         Attachments: 4188_v1.patch
>
>




[jira] Commented: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646184#action_12646184 ] 

Sharad Agarwal commented on HADOOP-4188:
----------------------------------------

> Do we need to support (and deprecate) the old counter names

We don't need to, as these counters are not public.

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>         Attachments: 4188_v1.patch
>
>




[jira] Updated: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated HADOOP-4188:
-----------------------------------

    Fix Version/s: 0.20.0
           Status: Patch Available  (was: Open)

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Resolved: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das resolved HADOOP-4188.
---------------------------------

      Resolution: Fixed
    Hadoop Flags: [Incompatible change, Reviewed]

I just committed this. Thanks, Sharad!

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Updated: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Robert Chansler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-4188:
------------------------------------

    Release Note: Removed Task's dependency on concrete file systems by taking list from FileSystem class. Added statistics table to FileSystem class. Deprecated FileSystem method getStatistics(Class<? extends FileSystem> cls).
    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed, Incompatible change])

Edit release note for publication.

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>




[jira] Commented: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645846#action_12645846 ] 

Tom White commented on HADOOP-4188:
-----------------------------------

Looks good. This change renames the counters. Do we need to support (and deprecate) the old counter names (i.e. do they form a part of the public API) while introducing the new counter names, or is it acceptable to mark this as an incompatible change?

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>         Attachments: 4188_v1.patch
>
>




[jira] Assigned: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal reassigned HADOOP-4188:
--------------------------------------

    Assignee: Sharad Agarwal

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>




[jira] Updated: (HADOOP-4188) Remove Task's dependency on concrete file systems

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-4188:
----------------------------------

    Comment: was deleted

> Remove Task's dependency on concrete file systems
> -------------------------------------------------
>
>                 Key: HADOOP-4188
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4188
>             Project: Hadoop Core
>          Issue Type: Sub-task
>            Reporter: Tom White
>            Assignee: Sharad Agarwal
>             Fix For: 0.20.0
>
>         Attachments: 4188_v1.patch, 4188_v2.patch
>
>

