Posted to common-dev@hadoop.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2008/03/16 14:40:24 UTC
[jira] Created: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
ChecksumFileSystem needs to support the new delete method
---------------------------------------------------------
Key: HADOOP-3025
URL: https://issues.apache.org/jira/browse/HADOOP-3025
Project: Hadoop Core
Issue Type: Bug
Components: fs
Affects Versions: 0.17.0
Reporter: Devaraj Das
Assignee: dhruba borthakur
Priority: Blocker
Fix For: 0.17.0
The method FileSystem.delete(path) has been deprecated in favor of the new method delete(path, recursive). Temporary files get created in the MapReduce framework, and when the time comes to remove them, they are deleted via delete(path, recursive). This doesn't delete the associated checksum files. This has a big impact when the FileSystem is the InMemoryFileSystem, where space is at a premium and wasted space might hurt the performance of MapReduce jobs overall. One solution to this problem is to implement the method delete(path, recursive) in the ChecksumFileSystem, but is there a reason why it was left out as part of HADOOP-771?
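To make the leak concrete, here is a minimal sketch of a delete that also removes the checksum file. It is a toy model on the local disk using java.nio.file, not the Hadoop code or the eventual patch. The class ChecksumDeleteSketch and its methods are hypothetical names, and the ".name.crc" sibling naming is assumed for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy model of a checksummed store on the local disk; not Hadoop code. */
class ChecksumDeleteSketch {

    /** Assumed naming rule: data file "part-0" has checksum file ".part-0.crc". */
    static Path checksumFileFor(Path file) {
        return file.resolveSibling("." + file.getFileName() + ".crc");
    }

    /** Deletes a regular file together with its checksum file, if one exists. */
    static boolean deleteWithChecksum(Path file) throws IOException {
        // Remove the checksum first so no orphaned .crc is left behind.
        Files.deleteIfExists(checksumFileFor(file));
        return Files.deleteIfExists(file);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crc-sketch");
        Path data = Files.createFile(dir.resolve("part-0"));
        Path crc = Files.createFile(checksumFileFor(data));

        deleteWithChecksum(data);
        System.out.println("data exists: " + Files.exists(data));
        System.out.println("crc exists: " + Files.exists(crc));
        Files.delete(dir); // the directory is empty once both files are gone
    }
}
```

A delete that only removed the data file would leave the ".part-0.crc" entry behind, which is the wasted space described above.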
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-3025:
-------------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this. Thanks Mahadev!
[jira] Updated: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar updated HADOOP-3025:
----------------------------------
Status: Patch Available (was: Open)
[jira] Commented: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12581067#action_12581067 ]
Hudson commented on HADOOP-3025:
--------------------------------
Integrated in Hadoop-trunk #435 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/435/])
[jira] Commented: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579506#action_12579506 ]
Mahadev konar commented on HADOOP-3025:
---------------------------------------
This is a bug introduced by my patch to HADOOP-771. I will fix it as soon as possible.
[jira] Updated: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar updated HADOOP-3025:
----------------------------------
Attachment: HADOOP_3025_1.patch
Attaching a patch for this. I haven't added a unit test yet; I will add one shortly. Mukund, can you try with this patch?
[jira] Updated: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar updated HADOOP-3025:
----------------------------------
Attachment: HADOOP_3025_4.patch
Here is a patch with Nicholas's comments incorporated.
[jira] Commented: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580048#action_12580048 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-3025:
------------------------------------------------
+1
[jira] Assigned: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar reassigned HADOOP-3025:
-------------------------------------
Assignee: Mahadev konar (was: dhruba borthakur)
[jira] Updated: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar updated HADOOP-3025:
----------------------------------
Attachment: HADOOP_3025_3.patch
This patch includes a test and also removes the Windows line delimiters in the file.
[jira] Commented: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580288#action_12580288 ]
Hadoop QA commented on HADOOP-3025:
-----------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12378157/HADOOP_3025_4.patch
against trunk revision 619744.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 4 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler warnings.
release audit +1. The applied patch does not generate any new release audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1993/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1993/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1993/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1993/console
This message is automatically generated.
[jira] Commented: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579970#action_12579970 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-3025:
------------------------------------------------
- I think it is better to change FilterFileSystem.delete(Path f) (or even FileSystem.delete(Path f)) so that it calls FilterFileSystem.delete(f, true), instead of changing ChecksumFileSystem.delete(Path f).
- In ChecksumFileSystem.delete(Path f, boolean recursive), if f is a directory, it calls fs.delete(f, recursive). I think the checksum files won't be deleted.
- We need a test for deleting a tree for testing the recursive parameter.
- In RawInMemoryFileSystem.delete(Path f, boolean recursive), the recursive parameter is ignored.
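The first three review points can be sketched as follows. This is an illustrative toy on the local disk with java.nio.file, not the Hadoop classes or the committed patch. RecursiveDeleteSketch is a hypothetical name, the ".name.crc" sibling naming is assumed, and failing on a non-recursive delete of a non-empty directory is one possible behavior chosen for the sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Toy model of the two delete entry points; not Hadoop code. */
class RecursiveDeleteSketch {

    /** Old single-argument form: forwards to the new form with recursive = true. */
    static boolean delete(Path f) throws IOException {
        return delete(f, true);
    }

    /** New form: removes a file with its checksum, or a directory tree. */
    static boolean delete(Path f, boolean recursive) throws IOException {
        if (!Files.exists(f)) {
            return false;
        }
        if (Files.isDirectory(f)) {
            if (!recursive) {
                // Throws DirectoryNotEmptyException if the directory has entries.
                Files.delete(f);
                return true;
            }
            List<Path> entries;
            try (Stream<Path> walk = Files.walk(f)) {
                // Reverse order puts children before their parent directories.
                entries = walk.sorted(Comparator.reverseOrder())
                              .collect(Collectors.toList());
            }
            // Checksum files sit next to their data files, so the walk
            // removes them like any other entry in the tree.
            for (Path p : entries) {
                Files.delete(p);
            }
            return true;
        }
        // Plain file: remove the sibling checksum file first, then the file.
        Files.deleteIfExists(f.resolveSibling("." + f.getFileName() + ".crc"));
        Files.delete(f);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("tree-sketch");
        Path sub = Files.createDirectories(root.resolve("a").resolve("b"));
        Files.createFile(sub.resolve("part-0"));
        Files.createFile(sub.resolve(".part-0.crc"));

        delete(root); // same as delete(root, true)
        System.out.println("root exists: " + Files.exists(root));
    }
}
```

Keeping the forwarding in one place means a subclass only has to override the two-argument form to get consistent behavior from both entry points.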
[jira] Commented: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579988#action_12579988 ]
Mahadev konar commented on HADOOP-3025:
---------------------------------------
- It makes sense for FilterFileSystem to have it.
- This would work in our case, since the checksum file is in the same directory as the file.
- I'll add a test case.
- The raw in-memory file system has no notion of directories.
[jira] Updated: (HADOOP-3025) ChecksumFileSystem needs to support
the new delete method
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar updated HADOOP-3025:
----------------------------------
Attachment: HADOOP_3025_2.patch
This patch gets rid of an unnecessary exception thrown by delete.