Posted to mapreduce-issues@hadoop.apache.org by "Mayank Bansal (JIRA)" <ji...@apache.org> on 2012/06/18 20:09:43 UTC
[jira] [Created] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Mayank Bansal created MAPREDUCE-4349:
----------------------------------------
Summary: Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
Key: MAPREDUCE-4349
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.22.0, 1.0.3, trunk
Reporter: Mayank Bansal
Assignee: Mayank Bansal
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------
Attachment: PATCH-MAPREDUCE-4349-22-v3.patch
Thanks Konst for your comments.
I incorporated them in the latest patch.
Thanks,
Mayank
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396090#comment-13396090 ]
Mayank Bansal commented on MAPREDUCE-4349:
------------------------------------------
The Distributed Cache gives inconsistent results if archive files are deleted from the task tracker. The DC still thinks that it has the file even though the file has been deleted.
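To illustrate the failure mode described in this comment, here is a minimal JDK-only sketch. It is not Hadoop's cache manager: the class {{StaleCacheSketch}} and all of its methods are hypothetical, and a cached archive is modelled as a single file rather than an unpacked directory. It only shows how bookkeeping that is never checked against the local disk can hand back a path that no longer exists, and how checking existence before reuse avoids that.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not Hadoop's distributed cache implementation. */
class StaleCacheSketch {
  // Bookkeeping: cache key -> where the entry was localized on local disk.
  private final Map<String, File> localized = new HashMap<>();
  private final File cacheDir;
  int localizations = 0; // how many times an entry was (re)created on disk

  StaleCacheSketch(File cacheDir) {
    this.cacheDir = cacheDir;
  }

  /** Trusts the map only: returns a stale path if the file was deleted. */
  File getTrusting(String key) {
    File f = localized.get(key);
    return f != null ? f : localize(key);
  }

  /** Verifies the entry on disk and localizes it again if it is missing. */
  File getVerifying(String key) {
    File f = localized.get(key);
    return (f != null && f.exists()) ? f : localize(key);
  }

  // Stand-in for copying the entry from the shared file system to local disk.
  private File localize(String key) {
    File f = new File(cacheDir, key);
    try {
      Files.write(f.toPath(), ("contents of " + key).getBytes());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    localized.put(key, f);
    localizations++;
    return f;
  }

  static File newTempDir() {
    try {
      return Files.createTempDirectory("dc-sketch").toFile();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    StaleCacheSketch cache = new StaleCacheSketch(newTempDir());
    File first = cache.getTrusting("archive.zip");
    first.delete(); // simulate the archive vanishing from the task tracker
    System.out.println("trusting lookup exists:  "
        + cache.getTrusting("archive.zip").exists());
    System.out.println("verifying lookup exists: "
        + cache.getVerifying("archive.zip").exists());
  }
}
```

In this sketch the trusting lookup returns a path that no longer exists, while the verifying lookup recreates the entry. A consistency test for archives would check for the second behaviour.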
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
>
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419488#comment-13419488 ]
Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------
One simple suggestion. You should use {{FileUtil.fullyDelete}} instead of implementing {{delete()}} internally.
Once you do that, {{if (f2.exists())}} should become {{assertFalse(f2.exists())}}.
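A rough sketch of the delete-then-assert pattern suggested here. The assumptions are loud ones: in the real test the delete would be Hadoop's {{FileUtil.fullyDelete(File)}} and the check would be JUnit's {{assertFalse}}. So that the sketch runs without Hadoop or JUnit on the classpath, it substitutes a hypothetical JDK-only {{fullyDelete}} helper and a plain {{AssertionError}}. The class name and directory layout are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

class DeleteThenAssertSketch {
  /** JDK-only stand-in for Hadoop's FileUtil.fullyDelete(File): recursive delete. */
  static boolean fullyDelete(File f) {
    File[] children = f.listFiles(); // null when f is a regular file
    if (children != null) {
      for (File child : children) {
        if (!fullyDelete(child)) {
          return false;
        }
      }
    }
    return f.delete();
  }

  /** Creates <tmp>/sub/data.txt, mimicking an unpacked cache archive. */
  static File makeUnpackedArchive() {
    try {
      File root = Files.createTempDirectory("archive-sketch").toFile();
      File sub = new File(root, "sub");
      if (!sub.mkdir()) {
        throw new IOException("mkdir failed: " + sub);
      }
      Files.write(new File(sub, "data.txt").toPath(), "payload".getBytes());
      return root;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    File f2 = makeUnpackedArchive();
    fullyDelete(f2);
    // Unconditional check, playing the role of assertFalse(f2.exists()).
    if (f2.exists()) {
      throw new AssertionError("archive directory still exists");
    }
    System.out.println("archive directory removed");
  }
}
```

Because the recursive delete is expected to remove the whole tree, the check no longer needs to be conditional: if the directory survives, the test should fail rather than silently skip a branch.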
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-4349:
-------------------------------------------
Description: Add test to verify Distributed Cache consistency when cached archives are deleted. (was: I just committed this to branch 0.22.1.
Thank you, Mayank.)
I just committed this to branch 0.22.1.
Thank you, Mayank.
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Fix For: 0.22.1
>
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> Add test to verify Distributed Cache consistency when cached archives are deleted.
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422069#comment-13422069 ]
Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------
Should we close it or is it applicable to other versions?
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Fix For: 0.22.1
>
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> Add test to verify Distributed Cache consistency when cached archives are deleted.
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414285#comment-13414285 ]
Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------
I would rather integrate verification of archive files into {{testCacheConsistency()}} than create a new test case.
You can do
{code}
DistributedCache.addCacheFile(firstCacheFile ...
DistributedCache.addCacheArchive(firstCacheArchive ...
{code}
And then add verification for the archive along with the file.
I think it will be a smaller change, and definitely less code duplication.
Otherwise it will need refactoring to extract the common parts of the code into methods.
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Attachments: PATCH-MAPREDUCE-4349-22.patch
>
>
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421938#comment-13421938 ]
Hudson commented on MAPREDUCE-4349:
-----------------------------------
Integrated in Hadoop-Mapreduce-22-branch #111 (See [https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/111/])
MAPREDUCE-4349. Add test to verify Distributed Cache consistency when cached archives are deleted. Contributed by Mayank Bansal. (Revision 1365292)
Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365292
Files :
* /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
* /hadoop/common/branches/branch-0.22/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/filecache/TestTrackerDistributedCacheManager.java
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Fix For: 0.22.1
>
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> Add test to verify Distributed Cache consistency when cached archives are deleted.
[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------
Priority: Minor (was: Major)
Issue Type: Improvement (was: Bug)
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Attachments: PATCH-MAPREDUCE-4349-22.patch
>
>
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413217#comment-13413217 ]
Mayank Bansal commented on MAPREDUCE-4349:
------------------------------------------
MAPREDUCE-4342 fixes this issue. I will add the test case to verify the event.
Thanks,
Mayank
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
>
[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------
Attachment: PATCH-MAPREDUCE-4349-22-v2.patch
Thanks Konstantin for your comments.
Attaching patch after incorporating Konstantin's comments.
Thanks,
Mayank
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated MAPREDUCE-4349:
-------------------------------------------
Description:
I just committed this to branch 0.22.1.
Thank you, Mayank.
Fix Version/s: 0.22.1
Hadoop Flags: Reviewed
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Fix For: 0.22.1
>
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> I just committed this to branch 0.22.1.
> Thank you, Mayank.
[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------
Attachment: PATCH-MAPREDUCE-4349-22.patch
Attaching a unit test for the archive case.
Thanks,
Mayank
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Attachments: PATCH-MAPREDUCE-4349-22.patch
>
>
[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives
inconsistent result if cache Archive files get deleted from task tracker
Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421727#comment-13421727 ]
Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------
+1 the patch looks good.
I removed two unused imports. Will commit now.
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker
> -------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4349
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 0.22.0, 1.0.3, trunk
> Reporter: Mayank Bansal
> Assignee: Mayank Bansal
> Priority: Minor
> Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>