Posted to mapreduce-issues@hadoop.apache.org by "Mayank Bansal (JIRA)" <ji...@apache.org> on 2012/06/18 20:09:43 UTC

[jira] [Created] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Mayank Bansal created MAPREDUCE-4349:
----------------------------------------

             Summary: Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
                 Key: MAPREDUCE-4349
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.22.0, 1.0.3, trunk
            Reporter: Mayank Bansal
            Assignee: Mayank Bansal




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------

    Attachment: PATCH-MAPREDUCE-4349-22-v3.patch

Thanks, Konst, for your comments.

I have incorporated them in the latest patch.

Thanks,
Mayank
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>



        

[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396090#comment-13396090 ] 

Mayank Bansal commented on MAPREDUCE-4349:
------------------------------------------

Distributed Cache gives inconsistent results if archive files are deleted from the task tracker. The DC still believes it has the file even though the file has been deleted.
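
The failure mode described above can be pictured with a toy model. This is only a sketch under stated assumptions: the class StaleCacheSketch and its methods getNaive, getChecked and localize are hypothetical names invented for illustration, and none of this is Hadoop's actual Distributed Cache code. It contrasts a lookup that trusts its in-memory bookkeeping with one that re-checks the local disk.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

// Toy model only; not Hadoop's TrackerDistributedCacheManager.
public class StaleCacheSketch {

    // In-memory bookkeeping: cache key -> localized copy on disk.
    private final Map<String, File> localized = new HashMap<>();
    private final File localDir;

    StaleCacheSketch(File localDir) {
        this.localDir = localDir;
    }

    // Naive lookup: trusts the in-memory entry even if the file is gone.
    File getNaive(String key) throws IOException {
        File f = localized.get(key);
        if (f == null) {
            f = localize(key);
        }
        return f;
    }

    // Revalidating lookup: localizes again when the on-disk copy is missing.
    File getChecked(String key) throws IOException {
        File f = localized.get(key);
        if (f == null || !f.exists()) {
            f = localize(key);
        }
        return f;
    }

    // Pretend to fetch the resource by writing a small local file.
    private File localize(String key) throws IOException {
        File f = new File(localDir, key);
        Files.write(f.toPath(), ("contents of " + key).getBytes());
        localized.put(key, f);
        return f;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("dc-sketch").toFile();
        StaleCacheSketch cache = new StaleCacheSketch(dir);

        File first = cache.getNaive("archive.zip");
        first.delete(); // simulate the archive being deleted from the node

        // The naive lookup hands back a path that no longer exists.
        System.out.println("naive exists:   " + cache.getNaive("archive.zip").exists());
        // The revalidating lookup restores the local copy first.
        System.out.println("checked exists: " + cache.getChecked("archive.zip").exists());
    }
}
```

In this toy model the naive lookup returns a path whose file is missing, while the revalidating lookup recreates it; a test for the real cache would assert the second behaviour for archives as well as for plain files.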
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>



        

[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419488#comment-13419488 ] 

Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------

One simple suggestion: use {{FileUtil.fullyDelete}} instead of implementing {{delete()}} internally.
Once you do that, {{if (f2.exists())}} should become {{assertFalse(f2.exists())}}.
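
A rough illustration of this suggestion follows. It is a sketch under stated assumptions: so that it runs with the JDK alone, the fullyDelete helper below is a hypothetical stand-in for Hadoop's FileUtil.fullyDelete(File), and the class name FullyDeleteSketch and the file names are invented. The point is that after a full recursive delete, the existence check becomes an unconditional assertion instead of an if branch.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class FullyDeleteSketch {

    // Hypothetical JDK-only stand-in for Hadoop's FileUtil.fullyDelete(File):
    // deletes a file, or a directory together with everything under it.
    static boolean fullyDelete(File f) {
        File[] children = f.listFiles(); // null when f is not a directory
        if (children != null) {
            for (File child : children) {
                if (!fullyDelete(child)) {
                    return false;
                }
            }
        }
        return f.delete();
    }

    public static void main(String[] args) throws IOException {
        // Build a small tree standing in for an unpacked cache archive.
        File dir = Files.createTempDirectory("cache-archive").toFile();
        File nested = new File(dir, "lib");
        if (!nested.mkdir()) {
            throw new IOException("could not create " + nested);
        }
        Files.write(new File(nested, "data.txt").toPath(), "payload".getBytes());

        boolean deleted = fullyDelete(dir);

        // Unconditional check, playing the role of assertFalse(f2.exists()).
        if (!deleted || dir.exists()) {
            throw new AssertionError("directory should be gone: " + dir);
        }
        System.out.println("deleted=" + deleted + ", exists=" + dir.exists());
    }
}
```

In an actual Hadoop test the helper would be dropped in favour of the library call, and the final check would be JUnit's assertFalse.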
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22.patch
>
>



        

[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated MAPREDUCE-4349:
-------------------------------------------

    Description: Add test to verify Distributed Cache consistency when cached archives are deleted.  (was: I just committed this to branch 0.22.1.
Thank you, Mayank.)

I just committed this to branch 0.22.1.
Thank you, Mayank.
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>             Fix For: 0.22.1
>
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> Add test to verify Distributed Cache consistency when cached archives are deleted.


        

[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422069#comment-13422069 ] 

Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------

Should we close it or is it applicable to other versions?
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>             Fix For: 0.22.1
>
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> Add test to verify Distributed Cache consistency when cached archives are deleted.


        

[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414285#comment-13414285 ] 

Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------

I would rather integrate verification of archive files into {{testCacheConsistency()}} than create a new test case.
You can do
{code}
DistributedCache.addCacheFile(firstCacheFile ...
DistributedCache.addCacheArchive(firstCacheArchive ...
{code}
And then add verification for the archive along with the file.
I think it will be a smaller change, and definitely less code duplication.
Otherwise the test will need refactoring to extract the common parts of the code into methods.
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>         Attachments: PATCH-MAPREDUCE-4349-22.patch
>
>



        

[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421938#comment-13421938 ] 

Hudson commented on MAPREDUCE-4349:
-----------------------------------

Integrated in Hadoop-Mapreduce-22-branch #111 (See [https://builds.apache.org/job/Hadoop-Mapreduce-22-branch/111/])
    MAPREDUCE-4349. Add test to verify Distributed Cache consistency when cached archives are deleted. Contributed by Mayank Bansal. (Revision 1365292)

     Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365292
Files : 
* /hadoop/common/branches/branch-0.22/mapreduce/CHANGES.txt
* /hadoop/common/branches/branch-0.22/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/filecache/TestTrackerDistributedCacheManager.java

                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>             Fix For: 0.22.1
>
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> Add test to verify Distributed Cache consistency when cached archives are deleted.


        

[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------

      Priority: Minor  (was: Major)
    Issue Type: Improvement  (was: Bug)
    
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>         Attachments: PATCH-MAPREDUCE-4349-22.patch
>
>



        

[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413217#comment-13413217 ] 

Mayank Bansal commented on MAPREDUCE-4349:
------------------------------------------

MAPREDUCE-4342 fixes this issue. I will add a test case to verify this scenario.

Thanks,
Mayank
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>



        

[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------

    Attachment: PATCH-MAPREDUCE-4349-22-v2.patch

Thanks, Konstantin, for your comments.

Attaching an updated patch that incorporates them.

Thanks,
Mayank
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22.patch
>
>



        

[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated MAPREDUCE-4349:
-------------------------------------------

      Description: 
I just committed this to branch 0.22.1.
Thank you, Mayank.
    Fix Version/s: 0.22.1
     Hadoop Flags: Reviewed
    
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>             Fix For: 0.22.1
>
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>
> I just committed this to branch 0.22.1.
> Thank you, Mayank.


        

[jira] [Updated] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Mayank Bansal (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mayank Bansal updated MAPREDUCE-4349:
-------------------------------------

    Attachment: PATCH-MAPREDUCE-4349-22.patch

Attaching a unit test for the archive case.

Thanks,
Mayank
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>         Attachments: PATCH-MAPREDUCE-4349-22.patch
>
>



        

[jira] [Commented] (MAPREDUCE-4349) Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421727#comment-13421727 ] 

Konstantin Shvachko commented on MAPREDUCE-4349:
------------------------------------------------

+1 the patch looks good.
I removed two unused imports. Will commit now.
                
> Distributed Cache gives inconsistent result if cache Archive files get deleted from task tracker 
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4349
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4349
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>            Priority: Minor
>         Attachments: PATCH-MAPREDUCE-4349-22-v2.patch, PATCH-MAPREDUCE-4349-22-v3.patch, PATCH-MAPREDUCE-4349-22.patch
>
>

