Posted to dev@pig.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2013/02/20 19:59:12 UTC

[jira] [Comment Edited] (PIG-3169) Remove temporary files that are not needed

    [ https://issues.apache.org/jira/browse/PIG-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582412#comment-13582412 ] 

Cheolsoo Park edited comment on PIG-3169 at 2/20/13 6:57 PM:
-------------------------------------------------------------

The issue is that the test cases generate input files under /tmp and use them across MR jobs:
{code}
pig.registerQuery("A = LOAD '" + Util.generateURI(tmpFile.toString(), pig.getPigContext()) + "';");
pig.registerQuery("Split A into A1 if $0<=10, A2 if $0>10;");
pig.registerQuery("Store A1 into '" + FileLocalizer.getTemporaryPath(pigContext) + "';");
pig.openIterator("A2");
{code}
The "store A1" is the 1st job, and "openIterator(A2)" is the 2nd job. Since the input file was deleted after the 1st job, "openIterator(A2)" fails to load it.

Attached is a patch that removes "deleteTempFiles()" from PigServer. I think having it only in GruntParser serves the original motivation of this jira. Please let me know if anyone thinks otherwise.
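
For illustration only, the failure mode can be sketched without any Pig dependency. Everything in the block below is hypothetical: FakeServer, runJob, and secondJobFails are invented stand-ins rather than Pig APIs, and a "job" is reduced to reading a local temporary file. The only idea borrowed from this issue is that cleaning up temporary files after every job can break a later job that reads the same input, whereas cleaning up once at the end of the session does not.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class TempFileLifetimeSketch {

    /** Hypothetical stand-in for a server that tracks temporary files (not a Pig class). */
    static class FakeServer {
        private final List<Path> tempFiles = new ArrayList<>();
        private final boolean deleteAfterEachJob;

        FakeServer(boolean deleteAfterEachJob) {
            this.deleteAfterEachJob = deleteAfterEachJob;
        }

        /** Creates a temporary input file and registers it for later cleanup. */
        Path createTempInput(String content) throws IOException {
            Path p = Files.createTempFile("input", ".txt");
            Files.write(p, content.getBytes(StandardCharsets.UTF_8));
            tempFiles.add(p);
            return p;
        }

        /** A "job" here just reads its input; cleanup optionally runs right after it. */
        List<String> runJob(Path input) throws IOException {
            try {
                return Files.readAllLines(input);
            } finally {
                if (deleteAfterEachJob) {
                    deleteTempFiles();
                }
            }
        }

        /** Deletes every registered temporary file. */
        void deleteTempFiles() throws IOException {
            for (Path p : tempFiles) {
                Files.deleteIfExists(p);
            }
            tempFiles.clear();
        }
    }

    /** Runs two jobs over the same temporary input and reports whether the 2nd one fails. */
    static boolean secondJobFails(boolean deleteAfterEachJob) {
        FakeServer server = new FakeServer(deleteAfterEachJob);
        try {
            Path input = server.createTempInput("5\n20\n");
            server.runJob(input);            // 1st job, analogous to "store A1"
            try {
                server.runJob(input);        // 2nd job, analogous to "openIterator(A2)"
                return false;
            } catch (NoSuchFileException e) {
                return true;                 // the input was removed by the per-job cleanup
            } finally {
                server.deleteTempFiles();    // end-of-session cleanup
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cleanup after each job, 2nd job fails: " + secondJobFails(true));
        System.out.println("cleanup at session end, 2nd job fails: " + secondJobFails(false));
    }
}
```

In this sketch the second job fails only when cleanup runs after each job. Deferring cleanup to the end of the session is loosely analogous to keeping the cleanup call only in GruntParser.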
                
      was (Author: cheolsoo):
    The issue is that the test cases generate input files under /tmp and use them across MR jobs:
{code}
pig.registerQuery("A = LOAD '" + Util.generateURI(tmpFile.toString(), pig.getPigContext()) + "';");
pig.registerQuery("Split A into A1 if $0<=10, A2 if $0>10;");
pig.registerQuery("Store A1 into '" + FileLocalizer.getTemporaryPath(pigContext) + "';");
pig.openIterator("A2");
{code}
The "store A1" is the 1st job, and "openIterator(A2)" is the 2nd job. Since the input file was deleted after the 1st job, "openIterator(A2)" fails to load it.

Attached is a patch that remove "deleteTempFiles()" from PigServer. I think having it only in GruntParser servers the original motivation of this jira. Please let me know anyone thinks otherwise.
                  
> Remove temporary files that are not needed
> ------------------------------------------
>
>                 Key: PIG-3169
>                 URL: https://issues.apache.org/jira/browse/PIG-3169
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Mark Wagner
>            Assignee: Mark Wagner
>            Priority: Minor
>             Fix For: 0.12
>
>         Attachments: PIG-3169.1.patch, PIG-3169-hotfix.patch
>
>
> When using Grunt, intermediate data and distributed cache files are left in 'pig.temp.dir' until the session is closed. It would be nice to clean up files once they are no longer needed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira