Posted to dev@pig.apache.org by "niraj rai (JIRA)" <ji...@apache.org> on 2010/08/05 20:11:17 UTC

[jira] Updated: (PIG-103) Shared Job /tmp location should be configurable

     [ https://issues.apache.org/jira/browse/PIG-103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

niraj rai updated PIG-103:
--------------------------

    Attachment: conf_tmp_dir.patch

This patch makes the Pig temporary directory used for intermediate data configurable.
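
Once such a property exists, pointing the intermediate-data location at a shared path instead of the hard-coded /tmp becomes a one-line configuration change. A minimal usage sketch follows; the property name pig.temp.dir is only an assumption here, since the attached patch, not this message, defines the actual key:

    # conf/pig.properties -- assumed property name, for illustration only
    pig.temp.dir=/shared/pig-tmp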

> Shared Job /tmp location should be configurable
> -----------------------------------------------
>
>                 Key: PIG-103
>                 URL: https://issues.apache.org/jira/browse/PIG-103
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>         Environment: Partially shared file:// filesystem (eg NFS)
>            Reporter: Craig Macdonald
>            Assignee: niraj rai
>             Fix For: 0.8.0
>
>         Attachments: conf_tmp_dir.patch
>
>
> Hello,
> I'm investigating running Pig in an environment where various parts of the file:// filesystem are available on all nodes. I can tell Hadoop to use a file:// filesystem location as its default by setting fs.default.name=file://path/to/shared/folder
> However, this creates issues for Pig, because Pig writes its job information into a folder that it assumes is on a shared FS (e.g. DFS). In this scenario, though, /tmp is not shared between machines.
> So either /tmp should be configurable, or Hadoop should tell you the actual full location set in fs.default.name.
> A straightforward solution is to make "/tmp/" a property read in init(PigContext) in src/org/apache/pig/impl/io/FileLocalizer.java (see the sketch after this quoted description).
> Any suggestions for property names?
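
Sketching the change proposed in the quoted description above: read the temp-dir root from the Pig properties instead of hard-coding "/tmp/". This is a minimal, self-contained illustration, not the attached patch; the property name "pig.temp.dir", the class name, and the helper method are assumptions for illustration only.

    import java.util.Properties;

    // Standalone sketch of a configurable temp-dir root (not the actual patch).
    public class TempDirSketch {
        // Assumed property key; the real patch may use a different name.
        static final String TEMP_DIR_PROPERTY = "pig.temp.dir";
        static final String DEFAULT_TEMP_DIR = "/tmp/";

        // Return the configured root, falling back to the old hard-coded "/tmp/"
        // so existing deployments keep their current behaviour.
        static String tempRoot(Properties pigProps) {
            return pigProps.getProperty(TEMP_DIR_PROPERTY, DEFAULT_TEMP_DIR);
        }

        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty(TEMP_DIR_PROPERTY, "/shared/pig-tmp/");
            System.out.println(tempRoot(props)); // prints /shared/pig-tmp/
        }
    }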

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.