Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2010/02/03 18:11:28 UTC
[jira] Resolved: (PIG-1174) Creation of output path should be done by storage function
[ https://issues.apache.org/jira/browse/PIG-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates resolved PIG-1174.
-----------------------------
Resolution: Duplicate
> Creation of output path should be done by storage function
> ----------------------------------------------------------
>
> Key: PIG-1174
> URL: https://issues.apache.org/jira/browse/PIG-1174
> Project: Pig
> Issue Type: Bug
> Reporter: Bill Graham
> Fix For: 0.7.0
>
>
> When executing a STORE command, Pig creates the output location before the storage function is called. This causes problems for storage functions that determine the output location themselves. See this thread:
> http://www.mail-archive.com/pig-user%40hadoop.apache.org/msg01538.html
> For example, when making a request like this:
> STORE A INTO '/my/home/output' USING MultiStorage('/my/home/output','0', 'none', '\t');
> Pig creates a file at '/my/home/output', and an exception is then thrown when MultiStorage tries to create a directory under that same path. The workaround is to specify a dummy location as the first path instead, like so:
> STORE A INTO '/my/home/output/temp' USING MultiStorage('/my/home/output','0', 'none', '\t');
> Two changes should be made:
> 1. The path specified in the INTO clause should be available to the storage function so it doesn't need to be duplicated.
> 2. The creation of the output paths should be delegated to the storage function.
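
The two proposed changes can be illustrated with a small simulation. This is only a sketch in Python, not Pig's actual Java API: the class PathAwareStorage, its methods set_store_location and put_next, the store helper, and the "part-0" file name are all hypothetical names chosen for illustration. It assumes the framework either pre-creates the output location as a file (the behaviour described in the report) or leaves creation entirely to the storage function.

```python
import os
import tempfile


class PathAwareStorage:
    """Hypothetical storage function that owns creation of its output paths."""

    def __init__(self, split_field_index):
        self.split_field_index = split_field_index
        self.output_root = None

    def set_store_location(self, location):
        # Change 1 (assumed design): the framework passes in the INTO path,
        # so it does not have to be repeated in the constructor arguments.
        self.output_root = location

    def put_next(self, record):
        # Change 2 (assumed design): the storage function creates the
        # directories it needs, one per value of the split field.
        key = str(record[self.split_field_index])
        directory = os.path.join(self.output_root, key)
        os.makedirs(directory, exist_ok=True)
        with open(os.path.join(directory, "part-0"), "a") as out:
            out.write("\t".join(str(field) for field in record) + "\n")


def store(records, location, storage, precreate_output=False):
    """Toy stand-in for the framework side of a STORE command."""
    if precreate_output:
        # Behaviour described in the report: the output location is
        # created as a file before the storage function runs.
        with open(location, "w"):
            pass
    storage.set_store_location(location)
    for record in records:
        storage.put_next(record)


records = [("a", 1), ("b", 2), ("a", 3)]

with tempfile.TemporaryDirectory() as tmp:
    # Pre-creating the location as a file makes the later mkdir fail.
    try:
        store(records, os.path.join(tmp, "broken"), PathAwareStorage(0),
              precreate_output=True)
        precreate_failed = False
    except OSError:
        precreate_failed = True

    # Delegating path creation to the storage function avoids the clash.
    good = os.path.join(tmp, "output")
    store(records, good, PathAwareStorage(0))
    created = sorted(os.listdir(good))
```

In this sketch the first call fails because a directory cannot be created beneath a path that already exists as a regular file, which mirrors the MultiStorage error described above. The second call succeeds without a dummy location, since nothing touches the output path before the storage function does.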
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.