You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2010/01/28 01:14:39 UTC
[jira] Resolved: (PIG-758) Converting load/store locations into
fully qualified absolute paths
[ https://issues.apache.org/jira/browse/PIG-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-758.
--------------------------------
Resolution: Fixed
Fixed as part of LSR.
> Converting load/store locations into fully qualified absolute paths
> -------------------------------------------------------------------
>
> Key: PIG-758
> URL: https://issues.apache.org/jira/browse/PIG-758
> Project: Pig
> Issue Type: Bug
> Reporter: Gunther Hagleitner
> Fix For: 0.7.0
>
>
> As part of the multiquery optimization work there is a need to use absolute paths for load and store operations (because the current directory changes during the execution of the script). In order to do so, we are suggesting a change to the semantics of the location/filename string used in LoadFunc and Slicer/Slice.
> The proposed change is:
> * Load locations without a scheme part are expected to be hdfs (mapreduce mode) or local (local mode) paths
> * Any hdfs or local path will be translated to a fully qualified absolute path before it is handed to either a LoadFunc or Slicer
> * Any scheme other than "file" or "hdfs" will result in the load path to be passed through to the LoadFunc or Slicer without any modification.
> Example:
> If you have a LoadFunc that reads from a database, in the current system the following could be used:
> {noformat}
> a = load 'table' using DBLoader();
> {noformat}
> With the proposed changes table would be translated into an hdfs path though ("hdfs://..../table"). Probably not what the DBLoader would want to see. In order to make it work one could use:
> {noformat}
> a = load 'sql://table' using DBLoader();
> {noformat}
> Now the DBLoader would see the unchanged string "sql://table".
> This is an incompatible change, but hopefully not affecting many existing Loaders/Slicers. Since this is needed with the multiquery feature, the behavior can be reverted back by using the "no_multiquery" pig flag.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.