Posted to dev@pig.apache.org by "Mathias Herberts (JIRA)" <ji...@apache.org> on 2012/06/18 15:24:42 UTC
[jira] [Created] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Mathias Herberts created PIG-2760:
-------------------------------------
Summary: resources added with a relative path are added to the JobXXXX jar file under their absolute path
Key: PIG-2760
URL: https://issues.apache.org/jira/browse/PIG-2760
Project: Pig
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mathias Herberts
When registering a local resource using a relative path, the resource is added to the JobXXXX jar under its absolute path.
If a pig script contains the following:
REGISTER etc/foo;
and is executed from a directory /PATH/TO/DIR, the JobXXXX jar file will contain the following:
/PATH/TO/DIR/etc/foo
instead of
etc/foo
which was the previous behavior
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397007#comment-13397007 ]
Cheolsoo Park commented on PIG-2760:
------------------------------------
Hi Mathias,
Agreed. I haven't thought about the use case that you're describing. :-) Thanks for explaining!
I like your patch because it solves all the cases that I can think of. Just a minor comment. Can't you collapse the following lines of code into a single line?
{code}
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp;
// Strip leading path.sep
if (nameInJar.startsWith("/")) {
    nameInJar = nameInJar.substring(1);
}
{code}
=>
{code}
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
{code}
Given that cp is always going to be an absolute path (a relative path is converted to an absolute one by fetchFile()), the "if" condition seems redundant to me. Please correct me if I am wrong.
Thanks!
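To make the suggestion above concrete, here is a minimal sketch comparing the two-step form with the collapsed one. The class and method names are hypothetical and not part of Pig; it assumes Unix-style paths and that cp is always absolute, as stated in the comment.

```java
// Sketch only: JarNameSketch and its methods are hypothetical, not Pig code.
final class JarNameSketch {

    // Two-step form from the patch: drop the cwd prefix, then strip a leading "/".
    static String twoStep(String cwd, String cp) {
        String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp;
        if (nameInJar.startsWith("/")) {
            nameInJar = nameInJar.substring(1);
        }
        return nameInJar;
    }

    // Collapsed form: equivalent only under the assumption that cp starts with "/".
    static String collapsed(String cwd, String cp) {
        return cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
    }

    public static void main(String[] args) {
        String cwd = "/PATH/TO/DIR";
        for (String cp : new String[] {"/PATH/TO/DIR/etc/foo", "/opt/lib/foo"}) {
            System.out.println(cp + " -> " + twoStep(cwd, cp) + " | " + collapsed(cwd, cp));
        }
    }
}
```

For absolute inputs both methods return the same name. Neither guards against cp being equal to cwd, which this sketch does not try to handle.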
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398072#comment-13398072 ]
Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------
Mathias,
It requires one more minor change. If we just check cp.startsWith(cwd), then even when an absolute path was specified, a script that lies in a subdirectory of the current directory gets a jar entry with only the relative path instead of the absolute path. We need to check cp.equals(cwd + "/" + path) instead.
I was addressing a script loading issue in PIG-2761 and had some modifications to the same line of code, so I added your fix to it, tested it, and made it part of the PIG-2761 patch. Hope you don't mind. If you could, I would appreciate you reviewing PIG-2761.
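The difference between the prefix check and the equality check can be sketched as follows. This is an illustration under assumptions, not the PIG-2761 patch: the class and method names are hypothetical, `path` stands for the string as written in the REGISTER statement, and `cp` and `cwd` are taken to be canonical Unix-style paths.

```java
// Sketch only: RegisterPathSketch and its methods are hypothetical, not Pig code.
final class RegisterPathSketch {

    // Prefix check: any file under cwd gets a relative name,
    // even if the script referred to it by an absolute path.
    static String prefixCheck(String cwd, String path, String cp) {
        return cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
    }

    // Equality check: keep the relative name only when the path
    // as written was relative to cwd; otherwise keep the absolute
    // path minus its leading "/".
    static String equalityCheck(String cwd, String path, String cp) {
        return cp.equals(cwd + "/" + path) ? path : cp.substring(1);
    }

    public static void main(String[] args) {
        String cwd = "/PATH/TO/DIR";
        String cp = "/PATH/TO/DIR/etc/foo";
        // REGISTER etc/foo;
        System.out.println(equalityCheck(cwd, "etc/foo", cp));
        // REGISTER /PATH/TO/DIR/etc/foo;
        System.out.println(prefixCheck(cwd, cp, cp));
        System.out.println(equalityCheck(cwd, cp, cp));
    }
}
```

With a relative path both checks agree on etc/foo. With the absolute path, only the equality check keeps PATH/TO/DIR/etc/foo. Paths written as ./etc/foo or with ".." segments are outside what this sketch covers.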
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397111#comment-13397111 ]
Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------
*This will cause getScriptAsStream() to error out on the backend as f.getPath() minus leading / will not be in the jar.
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396016#comment-13396016 ]
Cheolsoo Park commented on PIG-2760:
------------------------------------
This is a regression of PIG-2623:
{code}
- File f = new File(path);
+ File f = FileLocalizer.fetchFile(pigContext.getProperties(), path).file;
{code}
where fetchFile() converts a relative path to absolute path.
In fact, converting a relative path to an absolute path isn't an issue in itself, but the leading "/" makes registered files not found. That is fixed in PIG-2745.
Thanks!
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397368#comment-13397368 ]
Mathias Herberts commented on PIG-2760:
---------------------------------------
So this means we should do the following:
<code>
String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
String cp = f.getCanonicalPath();
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
pigContext.addScriptFile(nameInJar, f.getPath());
se.registerFunctions(nameInJar, namespace, pigContext);
</code>
right?
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397110#comment-13397110 ]
Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------
<code>
se.registerFunctions(f.getPath(), namespace, pigContext);
String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
String cp = f.getCanonicalPath();
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
pigContext.addScriptFile(nameInJar, f.getPath());
</code>
The function is still registered with f.getPath() even though nameInJar is going to be relative to the current directory. This will cause getScriptAsStream() to error out on the backend, as f.getPath() minus the leading / will not be in the jar.
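The mismatch described above can be modelled in a few lines. This is a simplified stand-in, not Pig's getScriptAsStream(): the class, the method and the lookup rule (strip a leading "/" and look the name up among the jar entries) are assumptions drawn from the comment, and etc/udfs.py is a made-up file name.

```java
import java.util.Collections;
import java.util.Set;

// Sketch only: ScriptLookupSketch is hypothetical and merely models the lookup.
final class ScriptLookupSketch {

    // Assumed backend behaviour: strip a leading "/" from the registered
    // name, then look for an entry with exactly that name in the job jar.
    static boolean foundInJar(Set<String> jarEntries, String registeredName) {
        String key = registeredName.startsWith("/")
                ? registeredName.substring(1)
                : registeredName;
        return jarEntries.contains(key);
    }

    public static void main(String[] args) {
        // The jar holds the script under its name relative to cwd.
        Set<String> jarEntries = Collections.singleton("etc/udfs.py");
        // Registered with the absolute path: the lookup misses.
        System.out.println(foundInJar(jarEntries, "/PATH/TO/DIR/etc/udfs.py"));
        // Registered with the same name as the jar entry: the lookup hits.
        System.out.println(foundInJar(jarEntries, "etc/udfs.py"));
    }
}
```

The point is only that the name passed to registerFunctions() and the name passed to addScriptFile() have to agree.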
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397030#comment-13397030 ]
Mathias Herberts commented on PIG-2760:
---------------------------------------
I guess with an appropriate comment the two code chunks could be collapsed into one, yes.
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397142#comment-13397142 ]
Cheolsoo Park commented on PIG-2760:
------------------------------------
Indeed. Good catch!
[jira] [Resolved] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-2760.
-----------------------------
Resolution: Fixed
Fix Version/s: 0.10.1, 0.11
Assignee: Rohini Palaniswamy
Hadoop Flags: Reviewed
This is fixed along with PIG-2761. Thanks folks!
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400459#comment-13400459 ]
Mathias Herberts commented on PIG-2760:
---------------------------------------
Your patch attached to PIG-2761 looks good to me as far as PIG-2760 is concerned.
[jira] [Updated] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mathias Herberts updated PIG-2760:
----------------------------------
Attachment: PIG-2760.patch
This patch changes the name used in the job jar to relative paths if the added resource lies under the current working directory.
It also strips the leading '/', as PIG-2745 does.
[jira] [Commented] (PIG-2760) resources added with a relative path
are added to the JobXXXX jar file under their absolute path
Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396596#comment-13396596 ]
Mathias Herberts commented on PIG-2760:
---------------------------------------
Converting a relative path to an absolute one may be an issue when accessing a resource in a UDF using getResourceAsStream.
Previously, if you used 'REGISTER foo/bar;' in a script, you could access 'bar' by calling this.getClass().getClassLoader().getResourceAsStream("foo/bar"); in your UDF, and this would work whatever directory the pig script was run from.
If the relative path is converted to an absolute one (with no leading '/'), the argument to getResourceAsStream has to depend on the directory from which the pig script is run, which rather defeats usability.
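The effect on getResourceAsStream can be reproduced outside Pig with a throwaway jar. The sketch below is an illustration only: the class and helper names are hypothetical, the jar is a temporary file rather than a real JobXXXX jar, and a plain URLClassLoader stands in for whatever class loader the UDF runs under.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Sketch only: ResourceLookupSketch is hypothetical, not Pig code.
final class ResourceLookupSketch {

    // Writes a temporary jar holding a single entry under the given name.
    static File buildJar(String entryName, String content) {
        try {
            File jar = File.createTempFile("job-sketch", ".jar");
            jar.deleteOnExit();
            try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
                out.putNextEntry(new JarEntry(entryName));
                out.write(content.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if resourceName resolves through a class loader over the jar.
    static boolean canLoad(File jar, String resourceName) {
        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] {jar.toURI().toURL()}, null);
             InputStream in = loader.getResourceAsStream(resourceName)) {
            return in != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File relativeJar = buildJar("foo/bar", "data");
        File absoluteJar = buildJar("PATH/TO/DIR/foo/bar", "data");
        System.out.println(canLoad(relativeJar, "foo/bar"));
        System.out.println(canLoad(absoluteJar, "foo/bar"));
        System.out.println(canLoad(absoluteJar, "PATH/TO/DIR/foo/bar"));
    }
}
```

When the entry is stored as foo/bar, the UDF can use a fixed resource name. When it is stored under the launch directory, the name the UDF must pass changes with that directory.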