Posted to dev@pig.apache.org by "Mathias Herberts (JIRA)" <ji...@apache.org> on 2012/06/18 15:24:42 UTC

[jira] [Created] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Mathias Herberts created PIG-2760:
-------------------------------------

             Summary: resources added with a relative path are added to the JobXXXX jar file under their absolute path
                 Key: PIG-2760
                 URL: https://issues.apache.org/jira/browse/PIG-2760
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.10.0
            Reporter: Mathias Herberts


When registering a local resource using a relative path, the resource is added to the JobXXXX jar under its absolute path.

If a pig script contains the following:

REGISTER etc/foo;

and is executed from a directory /PATH/TO/DIR, the JobXXXX jar file will contain the following:

/PATH/TO/DIR/etc/foo

instead of

etc/foo

which was the previous behavior

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397007#comment-13397007 ] 

Cheolsoo Park commented on PIG-2760:
------------------------------------

Hi Mathias,

Agreed. I hadn't thought about the use case that you're describing. :-) Thanks for explaining!

I like your patch because it solves all the cases that I can think of. Just one minor comment: couldn't you collapse the following lines of code into a single line?

{code}
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp;
// Strip leading path.sep
if (nameInJar.startsWith("/")) {
    nameInJar = nameInJar.substring(1);
}
{code}

=>

{code}
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
{code}

Given that cp is always going to be an absolute path (as a relative path is converted to an absolute one by fetchFile()), the "if" condition seems redundant to me. Please correct me if I am wrong.
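To make the equivalence claim concrete, here is a minimal sketch that puts the two forms side by side. The class and method names (NameInJarSketch, twoStep, oneLine) are made up for illustration and are not part of the patch; it also assumes Unix-style "/" separators and that cp is an absolute path that is not equal to cwd itself.

```java
final class NameInJarSketch {

    /** The two-step form: relativize against cwd, then strip a leading "/". */
    static String twoStep(String cwd, String cp) {
        String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp;
        // Strip leading path separator
        if (nameInJar.startsWith("/")) {
            nameInJar = nameInJar.substring(1);
        }
        return nameInJar;
    }

    /** The collapsed form: relies on cp always being absolute. */
    static String oneLine(String cwd, String cp) {
        return cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
    }

    public static void main(String[] args) {
        String cwd = "/PATH/TO/DIR";
        for (String cp : new String[] { "/PATH/TO/DIR/etc/foo", "/opt/lib/foo" }) {
            System.out.println(cp + " -> " + twoStep(cwd, cp) + " | " + oneLine(cwd, cp));
        }
    }
}
```

For absolute inputs the two forms agree. They differ only if cp could be relative, in which case the collapsed form would drop the first character of the name.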

Thanks!
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398072#comment-13398072 ] 

Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------

Mathias,
 It requires one more minor change. If we just check cp.startsWith(cwd), then even when an absolute path was specified, a script in a subdirectory under the current directory gets a jar entry with only the relative path instead of the absolute path. We need to check cp.equals(cwd + "/" + path) instead.
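The difference between the two checks can be sketched as follows. This is only one reading of the comment, not the code from the PIG-2761 patch: it assumes the last operand of the equals() check is the path exactly as written in the REGISTER statement, and the names RegisterPathSketch, startsWithRule and equalsRule are hypothetical.

```java
final class RegisterPathSketch {

    /** Relativizes anything that lies under cwd, however it was registered. */
    static String startsWithRule(String cwd, String cp) {
        return cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
    }

    /**
     * Relativizes only when the canonical path is exactly cwd + "/" + path,
     * i.e. when the path was written relative to the current directory.
     */
    static String equalsRule(String cwd, String path, String cp) {
        return cp.equals(cwd + "/" + path) ? cp.substring(cwd.length() + 1) : cp.substring(1);
    }

    public static void main(String[] args) {
        String cwd = "/PATH/TO/DIR";
        String cp = "/PATH/TO/DIR/etc/foo";
        // Registered as a relative path: both rules give the same entry name.
        System.out.println(startsWithRule(cwd, cp) + " | " + equalsRule(cwd, "etc/foo", cp));
        // Registered as an absolute path: only the equals() rule keeps the full path.
        System.out.println(startsWithRule(cwd, cp) + " | " + equalsRule(cwd, cp, cp));
    }
}
```

Note that a plain string comparison like this would not recognize a relative path written as "./etc/foo", so a real implementation may need to normalize the path first.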

I was addressing a script loading issue in PIG-2761 and had some modifications to the same line of code, so I added your fix there as well, tested it, and made it part of the PIG-2761 patch. I hope you don't mind. If you could, I would appreciate you reviewing PIG-2761.
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397111#comment-13397111 ] 

Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------

*This will cause getScriptAsStream() to error out on the backend as f.getPath() minus leading / will not be in the jar.
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396016#comment-13396016 ] 

Cheolsoo Park commented on PIG-2760:
------------------------------------

This is a regression of PIG-2623:

{code}
-        File f = new File(path);
+        File f = FileLocalizer.fetchFile(pigContext.getProperties(), path).file;
{code}

where fetchFile() converts a relative path to an absolute path.

In fact, converting a relative path to an absolute path isn't an issue in itself, but the leading "/" makes registered files impossible to find. That is fixed in PIG-2745.
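As a rough illustration of the conversion being discussed, the sketch below uses plain java.io.File as a stand-in for fetchFile(), whose internals are not shown in this thread. The class name RelativeToAbsoluteSketch and the helper toAbsolute are hypothetical.

```java
import java.io.File;

final class RelativeToAbsoluteSketch {

    /** Resolves a path against the JVM's current working directory (user.dir). */
    static String toAbsolute(String path) {
        return new File(path).getAbsolutePath();
    }

    public static void main(String[] args) {
        String registered = "etc/foo";
        System.out.println("as written: " + registered);
        // The result depends on the directory the JVM was started from.
        System.out.println("resolved:   " + toAbsolute(registered));
    }
}
```

Once the path has been resolved this way, the name under which it was originally written is no longer recoverable from the File object alone, which is why the jar entry ends up depending on the launch directory.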

Thanks!
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397368#comment-13397368 ] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

So this means we should do the following:

{code}
String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
String cp = f.getCanonicalPath();
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
pigContext.addScriptFile(nameInJar, f.getPath());
se.registerFunctions(nameInJar, namespace, pigContext);
{code}

right?
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397110#comment-13397110 ] 

Rohini Palaniswamy commented on PIG-2760:
-----------------------------------------

{code}
se.registerFunctions(f.getPath(), namespace, pigContext);
String cwd = new File(System.getProperty("user.dir")).getCanonicalPath();
String cp = f.getCanonicalPath();
String nameInJar = cp.startsWith(cwd) ? cp.substring(cwd.length() + 1) : cp.substring(1);
pigContext.addScriptFile(nameInJar, f.getPath());
{code}

  The function is still registered with f.getPath() even though nameInJar is going to be relative to the current directory. This will cause getScriptAsStream() to error out on the backend, as f.getPath() minus the leading / will not be in the jar.
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397030#comment-13397030 ] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

Yes, I guess that with an appropriate comment the two code chunks could be collapsed into one.
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397142#comment-13397142 ] 

Cheolsoo Park commented on PIG-2760:
------------------------------------

Indeed. Good catch!
                

        

[jira] [Resolved] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2760.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.10.1
                   0.11
         Assignee: Rohini Palaniswamy
     Hadoop Flags: Reviewed

This is fixed along with PIG-2761. Thanks folks!
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400459#comment-13400459 ] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

Your patch attached to PIG-2761 looks good to me as far as PIG-2760 is concerned.
                

        

[jira] [Updated] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mathias Herberts updated PIG-2760:
----------------------------------

    Attachment: PIG-2760.patch

This patch changes the name used in the job jar to a relative path if the added resource lies under the current working directory.

It also strips the leading '/', as PIG-2745 does.
                

        

[jira] [Commented] (PIG-2760) resources added with a relative path are added to the JobXXXX jar file under their absolute path

Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396596#comment-13396596 ] 

Mathias Herberts commented on PIG-2760:
---------------------------------------

Converting a relative path to an absolute one may be an issue when accessing a resource in a UDF using getResourceAsStream.

Previously, if you used 'REGISTER foo/bar;' in a script, you could access 'bar' by calling this.getClass().getClassLoader().getResourceAsStream("foo/bar"); in your UDF, and this would work whatever directory the pig script was run from.

If the relative path is converted to an absolute one (with no leading '/'), the argument to getResourceAsStream has to depend on the directory from which the pig script is run, which rather defeats usability.
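The effect on a UDF can be reproduced outside Pig with a small, self-contained sketch. It builds two throwaway jars, one storing the resource as foo/bar and one storing it under a made-up launch-directory prefix, and then looks up foo/bar through a class loader. Everything here (ResourceLookupSketch, buildJar, canLoad, the PATH/TO/DIR prefix) is illustrative and not Pig code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

final class ResourceLookupSketch {

    /** Builds a temporary jar containing a single entry stored under entryName. */
    static Path buildJar(String entryName) {
        try {
            Path jar = Files.createTempFile("job-sketch", ".jar");
            jar.toFile().deleteOnExit();
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                jos.putNextEntry(new JarEntry(entryName));
                jos.write("hello".getBytes(StandardCharsets.UTF_8));
                jos.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if a class loader over the jar can open resourceName. */
    static boolean canLoad(Path jar, String resourceName) {
        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] { jar.toUri().toURL() }, null);
             InputStream in = loader.getResourceAsStream(resourceName)) {
            return in != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path relativeEntry = buildJar("foo/bar");
        Path absoluteEntry = buildJar("PATH/TO/DIR/foo/bar");
        System.out.println("entry foo/bar:             " + canLoad(relativeEntry, "foo/bar"));
        System.out.println("entry PATH/TO/DIR/foo/bar: " + canLoad(absoluteEntry, "foo/bar"));
    }
}
```

With the second jar, the lookup only succeeds if the caller passes the full PATH/TO/DIR/foo/bar name, which is exactly the dependency on the launch directory described above.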
                