Posted to dev@pig.apache.org by "Julien Le Dem (JIRA)" <ji...@apache.org> on 2011/06/15 22:52:47 UTC

[jira] [Created] (PIG-2128) Pig local mode is very slow.

Pig local mode is very slow.
----------------------------

                 Key: PIG-2128
                 URL: https://issues.apache.org/jira/browse/PIG-2128
             Project: Pig
          Issue Type: Improvement
          Components: impl
            Reporter: Julien Le Dem


Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
We should optimize some of the steps so that it is more user friendly.
One thing would be to skip building the jar for each job as it runs in the same process.
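
As a rough illustration of the idea only (this is not the contents of PIG-2128.patch), a launcher could check the execution mode and package a jar only when the job will run outside the current process. Every name below (`JobJarPlanner`, `ExecMode`, `needsJobJar`, `planSteps`) is hypothetical and is not one of Pig's actual classes or methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these are not Pig's real classes. */
class JobJarPlanner {

    /** Assumed execution modes, for illustration only. */
    enum ExecMode { LOCAL, MAPREDUCE }

    /**
     * Per the description above, a local-mode job runs in the same process,
     * so this sketch assumes no job jar has to be built for it.
     */
    static boolean needsJobJar(ExecMode mode) {
        return mode != ExecMode.LOCAL;
    }

    /** Returns the preparation steps that would run for a job in the given mode. */
    static List<String> planSteps(ExecMode mode) {
        List<String> steps = new ArrayList<>();
        steps.add("configure job");
        if (needsJobJar(mode)) {
            // The expensive step that is skipped in local mode.
            steps.add("build job jar");
        }
        steps.add("submit job");
        return steps;
    }
}
```

Calling `JobJarPlanner.planSteps(JobJarPlanner.ExecMode.LOCAL)` in this sketch yields only the configure and submit steps, while the MAPREDUCE mode keeps the jar-building step in between.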

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Olga Natkovich (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-2128:
--------------------------------


Thanks!
                
> Generating the jar file takes a lot of time and is unnecessary when running Pig local mode
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-2128
>                 URL: https://issues.apache.org/jira/browse/PIG-2128
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Julien Le Dem
>            Assignee: Julien Le Dem
>             Fix For: 0.10
>
>         Attachments: PIG-2128.patch
>
>
> Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
> We should optimize some of the steps so that it is more user friendly.
> In this case we should skip building the jar for each job as it runs in the same process.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2128) Pig local mode is very slow.

Posted by "Julien Le Dem (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem updated PIG-2128:
-------------------------------

    Attachment: PIG-2128.patch

Attaching PIG-2128.patch, which skips building the jar in local mode.

> Pig local mode is very slow.
> ----------------------------
>
>                 Key: PIG-2128
>                 URL: https://issues.apache.org/jira/browse/PIG-2128
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Julien Le Dem
>         Attachments: PIG-2128.patch
>
>
> Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
> We should optimize some of the steps so that it is more user friendly.
> One thing would be to skip building the jar for each job as it runs in the same process.


        

[jira] [Assigned] (PIG-2128) Pig local mode is very slow.

Posted by "Julien Le Dem (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem reassigned PIG-2128:
----------------------------------

    Assignee: Julien Le Dem

> Pig local mode is very slow.
> ----------------------------
>
>                 Key: PIG-2128
>                 URL: https://issues.apache.org/jira/browse/PIG-2128
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Julien Le Dem
>            Assignee: Julien Le Dem
>         Attachments: PIG-2128.patch
>
>
> Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
> We should optimize some of the steps so that it is more user friendly.
> One thing would be to skip building the jar for each job as it runs in the same process.


        

[jira] [Commented] (PIG-2128) Pig local mode is very slow.

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050156#comment-13050156 ] 

Dmitriy V. Ryaboy commented on PIG-2128:
----------------------------------------

I verified that external jars do work. It's a significant speedup. +1.

Please change the title of the JIRA to refer to this specific optimization so we don't end up with 15 "local mode is slow" tickets as we keep chipping away at this :).


        

[jira] [Commented] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Dmitriy V. Ryaboy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155712#comment-13155712 ] 

Dmitriy V. Ryaboy commented on PIG-2128:
----------------------------------------

Viraj,
Yes, this patch applies to 0.9 and even 0.8 (possibly with a bit of elbow grease; I don't remember now).

Was this a question or a request? :)
                

        

[jira] [Commented] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Olga Natkovich (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161155#comment-13161155 ] 

Olga Natkovich commented on PIG-2128:
-------------------------------------

We were wondering whether this change can be committed to the 0.9 branch. It looks pretty benign.
                

        

[jira] [Commented] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Dmitriy V. Ryaboy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161362#comment-13161362 ] 

Dmitriy V. Ryaboy commented on PIG-2128:
----------------------------------------

Committed to 0.9.2.
                
> Generating the jar file takes a lot of time and is unnecessary when running Pig local mode
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-2128
>                 URL: https://issues.apache.org/jira/browse/PIG-2128
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Julien Le Dem
>            Assignee: Julien Le Dem
>             Fix For: 0.10, 0.9.2
>
>         Attachments: PIG-2128.patch
>
>
> Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
> We should optimize some of the steps so that it is more user friendly.
> In this case we should skip building the jar for each job as it runs in the same process.


        

[jira] [Commented] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Viraj Bhat (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155630#comment-13155630 ] 

Viraj Bhat commented on PIG-2128:
---------------------------------

Can this patch be backported to Pig 0.9?
Viraj
                

        

[jira] [Commented] (PIG-2128) Pig local mode is very slow.

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050080#comment-13050080 ] 

Dmitriy V. Ryaboy commented on PIG-2128:
----------------------------------------

Julien, good idea.

Does this work if external jars are registered in the script?


        

[jira] [Updated] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Dmitriy V. Ryaboy (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2128:
-----------------------------------

    Fix Version/s: 0.9.2
    

        

[jira] [Updated] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Julien Le Dem (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem updated PIG-2128:
-------------------------------

       Resolution: Fixed
    Fix Version/s: 0.10
     Release Note: local mode will now skip building a jar with dependencies
           Status: Resolved  (was: Patch Available)


        

[jira] [Commented] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Dmitriy V. Ryaboy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161305#comment-13161305 ] 

Dmitriy V. Ryaboy commented on PIG-2128:
----------------------------------------

Patch applied cleanly. I'm running test-commit right now and will commit assuming it passes.
                

        

[jira] [Commented] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050661#comment-13050661 ] 

Dmitriy V. Ryaboy commented on PIG-2128:
----------------------------------------

I verified that registering jars is fine and properties are picked up (in fact, because we always register fairly sizable jars, this is a significant enough improvement that I already pushed this patch to production in our clusters). So far so good. Go ahead and commit.

> Generating the jar file takes a lot of time and is unnecessary when running Pig local mode
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-2128
>                 URL: https://issues.apache.org/jira/browse/PIG-2128
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Julien Le Dem
>            Assignee: Julien Le Dem
>         Attachments: PIG-2128.patch
>
>
> Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
> We should optimize some of the steps so that it is more user friendly.
> In this case we should skip building the jar for each job as it runs in the same process.


        

[jira] [Updated] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2128:
----------------------------

    Status: Patch Available  (was: Open)


        

[jira] [Updated] (PIG-2128) Generating the jar file takes a lot of time and is unnecessary when running Pig local mode

Posted by "Julien Le Dem (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Le Dem updated PIG-2128:
-------------------------------

    Description: 
Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
We should optimize some of the steps so that it is more user friendly.
In this case we should skip building the jar for each job as it runs in the same process.

  was:
Since the Pig local mode implementation has been moved to Hadoop local it is very slow.
We should optimize some of the steps so that it is more user friendly.
One thing would be to skip building the jar for each job as it runs in the same process.

        Summary: Generating the jar file takes a lot of time and is unnecessary when running Pig local mode  (was: Pig local mode is very slow.)

I've updated the summary and description.
I am not sure whether this could break other things (like register).
