Posted to dev@pig.apache.org by "Scott Carey (JIRA)" <ji...@apache.org> on 2010/08/09 19:39:15 UTC

[jira] Created: (PIG-1540) clean up pig dependencies included in jar files

clean up pig dependencies included in jar files
-----------------------------------------------

                 Key: PIG-1540
                 URL: https://issues.apache.org/jira/browse/PIG-1540
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.7.0
            Reporter: Scott Carey


Pig's output jars are bloated and difficult to include in other projects. Building jar targets for common use cases would be a big benefit to those consuming Pig. As a bonus, if these were ready for use in a Maven repository, a follow-on ticket to enable Maven would be trivial.

More information in comments to follow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1540) clean up pig dependencies included in jar files

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896636#action_12896636 ] 

Scott Carey commented on PIG-1540:
----------------------------------

From the email thread:

{quote}
Scott,

You are mostly correct. None of those jars is required in
pig-withouthadoop.jar. I see no reason why junit needs to be
there. Jackson and Joda are piggybank dependencies and as such should
be included in piggybank.jar, not in pig-withouthadoop.jar. I have no
idea where hamcrest and jshell are being included from; it looks like
they should be removed as well. I think even jline can be removed, since
it's only required on the client side, where users will either be using
pig.jar (which contains everything in any case) or setting up their own
classpath to use pig-withouthadoop.jar. So it seems all the jars you
pointed out can be removed from pig-withouthadoop.jar, which will lower
the cost of distributing it to all tasktracker nodes.

Let's open a jira and continue the discussion over there. Scott, would
you mind opening one?

Ashutosh

On Sun, Aug 8, 2010 at 12:41, Scott Carey <sc...@richrelevance.com> wrote:
That ant target is still a problem.

It may have removed most hadoop jars, but it still has useless dependencies. Why is junit in there? Why is jackson in there? I don't see why I need to push junit out to the cluster with each submitted job. I don't see where Pig is using JSON from Jackson.
The latter makes it impossible to use Pig with Avro unless you order the classpath right or build a custom jar.


Are hamcrest and jshell used?

I get the jline and joda inclusions, but even then those should probably be external jars on the classpath from a lib directory.

Setting Pig up with a proper maven POM or ivy configuration would be a big plus to those consuming Pig.
{quote}
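
The classpath-ordering problem mentioned above comes from the same class being present in more than one jar: the JVM normally loads whichever copy it finds first on the classpath. As a rough way to see such overlaps, here is a minimal sketch that treats each jar as a zip archive and reports class files found in more than one of them. The function name `duplicate_classes` and the jar names in the usage note are illustrative assumptions, not part of Pig's build.

```python
import zipfile
from collections import defaultdict


def duplicate_classes(jar_paths):
    """Map each class file found in more than one jar to the jars containing it.

    Hypothetical helper for illustration; not part of Pig or its build.
    """
    owners = defaultdict(list)
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:  # a jar is an ordinary zip archive
            # set() guards against an archive listing the same entry twice
            for name in set(jar.namelist()):
                if name.endswith(".class"):
                    owners[name].append(path)
    # Keep only the classes that appear in two or more jars
    return {name: jars for name, jars in owners.items() if len(jars) > 1}
```

Calling it with something like `duplicate_classes(["pig-withouthadoop.jar", "avro.jar"])` (file names assumed) would list any classes both jars provide, in the order the jars were given.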




[jira] Commented: (PIG-1540) clean up pig dependencies included in jar files

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896644#action_12896644 ] 

Daniel Dai commented on PIG-1540:
---------------------------------

Pig also produces build/pig-0.8.0-dev-core.jar, which includes only Pig classes. When Pig is ready to publish to Maven (PIG-1334), we will publish only pig-0.8.0-dev-core.jar.
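
One way to check that a jar really contains only Pig classes is to list every class file outside Pig's package. The sketch below is illustrative only: the helper name `foreign_classes` is made up, and it assumes Pig's own classes all live under `org/apache/pig/`.

```python
import zipfile


def foreign_classes(jar_path, allowed_prefix="org/apache/pig/"):
    """Return the .class entries in a jar that fall outside `allowed_prefix`.

    Hypothetical helper; the default prefix assumes Pig's classes all sit
    under the org.apache.pig package.
    """
    with zipfile.ZipFile(jar_path) as jar:  # a jar is an ordinary zip archive
        return sorted(
            name
            for name in jar.namelist()
            if name.endswith(".class") and not name.startswith(allowed_prefix)
        )
```

An empty result for `foreign_classes("build/pig-0.8.0-dev-core.jar")` would suggest the jar bundles no third-party classes; any junit, jackson, or hamcrest entries would show up in the returned list.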



[jira] Commented: (PIG-1540) clean up pig dependencies included in jar files

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896653#action_12896653 ] 

Scott Carey commented on PIG-1540:
----------------------------------

Here are some common use cases for pig's jar files:


Build and test Pig as a developer, stand-alone: pig.jar
Use Pig for Pig development and debugging: pig.jar
Run Pig scripts in production, with trimmed-down dependencies: pig-withouthadoop.jar
  * Perhaps this should be trimmed down somewhat, to just pig, piggybank, and other 'basics' that are commonly needed when running a Pig script.

Include Pig in a project (for example, a custom LoadFunc project): ????
  * Here is where a Maven-compatible jar, javadoc jar, and source jar would be a blessing. PIG-1334 should address the javadoc and source jars as well.

