Posted to dev@pig.apache.org by "Eric Tschetter (JIRA)" <ji...@apache.org> on 2010/07/22 01:35:50 UTC
[jira] Updated: (PIG-1511) Pig removes packages from its own jar when building the JAR to ship to Hadoop
[ https://issues.apache.org/jira/browse/PIG-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Tschetter updated PIG-1511:
--------------------------------
Attachment: pig-1511.diff
> Pig removes packages from its own jar when building the JAR to ship to Hadoop
> -----------------------------------------------------------------------------
>
> Key: PIG-1511
> URL: https://issues.apache.org/jira/browse/PIG-1511
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Eric Tschetter
> Attachments: pig-1511.diff
>
>
> Pig generates a new jar file to ship over to Hadoop. It includes only a couple of whitelisted packages from its own jar and throws away everything else.
> I package all of my dependencies, including Pig, into a single jar file. I do it this way because my code needs to run reliably and reproducibly in production. Because that single jar is also Pig's own jar, Pig throws away all of my dependencies when it builds the jar to ship.
> I don't know what performance is gained by shaving ~5MB off a jar that is pushed to a job tracker once and then used to process hundreds of GB of data. The overhead of shipping the full jar is minimal on my cluster.
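
The behavior described above can be illustrated with a minimal sketch. This is not Pig's code: it assumes only that a jar is a zip archive and that filtering is done by package-path prefix. The names `ALLOWED_PREFIXES`, `filter_jar`, and `build_sample_jar`, and the sample class paths, are hypothetical; the report does not list Pig's actual whitelist.

```python
import io
import zipfile

# Hypothetical whitelist. Pig's real list of package prefixes is not given
# in this report; this single prefix is an assumption for illustration.
ALLOWED_PREFIXES = ("org/apache/pig/",)


def filter_jar(source_bytes, allowed_prefixes=ALLOWED_PREFIXES):
    """Copy only whitelisted entries into a new jar.

    Returns a tuple of (new jar bytes, list of dropped entry names).
    """
    out = io.BytesIO()
    dropped = []
    with zipfile.ZipFile(io.BytesIO(source_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            # str.startswith accepts a tuple of candidate prefixes.
            if info.filename.startswith(allowed_prefixes):
                dst.writestr(info, src.read(info.filename))
            else:
                dropped.append(info.filename)
    return out.getvalue(), dropped


def build_sample_jar():
    """Build an in-memory 'fat jar' holding Pig plus user code and a dependency."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("org/apache/pig/Main.class", b"pig")
        jar.writestr("com/example/udf/MyUdf.class", b"udf")
        jar.writestr("com/google/common/base/Joiner.class", b"dep")
    return buf.getvalue()


if __name__ == "__main__":
    shipped, dropped = filter_jar(build_sample_jar())
    print("kept:", zipfile.ZipFile(io.BytesIO(shipped)).namelist())
    print("dropped:", dropped)
```

With a single combined jar as input, every entry outside the whitelisted prefix ends up in `dropped`. That is the failure mode reported here: the user's own classes and bundled dependencies never reach Hadoop.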
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.