Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2014/07/25 05:17:39 UTC

[jira] [Commented] (PIG-4054) Do not create job.jar when submitting job

    [ https://issues.apache.org/jira/browse/PIG-4054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074006#comment-14074006 ] 

Rohini Palaniswamy commented on PIG-4054:
-----------------------------------------

A few minor comments:
  - Why not use a Set instead of List<URL> allJars? Same for List<String> defaultJars.
  - src/org/apache/pig/impl/util/Utils.java putJarOnClassPathThroughDistributedCache, shipToHDFS - The methods themselves are not used and can be removed. They were introduced only as part of the Tez merge.
  - Assert.assertEquals("size 1 for "+Arrays.toString(fileClassPaths), 1, fileClassPaths.length); - Can you keep the assert and just change the expected value?
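
To illustrate the first comment, here is a minimal, standalone sketch of what a Set buys over a List when collecting jars: duplicates collapse automatically, and a LinkedHashSet still keeps first-seen order. The class name JarSetSketch, the helpers jar and dedupe, and the file paths are hypothetical and are not taken from the patch. The sketch sticks to file: URLs, since URL equality on URLs with a host name can trigger name resolution.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not code from the PIG-4054 patch.
class JarSetSketch {

    // Hypothetical helper: builds a URL from a string, turning the checked
    // exception into an unchecked one to keep the example short.
    static URL jar(String uri) {
        try {
            return URI.create(uri).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(uri, e);
        }
    }

    // Copies the jars into a LinkedHashSet: duplicates are dropped and
    // the order in which jars were first seen is preserved.
    static Set<URL> dedupe(List<URL> jars) {
        return new LinkedHashSet<>(jars);
    }

    public static void main(String[] args) {
        List<URL> allJars = new ArrayList<>();
        allJars.add(jar("file:/tmp/lib/pig.jar"));
        allJars.add(jar("file:/tmp/lib/udf.jar"));
        allJars.add(jar("file:/tmp/lib/pig.jar")); // same jar added twice

        Set<URL> unique = dedupe(allJars);
        System.out.println(allJars.size() + " entries, " + unique.size() + " unique jars");
    }
}
```

With a List, the caller has to check for duplicates before every add; with a Set, adding the same jar twice is harmless.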

> Do not create job.jar when submitting job
> -----------------------------------------
>
>                 Key: PIG-4054
>                 URL: https://issues.apache.org/jira/browse/PIG-4054
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.14.0
>
>         Attachments: PIG-4054-1.patch, PIG-4054-2.patch
>
>
> Currently Pig creates a job.jar for each job when submitting a mapreduce job. This has several disadvantages:
> 1. job.jar varies from job to job, so it will not be reused even if the jar cache is used (PIG-2672).
> 2. Before job submission, we need to pack a job.jar, which is mostly a repacking of existing jars; this is a waste of time.
> 3. job.jar is an uber jar, which makes debugging harder and can lead to jar conflicts (e.g., PIG-3039).
> On the Tez side, the situation is similar, but the consequence is worse since the container will not be reused.
> So instead of job.jar, I would like to ship individual jars to the distributed cache. Note that this issue is in essence independent of PIG-4047; however, PIG-4047 would make the picture more complete in that we would not have any uber jars.


