Posted to dev@pig.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2008/05/16 17:23:55 UTC

[jira] Created: (PIG-240) Support launching concurrent Pig jobs from one VM

Support launching concurrent Pig jobs from one VM
-------------------------------------------------

                 Key: PIG-240
                 URL: https://issues.apache.org/jira/browse/PIG-240
             Project: Pig
          Issue Type: Improvement
          Components: impl
            Reporter: Tom White


For some applications it would be convenient to launch concurrent Pig jobs from a single VM. This is currently not possible since Pig has static mutable state.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-240) Support launching concurrent Pig jobs from one VM

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597520#action_12597520 ] 

Pi Song commented on PIG-240:
-----------------------------

Dear Tom,

Thanks a lot for your suggestions. However, these things are being completely rewritten in our new branch (under branch/type) that should come out within a few weeks.

There was a recent discussion about whether to make the launching method static in the new design. You could probably join our discussion at PIG-162.

Pi




[jira] Updated: (PIG-240) Support launching concurrent Pig jobs from one VM

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated PIG-240:
--------------------------

    Attachment: pig-240.patch

Patch with some of the simpler fixes.




[jira] Commented: (PIG-240) Support launching concurrent Pig jobs from one VM

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597512#action_12597512 ] 

Tom White commented on PIG-240:
-------------------------------


The following classes have shared state in (non-final) static fields. (I used FindBugs to find these; it would be nice if it were run automatically, as it is for Hadoop.)

BagFactory has a static SpillableMemoryManager. Since BagFactory is a singleton, SpillableMemoryManager can just be an instance variable.

MapReduceLauncher has several static fields and associated setters. POMapreduce has a static instance of MapReduceLauncher. This can be fixed by making HExecutionEngine create a MapReduceLauncher instance, set its non-static fields and set the instance on POMapreduce.
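
A rough sketch of that wiring, for illustration only. The classes below are simplified, hypothetical stand-ins (LauncherSketch, MapReduceOpSketch, EngineSketch), and the field names jobTracker and numReducers are assumptions rather than the real members of MapReduceLauncher:

```java
// Hypothetical, simplified stand-ins for the classes named above.
// Field and method names here are assumptions, not Pig's real API.
class LauncherSketch {
    // Instance fields replace the former static fields and static setters.
    private String jobTracker;
    private int numReducers;

    void setJobTracker(String jobTracker) { this.jobTracker = jobTracker; }
    void setNumReducers(int numReducers) { this.numReducers = numReducers; }

    String describe() { return jobTracker + "/" + numReducers; }
}

class MapReduceOpSketch {
    // Each operator holds the launcher it was given instead of a shared static one.
    private LauncherSketch launcher;

    void setLauncher(LauncherSketch launcher) { this.launcher = launcher; }
    LauncherSketch getLauncher() { return launcher; }
}

class EngineSketch {
    private final String jobTracker;

    EngineSketch(String jobTracker) { this.jobTracker = jobTracker; }

    // The engine creates and configures one launcher per operator it builds.
    MapReduceOpSketch newOperator(int numReducers) {
        LauncherSketch launcher = new LauncherSketch();
        launcher.setJobTracker(jobTracker);
        launcher.setNumReducers(numReducers);
        MapReduceOpSketch op = new MapReduceOpSketch();
        op.setLauncher(launcher);
        return op;
    }
}
```

With this shape, two engines in the same VM never see each other's launcher settings, because nothing is reachable through a static field.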

The static field totalHadoopTimeSpent on MapReduceLauncher cannot be dealt with in this way since it is used by PigServer to accumulate the time spent on jobs. This can be fixed by keeping it static (but accessing through a method rather than the field) and using AtomicLong for thread-safety. Longer term it would be better to have MapReduceLauncher.launchPig return a result object that PigServer gets the time from.
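
To make the short-term fix concrete, here is a minimal sketch using java.util.concurrent.atomic.AtomicLong. LauncherTimingSketch is a hypothetical stand-in for MapReduceLauncher, and the accessor names addHadoopTime and getTotalHadoopTimeSpent are assumptions:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for MapReduceLauncher; only the timing counter is modeled.
class LauncherTimingSketch {
    // Still static, but private, final, and safe for concurrent updates.
    private static final AtomicLong totalHadoopTimeSpent = new AtomicLong();

    // Called when a job finishes; may be called from several threads at once.
    static void addHadoopTime(long millis) {
        totalHadoopTimeSpent.addAndGet(millis);
    }

    // Read accessor for a PigServer-like caller, replacing direct field access.
    static long getTotalHadoopTimeSpent() {
        return totalHadoopTimeSpent.get();
    }
}

class LauncherTimingDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                // Each worker simulates 1000 jobs of 1 ms each.
                for (int j = 0; j < 1000; j++) {
                    LauncherTimingSketch.addHadoopTime(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(LauncherTimingSketch.getTotalHadoopTimeSpent());
    }
}
```

The counter is still shared by every job in the VM, so it only gives a VM-wide total; the result-object approach would be needed to attribute time to individual jobs.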

LogicalPlanBuilder has a static classloader field which is set by PigContext.addJar() and Main.main(). This field is widely used. It is used by the static method resolveClassName() on PigContext which is widely used via instantiateFuncFromSpec(). I think the proper approach is to make the classloader field an instance variable of PigContext, and make the PigContext available as needed.
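
A minimal sketch of the per-context classloader idea. ContextSketch is a hypothetical stand-in for PigContext; the method bodies are assumptions about how addJar() and resolveClassName() might look once the loader is an instance variable, not the existing implementation:

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical stand-in for PigContext; only class loading is modeled.
class ContextSketch {
    // Per-context loader instead of one static field shared by the whole VM.
    private ClassLoader classLoader = ContextSketch.class.getClassLoader();

    // Chains a new loader for the jar on top of this context's current loader.
    void addJar(File jar) {
        try {
            URL url = jar.toURI().toURL();
            classLoader = new URLClassLoader(new URL[] { url }, classLoader);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad jar path: " + jar, e);
        }
    }

    ClassLoader getClassLoader() { return classLoader; }

    // Instance method, so callers need a context in hand to resolve a class.
    Class<?> resolveClassName(String name) throws ClassNotFoundException {
        return Class.forName(name, true, classLoader);
    }
}
```

The cost of this design is the one noted above: every caller that currently uses the static method would need a context passed to it.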

PigMapReduce has two static fields: reporter and pigContext. DataBag.reportProgress uses reporter - DataBags should be constructed with a PigContext so they can get its non-static reporter. HadoopExecutableManager.configure uses pigContext - but it could just be given a JobConf in its constructor.
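
A sketch of the DataBag half of this suggestion, under stated assumptions: ProgressReporterSketch, BagContextSketch and BagSketch are hypothetical stand-ins for the Hadoop reporter, PigContext and DataBag, and the bag stores plain strings rather than tuples. The HadoopExecutableManager change is not covered here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Hadoop progress reporter.
interface ProgressReporterSketch {
    void progress();
}

// Hypothetical stand-in for PigContext carrying a non-static reporter.
class BagContextSketch {
    private final ProgressReporterSketch reporter;

    BagContextSketch(ProgressReporterSketch reporter) { this.reporter = reporter; }

    ProgressReporterSketch getReporter() { return reporter; }
}

// Hypothetical stand-in for DataBag, constructed with its context.
class BagSketch {
    private final BagContextSketch context;
    private final List<String> tuples = new ArrayList<>();

    BagSketch(BagContextSketch context) { this.context = context; }

    void add(String tuple) {
        tuples.add(tuple);
        reportProgress();
    }

    // Uses the reporter of the owning context rather than a static field.
    void reportProgress() {
        ProgressReporterSketch reporter = context.getReporter();
        if (reporter != null) {
            reporter.progress();
        }
    }

    int size() { return tuples.size(); }
}
```

Each bag then reports to the job that created it, so concurrent jobs in one VM do not overwrite each other's reporter.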

PigInputFormat has a static activeSplit field. Making the RecordReader hold a reference to the activeSplit would help here, but that might not be a solution for everywhere that uses the static field (e.g. FileLocalizer.openDFSFile, but that might not matter since it doesn't work locally anyway). 

