Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2008/01/25 18:37:40 UTC

[jira] Commented: (PIG-70) Improve PigContext code by using Factory Pattern

    [ https://issues.apache.org/jira/browse/PIG-70?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12562594#action_12562594 ] 

Alan Gates commented on PIG-70:
-------------------------------

I like the idea of splitting out PigContext into multiple classes, since it's somewhat large at the moment.  

Ben, since you wrote much of the original HOD connection code, it would be good if you could look over the changes.

One small note.  We are trying to move to using spaces instead of tabs in our files.  When you're working in existing files (such as PigContext) it's good to stay with whatever the convention is there.  But as you create new files, like HOD.java and PigContextFactory.java, you should use spaces.

As you noted, this will require a lot of testing.  It will need to be tested in local mode, on HDFS with HOD, and on HDFS without HOD.  Do you have HOD set up somewhere so that you can test with it, or will you need some assistance with the HOD testing?

> Improve PigContext code by using Factory Pattern
> ------------------------------------------------
>
>                 Key: PIG-70
>                 URL: https://issues.apache.org/jira/browse/PIG-70
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.1.0
>            Reporter: Benjamin Francisoud
>         Attachments: PIG-70-v01.patch
>
>
> Even if the PigContext code is still quite small at the moment, for an outsider (me) it's already hard to understand :(
> If I understand correctly, the purpose of PigContext (from an object-oriented point of view) is to hold various configuration objects like JobConf, JobClient, JobSubmissionProtocol...
> The initialization code mainly uses the ExecType parameter but can also involve quite complex code like: doHod(), initProperties(), connect()...
> (btw, the connect() method is actually doing 2 things: initializing some variables and trying to connect; its initialization code should be moved somewhere else)
> It is the perfect case to apply the [Factory Pattern|http://en.wikipedia.org/wiki/Factory_method_pattern]. You can also see [Replace Constructor with Factory Method|http://www.refactoring.com/catalog/replaceConstructorWithFactoryMethod.html] for more details.
> My proposal is to create a new PigContextFactory class to hold the initialization code, making PigContext and PigContextFactory two classes of about 200 lines each instead of one big 500-line class.
> PigContext would hold some getters and setters, plus the methods related to instantiating/running "functions".
> The new API would be:
> h4. PigContextFactory.java
> {code:java}
> public class PigContextFactory {
>     public static PigContext getInstance(ExecType execType) {...}
> }
> {code}
> h4. PigContext.java
> {code:java}
> public class PigContext implements Serializable, FunctionInstantiator {
>     public String getJobName(){...}
>     public JobSubmissionProtocol getJobTracker() {...}
>     public JobConf getConf() {...}
>     public static Object instantiateFuncFromSpec(String funcSpec) throws IOException{...}
>     public Object instantiateFuncFromAlias(String alias) throws IOException {...}
>     public void registerFunction(String function, String functionSpec) {...}
> }
> {code}
> h4. Client code
> {code:java}
> PigContext context = PigContextFactory.getInstance(ExecType.MAPREDUCE);
> {code}
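
For readers unfamiliar with the pattern, the split proposed in the description above can be sketched roughly as follows. This is a minimal illustration and not the code in PIG-70-v01.patch: the nested `ExecType` and `Context` types are simplified, hypothetical stand-ins for the real Pig classes, and the `"fs"` property key and its values are invented for the example. The point is only that the per-ExecType setup lives in the factory method, while `connect()` is left with a single job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins; these are NOT the real Pig classes.
class PigContextFactorySketch {

    enum ExecType { LOCAL, MAPREDUCE }

    /** Holds configuration and exposes getters; no setup logic lives here. */
    static final class Context {
        private final ExecType execType;
        private final Map<String, String> properties;
        private boolean connected;

        Context(ExecType execType, Map<String, String> properties) {
            this.execType = execType;
            this.properties = properties;
        }

        ExecType getExecType() { return execType; }
        String getProperty(String key) { return properties.get(key); }
        boolean isConnected() { return connected; }

        /** Only connects; the variables were already initialized by the factory. */
        void connect() { connected = true; }
    }

    /** Factory method: all per-ExecType initialization is gathered in one place. */
    static Context getInstance(ExecType execType) {
        Map<String, String> props = new LinkedHashMap<>();
        switch (execType) {
            case LOCAL:
                props.put("fs", "local"); // assumed key and value, for illustration only
                break;
            case MAPREDUCE:
                props.put("fs", "hdfs");  // assumed key and value, for illustration only
                break;
            default:
                throw new IllegalArgumentException("Unknown ExecType: " + execType);
        }
        return new Context(execType, props);
    }

    public static void main(String[] args) {
        Context context = getInstance(ExecType.MAPREDUCE);
        context.connect();
        // prints "MAPREDUCE hdfs true"
        System.out.println(context.getExecType() + " " + context.getProperty("fs")
                + " " + context.isConnected());
    }
}
```

With this shape, client code never calls a constructor directly, so more involved setup (for example the HOD path mentioned above) could be added behind `getInstance` without touching callers.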

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.