Posted to dev@pig.apache.org by "Antonio Magnaghi (JIRA)" <ji...@apache.org> on 2007/11/27 22:25:43 UTC

[jira] Created: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Abstraction Layer to decouple Pig from Back-End
-----------------------------------------------

                 Key: PIG-32
                 URL: https://issues.apache.org/jira/browse/PIG-32
             Project: Pig
          Issue Type: New Feature
          Components: impl
            Reporter: Antonio Magnaghi
            Assignee: Antonio Magnaghi


I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by Antonio Magnaghi <an...@yahoo-inc.com>.
Utkarsh,

Thanks for taking a preliminary look at the changes. 

Yes, the tests are easily modifiable: I will shortly upload
another patch in that direction...

-a.

-----Original Message-----
From: Utkarsh Srivastava (JIRA) [mailto:jira@apache.org] 
Sent: Wednesday, December 26, 2007 4:33 PM
To: pig-dev@incubator.apache.org
Subject: [jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End


    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554500 ]

Utkarsh Srivastava commented on PIG-32:
---------------------------------------

Haven't looked over the changes in detail. Looks good though. I am
guessing all the tests should be easily modifiable to deal with the new
code? Or would those need to change substantially as well?

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch2007_12_26.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



RE: [jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by Antonio Magnaghi <an...@yahoo-inc.com>.
Will prepare a patch for it.

Thanks, Ben, for finding this problem.

-a.

-----Original Message-----
From: Benjamin Francisoud (JIRA) [mailto:jira@apache.org] 
Sent: Friday, February 01, 2008 3:15 AM
To: pig-dev@incubator.apache.org
Subject: [jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End


    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564725#action_12564725 ]

Benjamin Francisoud commented on PIG-32:
----------------------------------------

I think this commit broke compatibility with java 5 :(

For example in PigServer line 281:
{code:java}
catch (ExecException e) {
    throw new IOException("Unable to open iterator for alias: " + id, e);
}
{code}

This constructor is only available in java 6:
* [java 5 IOException api|http://java.sun.com/j2se/1.5.0/docs/api/java/io/IOException.html#IOException()]
* [java 6 IOException api|http://java.sun.com/javase/6/docs/api/java/io/IOException.html#IOException(java.lang.String,%20java.lang.Throwable)]

I made the same mistake in PIG-80 ;) 
I have now switched my local Eclipse to a 1.5 JDK to avoid this kind of mistake...

See [pig-user mailing list|http://www.mail-archive.com/pig-user%40incubator.apache.org/msg00052.html] and [Getting started wiki page|http://wiki.apache.org/pig/GettingStarted] for details.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565611#action_12565611 ] 

Alan Gates commented on PIG-32:
-------------------------------

Latest patch (patch.2008.02.04) committed.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch.2008.02.04, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Resolved: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-32.
-------------------------------

    Resolution: Fixed

changes committed

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: SeekableInputStream.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564492#action_12564492 ] 

Antonio Magnaghi commented on PIG-32:
-------------------------------------

Olga, I am looking into why that one test (TestPigSplit) is slower than
expected. According to the logs, the MR job takes the same time to run on
the cluster in both cases (before and after the patch).
The extra time is spent before the job is actually launched on the
cluster.

I am looking into why this is happening.

-a.



> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564987#action_12564987 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

Antonio, 

For IOException, you want to use the other constructor (the one that takes only a message) and set the causing exception via initCause.
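
A minimal sketch of that suggestion, assuming a hypothetical helper class named IoExceptions (it is not part of the Pig code base). It relies only on IOException(String) and Throwable.initCause, both of which are available in Java 5:

```java
import java.io.IOException;

// Hypothetical helper for illustration only; not taken from any PIG-32 patch.
final class IoExceptions {
    private IoExceptions() {}

    // Builds an IOException that carries a cause without using the
    // IOException(String, Throwable) constructor, which was added in Java 6.
    static IOException wrap(String message, Throwable cause) {
        IOException ioe = new IOException(message);
        // initCause is inherited from Throwable and exists since Java 1.4.
        ioe.initCause(cause);
        return ioe;
    }

    public static void main(String[] args) {
        try {
            try {
                throw new IllegalStateException("backend failure");
            } catch (IllegalStateException e) {
                // "a" stands in for the alias id used in PigServer.
                throw wrap("Unable to open iterator for alias: " + "a", e);
            }
        } catch (IOException e) {
            System.out.println(e.getMessage() + " (cause: " + e.getCause().getMessage() + ")");
        }
    }
}
```

The caller still sees an IOException and can reach the original exception through getCause(), so the behavior matches the Java 6 constructor while compiling against a 1.5 JDK.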



> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564475#action_12564475 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

With help from Antonio, I was able to get the code to compile. All unit tests passed, except that one of the tests, TestPigSplit, is taking much longer. Old code: 15 sec; new code: 600 sec.

Antonio is looking into it. I am also running some end-to-end tests.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564590#action_12564590 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

I verified that with this patch the unit tests take the same time.

In the process of committing the patch ...

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: patch2007_12_27.diff

This patch includes yesterday's patch (as it has not been committed yet) plus the changes below.

With these changes I was able to get a full pass on the Pig test suite.

- Fixed the implementation of PigServer::capacity: it was still explicitly using Hadoop DFS instead of the abstract data storage class provided as part of the abstraction API, and it contained a dynamic cast that is no longer valid.

- Added keys to the DataStorage interface for collecting statistics on capacity, and added support for them in HDataStorage.

- Fixed POMapreduce: specifically, the copy operation was not properly copying the physical Op table.

- Fixed the implementation of openDFSFile in FileLocalizer: it was not instantiating a DataStorage object to access distributed data when running as part of a map-reduce job, causing the job to fail.

- Also cleaned up some source files by removing imports that are no longer used.
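
The first two items can be illustrated with a rough sketch. Everything in it is an assumption made for illustration: the Storage interface, the key names, and the FixedStorage backend are hypothetical stand-ins, and the real DataStorage interface and its statistics keys are defined in the attached patches and may look different. The point is only that the capacity code reads statistics through the abstract interface, with no cast to a backend-specific class:

```java
import java.util.HashMap;
import java.util.Map;

// Caller side: depends only on the abstract interface, no backend cast.
final class CapacityReport {
    static long freeBytes(Storage storage) {
        Map<String, Long> stats = storage.getStatistics();
        return stats.get(Storage.CAPACITY_KEY) - stats.get(Storage.USED_KEY);
    }

    public static void main(String[] args) {
        System.out.println(freeBytes(new FixedStorage(1000L, 250L)));
    }
}

// Hypothetical stand-in for the abstract data storage API.
interface Storage {
    String CAPACITY_KEY = "capacity.bytes"; // assumed key name
    String USED_KEY = "used.bytes";         // assumed key name

    Map<String, Long> getStatistics();
}

// Toy backend that exists only to make the sketch runnable.
final class FixedStorage implements Storage {
    private final long capacity;
    private final long used;

    FixedStorage(long capacity, long used) {
        this.capacity = capacity;
        this.used = used;
    }

    public Map<String, Long> getStatistics() {
        Map<String, Long> stats = new HashMap<String, Long>();
        stats.put(CAPACITY_KEY, capacity);
        stats.put(USED_KEY, used);
        return stats;
    }
}
```

A Hadoop-backed or local implementation would fill the same map from its own source, and the caller would not need to change.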



> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: patch.2008.01.23.diff

Attaching a diff that includes changes to re-use results previously materialized in the same Pig query.

Unit tests and regression tests pass.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: DataStorage20071212.diff

Attaching a new patch that includes the previous patch plus an implementation of Local Data Storage and unit tests.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548393 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

I created a branch called "plan" for this project. I could not find a way to give separate committer privileges for just this branch, so we will need to manage changes through patches the same way as we do for the trunk.

The current state of the branch is the same as that of the trunk. If you want to submit a patch, please follow the usual steps.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.java, DataStorageContainerDescriptor.java, DataStorageElementDescriptor.java, DataStorageException.java, Properties.java, PropertiesObject.java, SeekableInputStream.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564488#action_12564488 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

All end-to-end tests are done, and I did not see any slowdown with them.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: SeekableInputStream.java
                DataStorageException.java

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.java, DataStorageContainerDescriptor.java, DataStorageElementDescriptor.java, DataStorageException.java, Properties.java, PropertiesObject.java, SeekableInputStream.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: DataStorageException.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551166 ] 

Antonio Magnaghi commented on PIG-32:
-------------------------------------

Recording in the bug log some discussion points from a conversation with Utkarsh on the separation of front-end from back-end:

Data Storage Portion:
- Add an "isContainer" method to the DataStorageElement interface.

Execution Engine Portion:
- Introduce the concept of a Job in the type hierarchy. A Job would be the result of executing/submitting a physical plan to the execution engine.

- As part of the Local Back-End rewrite:

   - Currently co-group uses in-memory data structures; we should switch over to spillable data structures instead.
   - A general suggestion is to favor clarity of the code. One possibility we discussed was removing some operators, such as LORead and LOSplits, as well as intermediate results.
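
As a rough illustration of the "isContainer" point, here is a sketch of how such a method might let generic code walk a storage hierarchy. All type names below (Element, Container, Leaf, Dir, StorageWalk) are hypothetical and chosen for the example; they are not the interfaces attached to this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Generic traversal that uses isContainer() to decide whether to descend.
final class StorageWalk {
    // Counts the non-container elements reachable from the given element.
    static int countLeaves(Element e) {
        if (!e.isContainer()) {
            return 1;
        }
        int total = 0;
        for (Element child : ((Container) e).children()) {
            total += countLeaves(child);
        }
        return total;
    }

    public static void main(String[] args) {
        Dir root = new Dir("root").add(new Leaf("a")).add(new Dir("sub").add(new Leaf("b")));
        System.out.println(countLeaves(root));
    }
}

// Hypothetical element abstraction: a file-like leaf or a directory-like container.
interface Element {
    String name();
    boolean isContainer();
}

// Hypothetical container abstraction: an element that holds other elements.
interface Container extends Element {
    List<Element> children();
}

final class Leaf implements Element {
    private final String name;
    Leaf(String name) { this.name = name; }
    public String name() { return name; }
    public boolean isContainer() { return false; }
}

final class Dir implements Container {
    private final String name;
    private final List<Element> children = new ArrayList<Element>();
    Dir(String name) { this.name = name; }
    Dir add(Element child) { children.add(child); return this; }
    public String name() { return name; }
    public boolean isContainer() { return true; }
    public List<Element> children() { return children; }
}
```

With a check like this on the element interface, callers can ask the element itself whether to descend, rather than probing concrete backend types.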




> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563807#action_12563807 ] 

Antonio Magnaghi commented on PIG-32:
-------------------------------------

I have run the end-to-end tests and they pass.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Benjamin Francisoud (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573842#action_12573842 ] 

Benjamin Francisoud commented on PIG-32:
----------------------------------------

+1 for closing it

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch.2008.02.04, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Craig Macdonald (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565430#action_12565430 ] 

Craig Macdonald commented on PIG-32:
------------------------------------

Thanks Antonio, I can confirm that this patch fixes the issue with local execution that I described above.

C

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch.2008.02.04, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: DataStorageElementDescriptor.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorageContainerDescriptor.java, DataStorageException.java, HDataStorage_2007_12_04.tar, Properties.java, PropertiesObject.java, SeekableInputStream.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Benjamin Francisoud (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564725#action_12564725 ] 

Benjamin Francisoud commented on PIG-32:
----------------------------------------

I think this commit broke compatibility with java 5 :(

For example in PigServer line 281:
{code:java}
catch (ExecException e) {
    throw new IOException("Unable to open iterator for alias: " + id, e);
}
{code}

This constructor is only available in java 6:
* [java 5 IOException api|http://java.sun.com/j2se/1.5.0/docs/api/java/io/IOException.html#IOException()]
* [java 6 IOException api|http://java.sun.com/javase/6/docs/api/java/io/IOException.html#IOException(java.lang.String,%20java.lang.Throwable)]

I made the same mistake in PIG-80 ;) 
I have now set my local Eclipse to use a 1.5 JDK to avoid this kind of mistake...

See [pig-user mailing list|http://www.mail-archive.com/pig-user%40incubator.apache.org/msg00052.html] and [Getting started wiki page|http://wiki.apache.org/pig/GettingStarted] for details.
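
For reference, one Java 5 compatible way to keep the cause is to build the IOException from the message alone and attach the cause with Throwable.initCause, which has been available since Java 1.4. A minimal sketch, assuming a small helper class (the name IOExceptions is hypothetical and is not part of the Pig source):

```java
import java.io.IOException;

// Hypothetical helper, not taken from the Pig code base.
// Builds an IOException that carries a cause using only Java 5 APIs.
final class IOExceptions {
    private IOExceptions() {}

    static IOException withCause(String message, Throwable cause) {
        // IOException(String) exists in Java 5; IOException(String, Throwable) does not.
        IOException ioe = new IOException(message);
        // Throwable.initCause has been available since Java 1.4.
        ioe.initCause(cause);
        return ioe;
    }
}
```

With such a helper, the catch block quoted above could read: throw IOExceptions.withCause("Unable to open iterator for alias: " + id, e);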

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Utkarsh Srivastava (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554500 ] 

Utkarsh Srivastava commented on PIG-32:
---------------------------------------

Haven't looked over the changes in detail. Looks good though. I am guessing all the tests should be easily modifiable to deal with the new code? Or would those need to change substantially as well?

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch2007_12_26.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: DataStorageContainerDescriptor.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: HDataStorage_2007_12_04.tar

- Attaching the tar file that contains a big chunk of the implementation of the Data Storage API on top of the Hadoop file system. (The code uses the Hadoop file system as the provider of the services behind the Data Storage API.)

- The code also includes initial unit tests (still to be completed).

- Other minor items still pending: better use of exceptions; get/set active container (which will probably require refining the Data Storage API); use of java.util.Properties instead of the "custom" Properties class.
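
As an illustration of the last item, java.util.Properties already supports layering per-session settings over a set of defaults through its Properties(Properties defaults) constructor. A minimal sketch, assuming a hypothetical helper class and made-up property keys (neither comes from the attached code):

```java
import java.util.Properties;

// Hypothetical helper; the class name and the property keys used with it are illustrative only.
final class ConfigExample {
    private ConfigExample() {}

    // Returns a new Properties object that falls back to 'defaults' for keys it does not define.
    static Properties sessionConfig(Properties defaults, String key, String value) {
        Properties session = new Properties(defaults);
        session.setProperty(key, value);
        return session;
    }
}
```

Lookups through getProperty on the returned object consult the session value first and then the defaults, while the defaults object itself is left unchanged.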
		

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.java, DataStorageContainerDescriptor.java, DataStorageElementDescriptor.java, DataStorageException.java, HDataStorage_2007_12_04.tar, Properties.java, PropertiesObject.java, SeekableInputStream.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: PropertiesObject.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564280#action_12564280 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

I ran into problems applying the patch. Antonio, I will need your help tomorrow to sort this out.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: pig.jar

- I have merged (locally on my machine) the changes for the abstraction layer onto the trunk (see patches previously uploaded). (I have not uploaded the patch for this merge yet.)

- Unit tests pass. As already stated in previous patches, there are still a few limitations in the changes I have made, which I am working on removing.

- As the bulk of the changes are in place, I would like to start regression testing on what we have at this point for the abstraction layer. Olga, we talked about this. I am attaching the pig.jar file to start an initial round of regression testing. Please let me know when we get the results.

Thanks a lot,
Antonio


> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: Properties.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: patch2007_12_26.diff

The basic functionality is now in place. With this patch, it is now possible to run queries locally and on Hadoop clusters, with some limitations as listed below. I have added some tests (TestAbstractionExecutionLayer.java).

In this patch:

- Pig (Front-End) has been completely modified to utilize the Abstraction layer APIs
- Two back-ends (Local and Hadoop) implement the DataStorage and ExecutionEngine APIs
- Logical Layer: Removed the LORead and IntermediateResult operators
- Local Execution Engine:
     - Provided support for Split in Local plans (previously not supported)
     - Re-use of materialized results
     - (working on providing the ability to re-use results materialized in memory as well; this will just require extending the LocalResult types that are stored in the table of materialized results)
- Hadoop Execution Engine:
      - Basic support in place to run Pig queries on Hadoop.
      - (need to provide support for reuse of materialized results and split construct)

In addition, I'm working on adding further end-to-end testing and turning old tests back on (currently most of them are commented out to get Pig to build).

(Note: some files/directories have been deleted or moved. I have marked them as deleted in svn, and a commit command from my side would probably take care of this. However, I am not sure svn diff shows that properly, so I may need to coordinate with one of the committers when the patch is applied to the branch.)


> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch2007_12_26.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554504 ] 

Antonio Magnaghi commented on PIG-32:
-------------------------------------

Utkarsh,

Thanks for taking a preliminary look at the changes. 

Yes, the tests are easily modifiable: I am going to upload shortly
another patch in that direction...

-a.



> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch2007_12_26.diff, patch2007_12_26_II.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: HDataStorage_2007_12_04.tar)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: DataStorage.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorageContainerDescriptor.java, DataStorageException.java, HDataStorage_2007_12_04.tar, Properties.java, PropertiesObject.java, SeekableInputStream.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: patch2007_12_26_II.diff

This patch includes the one uploaded to the jira earlier this morning plus the fixes as below:

- MapReduceCompiler: support split operation for Map-Reduce (this portion of the code changed because I've removed the LORead operators from the logical plan)

- Introduced HadoopJob class with status information and ability to iterate over results from a map-reduce job

- Extended TestAbstractionExecutionEngine.java with tests for split and iteration

- Fixed src code of all tests in the test suite

- Obtained a preliminary pass of all tests in the test suite other than 4 test cases that I'm debugging.


> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch2007_12_26.diff, patch2007_12_26_II.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551817 ] 

Antonio Magnaghi commented on PIG-32:
-------------------------------------

Attaching to the bug this high-level summary that I sent out to the mailing list a few days back.

I have discussed this with Ben; one aspect we talked about was extending the API to provide a way to collect logging and debugging information.

________________________________________
From: Antonio Magnaghi 
Sent: Monday, December 10, 2007 9:29 AM
To: 'pig-dev@incubator.apache.org'
Subject: Abstraction layer: execution engine (PIG-32)

I'm starting to work on the portion of the abstraction layer about the execution engine for the separation of front-end from back-end. 

Based on some previous discussions with various folks, including Trevor Strohman from the Galago project, I think it is possible to identify some requirements/changes, which I've summarized below (in addition to what is currently posted at: http://wiki.apache.org/pig/PigAbstractionLayer).

I would like to get some feedback on these points and whether I have left out aspects that'd need to be considered as well.

Thanks,
-a.


Front-End:
Change logical plan representation: goal is to change the representation of logical plans so that: 
•	details pertaining to the physical query plan execution are not present anymore in the front-end; 
•	a new logical plan submitted to the back-end can reference a portion (or alias) of another logical plan

Aspects affected by the changes above are:
1.	need to remove data collectors and logic to manage data-pipes from the eval specs and cond's of logical operators. These data structures are used in the case of the local execution mode. We can add physical eval specs and cond's where data pipes and data collectors are set up. This has the disadvantage of creating extra code (similar to the code for logical eval specs and logical cond's), but the overall separation of the logical aspects from the physical execution should be much cleaner.
2.	need to remove the table of query results, where aliases are mapped to intermediate results. This data structure is populated when the logical plan is compiled. The concept of intermediate results does not seem to belong in the front-end. (Information about the generation of intermediate results will be maintained in the back-end)
3.	extend the representation of logical operators by assigning each of them a scope and a unique id within that scope. The motivation for doing this is that new logical plans submitted to the back-end can reference previous logical plans (or parts of them) via a (scope id, node id) pair. Having the concept of scope can provide support in the back-end for purging information about entities that go out of scope. For instance, the session id could be used as the scope to garbage collect entities in the back-end that are no longer needed.
4.	need to add a catalog that maps aliases to logical trees. For instance, when a store operation is encountered, the front-end can determine the set of dependent logical trees to serialize and send to the back-end or (scope, id) of previous plans to reference. 
5.	The serialization process from the front-end to the back-end can produce a representation of the logical plan and its dependencies that includes the (scope, id) of each operator sent to the back-end.
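
To make points 3 and 4 above more concrete, here is a minimal sketch of a (scope, id) key and an alias catalog. All names (OperatorKey, AliasCatalog, bind, lookup) are hypothetical and only illustrate the idea; they are not taken from the patches attached to this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical (scope, id) pair identifying a logical operator.
final class OperatorKey {
    final String scope; // e.g. a session id
    final long id;      // unique within the scope

    OperatorKey(String scope, long id) {
        this.scope = scope;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof OperatorKey)) return false;
        OperatorKey other = (OperatorKey) o;
        return id == other.id && scope.equals(other.scope);
    }

    @Override
    public int hashCode() {
        return 31 * scope.hashCode() + (int) (id ^ (id >>> 32));
    }

    @Override
    public String toString() {
        return scope + "-" + id;
    }
}

// Hypothetical front-end catalog mapping an alias to the key of its logical tree root.
final class AliasCatalog {
    private final Map<String, OperatorKey> roots = new HashMap<String, OperatorKey>();
    private final String scope;
    private long nextId = 0;

    AliasCatalog(String scope) {
        this.scope = scope;
    }

    // Binds (or re-binds) an alias to a freshly allocated key within this scope.
    OperatorKey bind(String alias) {
        OperatorKey key = new OperatorKey(scope, nextId++);
        roots.put(alias, key);
        return key;
    }

    // Returns the key for an alias, or null if the alias is unknown.
    OperatorKey lookup(String alias) {
        return roots.get(alias);
    }
}
```

Under this sketch, a later plan would refer to an earlier alias by sending only the OperatorKey returned by lookup, rather than re-serializing the whole tree.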

Back-End:
1.	back-end would maintain table of intermediate results
2.	compilation of logical plan to physical plan would take place in the back-end
3.	a local back-end would generate physical trees using the physical eval specs and physical cond's (as described above)
4.	a Hadoop back-end would compile logical plan to map/reduce
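
The back-end table of intermediate results, together with the scope-based purging mentioned in the front-end list, could look roughly like the following. This is only a sketch under assumptions: the class name, the method names, and the use of a plain String for the result location are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical back-end table: scope -> (operator id -> location of the materialized result).
final class IntermediateResultTable {
    private final Map<String, Map<Long, String>> byScope =
            new HashMap<String, Map<Long, String>>();

    // Records where the result of operator (scope, id) has been materialized.
    void record(String scope, long id, String location) {
        Map<Long, String> results = byScope.get(scope);
        if (results == null) {
            results = new HashMap<Long, String>();
            byScope.put(scope, results);
        }
        results.put(id, location);
    }

    // Returns the recorded location, or null if nothing is known for (scope, id).
    String locationOf(String scope, long id) {
        Map<Long, String> results = byScope.get(scope);
        return results == null ? null : results.get(id);
    }

    // Drops every entry of a scope (e.g. when a session ends); returns how many were dropped.
    int purgeScope(String scope) {
        Map<Long, String> removed = byScope.remove(scope);
        return removed == null ? 0 : removed.size();
    }
}
```

Keying the outer map by scope keeps the purge a single removal, which is the main reason for choosing a nested map over a flat one in this sketch.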




> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551150 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

I committed the patch. We will do the review once we are ready to merge the changes into the main branch.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549201 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

Antonio, could you just create a patch file that I can apply?

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.java, DataStorageContainerDescriptor.java, DataStorageElementDescriptor.java, DataStorageException.java, HDataStorage_2007_12_04.tar, Properties.java, PropertiesObject.java, SeekableInputStream.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Craig Macdonald (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565379#action_12565379 ] 

Craig Macdonald commented on PIG-32:
------------------------------------

{{{
[user@blabla trunk]$ scripts/pig.pl -x local
I can't find HOD configuration for , hopefully you weren't planning on using HOD.
java.lang.ClassCastException: org.apache.pig.backend.local.executionengine.LocalExecutionEngine cannot be cast to org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
        at org.apache.pig.tools.grunt.GruntParser.setParams(GruntParser.java:93)
        at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:42)
        at org.apache.pig.Main.main(Main.java:245)
}}}

No pig script run - just trying to open a grunt shell for local execution (revision 618301).
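
The trace suggests that an execution engine is downcast to the Hadoop implementation without first checking which back-end is active. I have not confirmed what GruntParser.setParams actually does, so the following is only a sketch of that failure pattern and of an instanceof guard, using hypothetical stand-in types (ExecEngine, LocalEngine, HadoopEngine, GruntSetup) rather than the real Pig classes:

```java
// Hypothetical stand-ins for the engine types named in the stack trace.
interface ExecEngine {}

final class LocalEngine implements ExecEngine {}

final class HadoopEngine implements ExecEngine {
    // Placeholder for state that only a Hadoop back-end would have.
    String describeCluster() {
        return "hadoop";
    }
}

final class GruntSetup {
    private GruntSetup() {}

    // Unconditional downcast: throws ClassCastException when given a LocalEngine.
    static String unsafe(ExecEngine engine) {
        return ((HadoopEngine) engine).describeCluster();
    }

    // Guarded version: only touches Hadoop-specific state for a Hadoop engine.
    static String safe(ExecEngine engine) {
        if (engine instanceof HadoopEngine) {
            return ((HadoopEngine) engine).describeCluster();
        }
        return "local";
    }
}
```

Calling GruntSetup.unsafe(new LocalEngine()) fails in the same way as the trace above, while GruntSetup.safe(new LocalEngine()) returns normally.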

C


> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: TEST.LOGS
                2008.01.29.patch

Thanks to Olga and Alan for taking the time to review the changes; today we finished going through the patch.

I have incorporated today's feedback and feedback from previous code reviews.

Here I am attaching:

- the latest patch, after completing code review and merging with the trunk (version 616575)

- the log with the results from running unit tests on the latest patch.



> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: TestPropertiesObject.java
                PropertiesObject.java
                Properties.java

Initial implementation of the configuration portion, as per the design spec.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: Properties.java, PropertiesObject.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565378#action_12565378 ] 

Antonio Magnaghi commented on PIG-32:
-------------------------------------

Can you please attach a code sample to reproduce the problem?

Thanks
-a.



> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547198 ] 

Antonio Magnaghi commented on PIG-32:
-------------------------------------

Attaching some feedback from Trevor (Galago project)

________________________________________
From: Trevor Strohman [mailto:strohman@cs.umass.edu] 
Sent: Wednesday, November 21, 2007 5:01 PM
To: Antonio Magnaghi
Subject: Re: galago


Antonio,

Wow, you've done a lot of work here.  This looks great.  I hope you end up with lots of other backends.

I'll just give you comments as I read the PigAbstractionLayer page.  Feel free to e-mail again if you want different (or more) information.

The DataStorage interface looks great.  I'd consider using this in Galago for file storage (I've always wanted to make the Hadoop DFS an option for data storage in Galago).  However, since Galago uses the native filesystem right now, I wouldn't have to implement this interface.

Should addFromResource be a part of the configuration interface?  This isn't something I want to implement myself (assuming there will be lots of these PigBackEndProperties objects around).  Maybe you could have a standard implementation that I could use.
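A shared base implementation of the kind suggested here could look roughly like the following minimal sketch. The class name BaseBackEndPropertiesSketch and the addFromStream helper are hypothetical, and looking the resource up through ClassLoader.getSystemResourceAsStream is an assumption; the design page does not prescribe any of this.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical base class that back-end implementers could reuse.
class BaseBackEndPropertiesSketch {
    private final Properties props = new Properties();

    // Loads key=value pairs from a classpath resource.
    // Returns false when the resource cannot be found.
    boolean addFromResource(String resourceName) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (in == null) {
                return false;
            }
            props.load(in);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads key=value pairs from an already opened stream.
    void addFromStream(InputStream in) {
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the value for the key, or null when the key is absent.
    String get(String key) {
        return props.getProperty(key);
    }
}
```

A back-end would then only need to subclass or wrap this class rather than re-implement resource loading.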

A suggestion for the getStatistics() method in ExecutionEngine: perhaps part of the statistics object could be a set of objects that can be tracked using Java Management Extensions (JMX).  At some point I plan to make Galago JMX-ready, which would give you a lot of information about current running jobs, etc.

You might want a method on ExecutionEnginePhysicalPlan that allows the caller to block waiting for completion.
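One way such a blocking call could be shaped is sketched below with a CountDownLatch. BlockingPlanSketch, markComplete and awaitCompletion are hypothetical names used only for illustration; they are not part of the proposed ExecutionEnginePhysicalPlan interface.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a running physical plan.
class BlockingPlanSketch {
    private final CountDownLatch done = new CountDownLatch(1);

    // Called by the back-end once the plan has finished executing.
    void markComplete() {
        done.countDown();
    }

    // Blocks the caller until the plan completes or the timeout elapses.
    // Returns true if the plan completed, false on timeout or interruption.
    boolean awaitCompletion(long timeout, TimeUnit unit) {
        try {
            return done.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

A timeout-based variant is shown so that a caller is never stuck indefinitely on a lost job.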

==
I think the API as specified seems like something I could implement for Galago.

It's not clear from the API how a new LogicalPlan can refer to results generated by previous LogicalPlans that have already been compiled and executed.  I never made this work in Galago with the current implementation.  Also, it seems like you might want to be able to ask a completed PhysicalPlan for a particular computed tuple stream.  Again, I never figured out how to do that in the current Pig (at least not in a way that would work with Galago).

Trevor

On Nov 21, 2007, at 5:57 PM, Antonio Magnaghi wrote:


Hi Trevor,
 
I would like to follow up on the email exchange we had a few weeks ago about Galago and Pig.
 
In particular, at YRL we have decided to suggest, inside the Apache Pig incubator, some extensions to Pig that could make it easier to integrate Pig with different back-ends. The main approach is outlined at: http://wiki.apache.org/pig/PigAbstractionLayer.
 
At this point in time, I'm collecting some initial feedback before starting the actual implementation. Do you have any requirements that would allow Pig to better support Galago? As you have direct experience with some of the issues involved, I'd appreciate it if you could share some of your thoughts on the proposed design.
 
Thanks,
Antonio
 
________________________________________
From: Trevor Strohman [mailto:strohman@cs.umass.edu] 
Sent: Tuesday, October 23, 2007 10:59 AM
To: Antonio Magnaghi
Subject: Re: galago
 
 
Antonio,
 
I'll do my best to answer your questions by e-mail, but you might also find it useful to download the Galago code and my version of Pig.  In the galago/java/pig-galago directory, you'll find a file called "pig-galago.patch" which contains all of the changes I made to the current Pig distribution to make it work with Galago.  The whole download is here:
            http://galagosearch.org/downloads
 
Before I start, I should mention that Galago is primarily meant to be a search engine toolkit, kind of like Lucene.  It happens to have its own MapReduce-like job execution engine called TupleFlow, and Pig can run on top of that.  TupleFlow has some similarities to the Pig model, in that strongly-typed tuples flow between computational steps to create an answer.



1.) the high-level language the user can utilize to specify the tuple-processing;
 
Users usually create TupleFlow jobs by creating an XML job specification.  The job specification allows the user to describe what Java objects will be used and how they should be connected together in an execution graph.  TupleFlow then schedules these components out onto computational nodes, sometimes with the help of a job execution system (like Grid Engine or Condor).  TupleFlow is probably most similar to Microsoft's Dryad system.
 
In the Pig/Galago port, I translate Pig jobs into TupleFlow jobs in code, so no XML files are made.
 
2.) how the tuple processing specification is mapped to a physical processing plan;
 
I know that Pig has both a high-level and low-level specification.  Compared to Pig, TupleFlow really only has a low-level processing language.  Pig is TupleFlow's high level language (when I want one).



3.) what type of platform/computational model is used.
 
I'm not exactly sure how to answer this.  It's all in Java, and objects are passed around using files on a shared file system.  Unlike Pig, Galago typically creates a different Java class for each type of tuple sent through the system.  When running Pig jobs on Galago, I've hacked Galago a little bit to allow it to use Pig's Tuple type.



 I understand that the data/tuple processing is carried out by porting/extending the Pig or Pig-like front-end to run on a back-end that is not Hadoop/map-reduce? Is this correct?
 
Yes, that's right.  It might be best to think of TupleFlow as something that implements most of the physical layer of Pig as well as a MapReduce execution engine.
 
Trevor
 

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Craig Macdonald (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565303#action_12565303 ] 

Craig Macdonald commented on PIG-32:
------------------------------------

Hello,

The PIG-32 commit (r617338) broke local execution:
{{{
java.lang.ClassCastException: org.apache.pig.backend.local.executionengine.LocalExecutionEngine cannot be cast to org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
        at org.apache.pig.tools.grunt.GruntParser.setParams(GruntParser.java:93)
        at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:42)
        at org.apache.pig.Main.main(Main.java:245)
}}}

The relevant code from GruntParser is:
{{{
        // TODO: this violates the abstraction layer decoupling between
        // front end and back end and needs to be changed.
        // Right now I am not clear on how the Job Id comes from to tell
        // the back end to kill a given job (mJobClient is used only in
        // processKill)
        //
        mJobClient = ((HExecutionEngine)(mPigServer.getPigContext().getExecutionEngine())).getJobClient();
}}}
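The unconditional cast is what fails when the engine is the local one. A minimal sketch of one possible guard is shown below; the types are hypothetical stand-ins rather than the actual Pig classes, and the real fix may well take a different form.

```java
// Hypothetical stand-ins for the execution engine types; not the Pig classes.
interface EngineSketch {}

class HadoopEngineSketch implements EngineSketch {
    // Placeholder for the Hadoop job client handle.
    Object getJobClient() {
        return "job-client";
    }
}

class LocalEngineSketch implements EngineSketch {}

class GruntParamsSketch {
    // Returns the job client for a Hadoop engine, or null in local mode,
    // instead of casting every engine to the Hadoop type.
    static Object jobClientOrNull(EngineSketch engine) {
        if (engine instanceof HadoopEngineSketch) {
            return ((HadoopEngineSketch) engine).getJobClient();
        }
        return null; // local mode: no job client is available
    }
}
```

Code that later kills a job would then need to check for null before using the client.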

Thanks

C

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: TEST.LOG
                PATCH.2008.01.31

I have isolated the problem.

During the compilation process of MR jobs, in some instances (like when a logical operator is an LOEval: in the case of TestPigSplit we have a long chain of 500 LOEval's) the copy method is called on the compiled input. The copy method performs a copy via serialization/deserialization of the input MR job. 

In the current tree representation that we are using, each physical operator contains a pointer to the global table of physical operators that define the operator tree. In the initial implementation, the copy method in the Abstraction Layer patch did not avoid a useless serialization/deserialization of the opTable.

In this specific test case, this was causing a significant time overhead.
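The general idea can be illustrated with the sketch below, which copies an operator through Java serialization while keeping the shared table out of the serialized stream. The class OperatorSketch and its fields are hypothetical, and marking the table reference transient is only one possible way to skip it; the patch itself may use another mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;

// Hypothetical operator; names are illustrative only.
class OperatorSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String id;
    // Shared table of all operators. Being transient, it is not written
    // to the stream, so a copy does not drag the whole table along.
    transient Map<String, OperatorSketch> opTable;

    OperatorSketch(String id, Map<String, OperatorSketch> opTable) {
        this.id = id;
        this.opTable = opTable;
    }

    // Copies this operator via serialization/deserialization, then
    // re-attaches the original shared table to the copy.
    OperatorSketch copy() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(this);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                OperatorSketch clone = (OperatorSketch) in.readObject();
                clone.opTable = this.opTable;
                return clone;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }
}
```

With a chain of several hundred operators that each reference the same table, leaving the table out of every copy keeps the serialized size proportional to the operator itself.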

I have attached a patch that fixes the problem.

The unit tests pass and the unit test logs attached show execution times that seem to be in line with the execution times before the AL patch.

I have also checked that the regression tests still pass:
=== Regression test results ===
tail /tmp/miners_test_harness_log_1201817146

[...]
Results so far, PASSED: 102 FAILED: 0 ABORTED: 0 FAILED DEPENDENCY: 0
Final results, PASSED: 102 FAILED: 0 ABORTED: 0 FAILED DEPENDENCY: 0
Finished test run at 1201821034


> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: patch.2008.01.16.merge_w_trunk
                pig.jar.2008.01.16

- uploading a new pig.jar that includes some fixes for the grunt parser in relation to the AB work. This should allow grunt to properly connect to the cluster and unblock end-to-end testing. (I have verified the fixes by manually running some Pig programs via grunt.)

- uploading the patch for sources and unit tests that merges the changes for AB onto the trunk (as of 01/14).



> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: PIG-32.jdk1.5.patch

attaching patch:
 - to make the changes in the abstraction layer backward compatible with jdk1.5
 - DataStorageException and ExecException subclass Exception as initially intended
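
Because these exceptions subclass Exception, they are checked: callers have to catch or declare them. The sketch below illustrates that consequence with a hypothetical DataStorageExceptionSketch class; the constructor, the open method and the messages are assumptions and do not reflect the patch's actual code.

```java
// Hypothetical checked exception, mirroring "subclass Exception".
class DataStorageExceptionSketch extends Exception {
    private static final long serialVersionUID = 1L;

    DataStorageExceptionSketch(String message) {
        super(message);
    }
}

class StorageCallerSketch {
    // Hypothetical storage operation that declares the checked exception.
    static void open(String path) throws DataStorageExceptionSketch {
        if (path == null || path.isEmpty()) {
            throw new DataStorageExceptionSketch("empty path");
        }
    }

    // The compiler forces the caller to handle the exception;
    // here it is mapped to a status string.
    static String tryOpen(String path) {
        try {
            open(path);
            return "opened";
        } catch (DataStorageExceptionSketch e) {
            return "failed: " + e.getMessage();
        }
    }
}
```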

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: patch.2008.02.04

Patch for:

- restore backward compatibility with jdk5

- fix local execution mode: mJobClient is not available in local mode. I have restored a behavior compatible with what we were doing before the break-up of front-end from back-end. Unfortunately this problem went unnoticed, as there are currently no unit or regression tests for grunt. I opened a bug to track this some time ago.

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch.2008.02.04, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment:     (was: TestPropertiesObject.java)

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: DataStorage.diff

I have removed the previous partial attachments (sorry for the "noise/spam" on the list) and am attaching a patch for feedback/review:

- (1.) the data storage API (in the form of a set of interfaces)

- (2.) the code that implements the abstract API services via Hadoop

- (3.) unit tests for (2.) 

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.diff
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Antonio Magnaghi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: DataStorageElementDescriptor.java
                DataStorageContainerDescriptor.java
                DataStorage.java

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: DataStorage.java, DataStorageContainerDescriptor.java, DataStorageElementDescriptor.java, Properties.java, PropertiesObject.java, TestPropertiesObject.java
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer



[jira] Commented: (PIG-32) Abstraction Layer to decouple Pig from Back-End

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571623#action_12571623 ] 

Olga Natkovich commented on PIG-32:
-----------------------------------

Can this bug be closed?

> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk, patch.2008.01.23.diff, PATCH.2008.01.31, patch.2008.02.04, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff, PIG-32.jdk1.5.patch, pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer
