Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2009/05/29 20:26:45 UTC

[jira] Created: (PIG-824) SQL interface for Pig

SQL interface for Pig
---------------------

                 Key: PIG-824
                 URL: https://issues.apache.org/jira/browse/PIG-824
             Project: Pig
          Issue Type: New Feature
            Reporter: Olga Natkovich


In the last 18 months, PigLatin has gained significant popularity within the open source community. Many users like its data flow model, its rich type system, and its ability to work with any data available on HDFS or outside. We have also heard from many users that having Pig speak SQL would bring many more users. Having a single system that exports multiple interfaces is a big advantage, as it guarantees consistent semantics, allows custom code reuse, and reduces the amount of maintenance. This is especially relevant for projects that use both interfaces for different parts of the system. For instance, in a data warehousing system, you would have an ETL component that brings data into the warehouse and a component that analyzes the data and produces reports. PigLatin is uniquely suited for ETL processing, while SQL might be a better fit for report generation.

To start, it would make sense to implement a subset of the SQL-92 standard and to be as standard-compliant as possible. This would include all the standard constructs: select, from, where, group-by + having, order by, limit, join (inner + outer). Several extensions, such as support for Pig's UDFs and possibly streaming, multiquery, and support for Pig's complex types, would be helpful.

This work is dependent on metadata support outlined in https://issues.apache.org/jira/browse/PIG-823

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894682#action_12894682 ] 

Olga Natkovich commented on PIG-824:
------------------------------------

Jeff is correct. We are not actively developing Pig SQL or Owl.



[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894454#action_12894454 ] 

eric baldeschwieler commented on PIG-824:
-----------------------------------------

I'm on vacation til wednesday 7/28.

I'm in Hershey, PA, cell should work if needed.




[jira] Updated: (PIG-824) SQL interface for Pig

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-824:
------------------------------

    Attachment: java-cup-11a.jar
                students2.bin
                students_attr.bin

Copy the attached jar files to the lib/ dir to build the patch.

Copy the bin storage format test files to the following dirs:
students2.bin -> test/org/apache/pig/test/data/SQL/students2.bin and contrib/owl/contrib/pig/test/java/org/apache/hadoop/owl/pig/data/SQL/students2.bin
students_attr.bin -> test/org/apache/pig/test/data/SQL/students_attr.bin and contrib/owl/contrib/pig/test/java/org/apache/hadoop/owl/pig/data/SQL/students_attr.bin
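
The copy steps above could be scripted roughly as follows. This is only a sketch, not part of the patch: the helper names (plan_copies, copy_attachments) and the download directory are hypothetical, and it assumes "the attached jar files" means the two java-cup jars listed on the issue. The destination paths are the ones given in this comment, taken relative to the root of a Pig checkout.

```python
import shutil
from pathlib import Path

# Assumption: "the attached jar files" are the two java-cup jars on the issue.
JARS = ["java-cup-11a.jar", "java-cup-11a-runtime.jar"]
BIN_FILES = ["students2.bin", "students_attr.bin"]
# Test-data directories named in the comment, relative to the Pig checkout.
BIN_DIRS = [
    "test/org/apache/pig/test/data/SQL",
    "contrib/owl/contrib/pig/test/java/org/apache/hadoop/owl/pig/data/SQL",
]


def plan_copies(download_dir, pig_root):
    """Return (source, destination) pairs without touching the filesystem."""
    download_dir, pig_root = Path(download_dir), Path(pig_root)
    # Jars go into lib/ so the build can find them.
    pairs = [(download_dir / jar, pig_root / "lib" / jar) for jar in JARS]
    # Each .bin file goes into both test-data directories.
    for name in BIN_FILES:
        for rel in BIN_DIRS:
            pairs.append((download_dir / name, pig_root / rel / name))
    return pairs


def copy_attachments(download_dir, pig_root):
    """Copy each attachment into place, creating target dirs as needed."""
    for src, dst in plan_copies(download_dir, pig_root):
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
```

Calling copy_attachments("downloads", "pig-trunk") (both directory names are placeholders) would place the jars in pig-trunk/lib and each .bin file in both test-data directories.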





[jira] Assigned: (PIG-824) SQL interface for Pig

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair reassigned PIG-824:
---------------------------------

    Assignee: Thejas M Nair



[jira] Updated: (PIG-824) SQL interface for Pig

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-824:
------------------------------

    Attachment: pigsql_tutorial.txt

Attaching SQL tutorial (pigsql_tutorial.txt).
This Pig SQL tutorial shows you how to run SQL scripts in local mode and mapreduce mode.
The metadata is stored using Owl. The tutorial uses a Jetty/Derby-based Owl setup, so only minimal setup is needed to get started.




[jira] Updated: (PIG-824) SQL interface for Pig

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-824:
------------------------------

    Attachment: pigsql.patch
                pig_sql_beta.pdf
                java-cup-11a-runtime.jar

SQL patch (pigsql.patch), based on the version of Owl in svn, and documentation (pig_sql_beta.pdf). The patch is against trunk revision 941018.




[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744238#action_12744238 ] 

Thejas M Nair commented on PIG-824:
-----------------------------------

JFlex.jar (required to build this patch) can be downloaded from http://www.jflex.de/download.html.



[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894453#action_12894453 ] 

Jeff Hammerbacher commented on PIG-824:
---------------------------------------

Hey Min,

To the best of my knowledge, the development on this issue has stopped in favor of adapting Owl (Pig's metastore) to work with Hive. To follow the Howl project, first check out the overview at http://wiki.apache.org/pig/Howl and join the mailing list at http://tech.groups.yahoo.com/group/howldev/.

Regards,
Jeff



[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715853#action_12715853 ] 

eric baldeschwieler commented on PIG-824:
-----------------------------------------

Hi Jeff,

Reasonable parties can clearly disagree on approach. We've been planning this approach since before Hive's inception, as we discussed at the time. Your team chose to explore an alternative approach rather than implement a SQL parser for Pig.

The Hadoop community is richer for that.

Having looked at the cost-benefit for our organization, we still believe that a single set of tools supporting both Pig and SQL syntax will reduce the overall cost of running the diverse workloads we support, and we are willing to invest to get to that saving.

I believe the Hadoop community will be richer for having that alternative too.

Let's keep talking!

E14



[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714571#action_12714571 ] 

Jeff Hammerbacher commented on PIG-824:
---------------------------------------

Sigh. Really? Why build another SQL interface to Hadoop when we have two already (CloudBase, Hive)? Extending Pig to share Hive's metadata repository seems to be a much, much shorter path to a solution.



[jira] Updated: (PIG-824) SQL interface for Pig

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-824:
------------------------------

    Attachment: SQL_IN_PIG.html
                PIG-824.1.patch
                PIG-824.binfiles.tar.gz

PIG-824.binfiles.tar.gz - contains libs that it depends on
PIG-824.1.patch - patch
SQL_IN_PIG.html - (brief) document

JFlex.jar has not been included because it is covered by the GPL. It will have to be downloaded to the lib dir to build with the patch. In the future, Ivy will be set up to download it.



[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "Amr Awadallah (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894455#action_12894455 ] 

Amr Awadallah commented on PIG-824:
-----------------------------------

I am out of office on vacation and will be slower than usual in
responding to emails. If this is urgent then please call my cell phone
(or send an sms), otherwise I will reply to your email when I get
back.

Thanks for your patience,

-- amr




[jira] Commented: (PIG-824) SQL interface for Pig

Posted by "Min Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894251#action_12894251 ] 

Min Zhou commented on PIG-824:
------------------------------

Any further progress on this issue?
