Posted to commits@cassandra.apache.org by "Brandon Williams (JIRA)" <ji...@apache.org> on 2011/06/16 00:44:50 UTC

[jira] [Created] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Pig storage handler should implement LoadMetadata
-------------------------------------------------

                 Key: CASSANDRA-2777
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
             Project: Cassandra
          Issue Type: Improvement
          Components: Contrib
            Reporter: Brandon Williams
            Assignee: Brandon Williams
            Priority: Minor


The reason for this is that many builtin functions like SUM won't work on longs (you can work around it with LongSum, but that's lame) because the query planner doesn't know the types beforehand, even though we are casting to native longs.

There is some impact to this, though.  With LoadMetadata implemented, existing scripts that specify a schema will need to remove it (since LM supplies it for them) and will need to conform to LM's terminology (key, columns, name, value) within the script.  This is trivial to change, however, and the increased functionality is worth the switch.
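
To illustrate why the planner needs the types up front, here is a toy Java sketch.  It is not Pig's API: the DeclaredType enum, the plan() method and both sum helpers are hypothetical names made up for this illustration.  The only point is that the implementation is chosen from the declared type before any data is seen, so values that are already native longs still fail when the declared type is unknown.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

// Toy model only: none of these names come from Pig or Cassandra.
final class SchemaDispatchSketch {
    // BYTEARRAY stands in for "the planner was told nothing about this field".
    enum DeclaredType { BYTEARRAY, LONG }

    // Variant picked for untyped fields: expects raw bytes holding decimal text.
    static Object sumBytes(List<Object> values) {
        double total = 0;
        for (Object v : values) {
            // This cast throws ClassCastException if v is already a Long.
            byte[] raw = (byte[]) v;
            total += Double.parseDouble(new String(raw, StandardCharsets.UTF_8));
        }
        return total;
    }

    // Variant picked when the schema declares the field as a long.
    static Object sumLongs(List<Object> values) {
        long total = 0;
        for (Object v : values) {
            total += (Long) v;
        }
        return total;
    }

    // The "planner": it looks only at the declared type, never at the data.
    static Function<List<Object>, Object> plan(DeclaredType declared) {
        return declared == DeclaredType.LONG
                ? SchemaDispatchSketch::sumLongs
                : SchemaDispatchSketch::sumBytes;
    }

    public static void main(String[] args) {
        List<Object> values = List.of(1L, 2L, 3L);
        // With a declared schema the long-specific variant is selected.
        System.out.println("typed plan: " + plan(DeclaredType.LONG).apply(values));
        try {
            // Without one, the byte-oriented variant is selected and fails on Long values.
            plan(DeclaredType.BYTEARRAY).apply(values);
        } catch (ClassCastException e) {
            System.out.println("untyped plan failed with " + e.getClass().getSimpleName());
        }
    }
}
```

In this model, supplying the schema from the loader plays the role that LoadMetadata is meant to play here: the script no longer has to declare the types itself.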

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Jeremy Hanna (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053862#comment-13053862 ] 

Jeremy Hanna commented on CASSANDRA-2777:
-----------------------------------------

Brandon and I were still trying to track down a problem that I was seeing in one of the tests I was running.  I'd like to get that resolved before it gets in if possible.

> Pig storage handler should implement LoadMetadata
> -------------------------------------------------
>
>                 Key: CASSANDRA-2777
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Contrib
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.7
>
>         Attachments: 2777.txt
>
>
> The reason for this is many builtin functions like SUM won't work on longs (you can workaround using LongSum, but that's lame) because the query planner doesn't know about the types beforehand, even though we are casting to native longs.
> There is some impact to this, though.  With LoadMetadata implemented, existing scripts that specify schema will need to remove it (since LM is doing it for them) and they will need to conform to LM's terminology (key, columns, name, value) within the script.  This is trivial to change, however, and the increased functionality is worth the switch.


        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Steeve Morin (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119279#comment-13119279 ] 

Steeve Morin commented on CASSANDRA-2777:
-----------------------------------------

Please note that this patch doesn't work for Pig 0.9; it doesn't like the {{AS ();}}.
{{2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L}}
                
> Pig storage handler should implement LoadMetadata
> -------------------------------------------------
>
>                 Key: CASSANDRA-2777
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Contrib
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.8.7
>
>         Attachments: 2777-v2.txt, 2777.txt
>
>
> The reason for this is many builtin functions like SUM won't work on longs (you can workaround using LongSum, but that's lame) because the query planner doesn't know about the types beforehand, even though we are casting to native longs.
> There is some impact to this, though.  With LoadMetadata implemented, existing scripts that specify schema will need to remove it (since LM is doing it for them) and they will need to conform to LM's terminology (key, columns, name, value) within the script.  This is trivial to change, however, and the increased functionality is worth the switch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Adam Denenberg (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142614#comment-13142614 ] 

Adam Denenberg commented on CASSANDRA-2777:
-------------------------------------------

Same here for Pig 0.9.1:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 2, column 65>  mismatched input '(' expecting SEMI_COLON
                

        

[jira] [Updated] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2777:
----------------------------------------

    Attachment: 2777.txt

Patch implements the LoadMetadata interface.  Doesn't handle supercolumns since we already punted on deserializing those.

> Pig storage handler should implement LoadMetadata
> -------------------------------------------------
>
>                 Key: CASSANDRA-2777
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Contrib
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>            Priority: Minor
>         Attachments: 2777.txt
>
>
> The reason for this is many builtin functions like SUM won't work on longs (you can workaround using LongSum, but that's lame) because the query planner doesn't know about the types beforehand, even though we are casting to native longs.
> There is some impact to this, though.  With LoadMetadata implemented, existing scripts that specify schema will need to remove it (since LM is doing it for them) and they will need to conform to LM's terminology (key, columns, name, value) within the script.  This is trivial to change, however, and the increased functionality is worth the switch.


        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Jeremy Hanna (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050159#comment-13050159 ] 

Jeremy Hanna commented on CASSANDRA-2777:
-----------------------------------------

While we're at it, can we remove the redundant addMutation call on line 505 and, on line 513, add the e param to:
{quote}
throw new IOException(e + " Output must be (key, {(column,value)...}) for ColumnFamily or (key, {supercolumn:{(column,value)...}...}) for SuperColumnFamily", e);
{quote}
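
For anyone unsure what adding the e param buys, here is a minimal, self-contained Java sketch.  The class name, the parse() helper and the choice of ClassCastException as the caught type are assumptions made for the illustration and are not taken from CassandraStorage; only the message text and the two-argument IOException constructor match the line above.

```java
import java.io.IOException;

// Illustration only: the helper names and the caught exception type are hypothetical.
final class ExceptionCauseSketch {
    static final String HINT = " Output must be (key, {(column,value)...}) for ColumnFamily"
            + " or (key, {supercolumn:{(column,value)...}...}) for SuperColumnFamily";

    // Stand-in for the step that fails on malformed output.
    static void parse(Object tuple) {
        if (!(tuple instanceof Object[])) {
            throw new ClassCastException("expected a tuple, got " + tuple.getClass().getSimpleName());
        }
    }

    // Without the cause: only e.toString() survives; the original stack trace is lost.
    static void putWithoutCause(Object tuple) throws IOException {
        try {
            parse(tuple);
        } catch (ClassCastException e) {
            throw new IOException(e + HINT);
        }
    }

    // With the cause: the original exception stays attached and is printed as "Caused by:".
    static void putWithCause(Object tuple) throws IOException {
        try {
            parse(tuple);
        } catch (ClassCastException e) {
            throw new IOException(e + HINT, e);
        }
    }

    public static void main(String[] args) {
        try {
            putWithCause("not a tuple");
        } catch (IOException e) {
            System.out.println("cause kept: " + (e.getCause() != null));
        }
        try {
            putWithoutCause("not a tuple");
        } catch (IOException e) {
            System.out.println("cause kept: " + (e.getCause() != null));
        }
    }
}
```

Both variants produce the same message; the difference only shows up in getCause() and in the printed stack trace.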


        

[jira] [Issue Comment Edited] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Jeremy Hanna (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050159#comment-13050159 ] 

Jeremy Hanna edited comment on CASSANDRA-2777 at 6/16/11 12:21 AM:
-------------------------------------------------------------------

While we're at it, can we remove the redundant addMutation call on line 505 and, on line 513, add the e param to:
{code}
throw new IOException(e + " Output must be (key, {(column,value)...}) for ColumnFamily or (key, {supercolumn:{(column,value)...}...}) for SuperColumnFamily", e);
{code}

      was (Author: jeromatron):
    While we're at it, can we remove the redundant addMutation call on line 505 and, on line 513, add the e param to:
{quote}
throw new IOException(e + " Output must be (key, {(column,value)...}) for ColumnFamily or (key, {supercolumn:{(column,value)...}...}) for SuperColumnFamily", e);
{quote}
  

        

[jira] [Issue Comment Edited] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Steeve Morin (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116819#comment-13116819 ] 

Steeve Morin edited comment on CASSANDRA-2777 at 10/3/11 12:54 PM:
-------------------------------------------------------------------

Fixed it for me on Pig -0.9- 0.8.3 and Cassandra 0.8.6 (Brisk).

Pig 0.9 complains:
{{2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L}}
                
      was (Author: steeve):
    Fixed it for me on Pig 0.9 and Cassandra 0.8.6 (Brisk).
                  

        

[jira] [Updated] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Jeremy Hanna (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Hanna updated CASSANDRA-2777:
------------------------------------

    Reviewer: jeromatron


        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050160#comment-13050160 ] 

Brandon Williams commented on CASSANDRA-2777:
---------------------------------------------

Sure, will add those on commit.


        

[jira] [Issue Comment Edited] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Steeve Morin (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119279#comment-13119279 ] 

Steeve Morin edited comment on CASSANDRA-2777 at 10/3/11 12:56 PM:
-------------------------------------------------------------------

Please note that this patch doesn't work for Pig 0.9; it doesn't like the AS ();.
2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L
                
      was (Author: steeve):
    Please note that this patch doesn't work for Pig 0.9, it doesn't like the {{AS ();}}.

{{2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L}}
                  

        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116855#comment-13116855 ] 

Hudson commented on CASSANDRA-2777:
-----------------------------------

Integrated in Cassandra-0.8 #348 (See [https://builds.apache.org/job/Cassandra-0.8/348/])
    Pig storage handler implements LoadMetadata interface.
Patch by brandonwilliams, reviewed by Jeremy Hanna for CASSANDRA-2777

brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1177083
Files : 
* /cassandra/branches/cassandra-0.8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java

                

        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053666#comment-13053666 ] 

Jonathan Ellis commented on CASSANDRA-2777:
-------------------------------------------

Is that +1 otherwise, Jeremy?


        

[jira] [Updated] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2777:
----------------------------------------

    Attachment: 2777-v2.txt

v2 rebased.

> Pig storage handler should implement LoadMetadata
> -------------------------------------------------
>
>                 Key: CASSANDRA-2777
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Contrib
>            Reporter: Brandon Williams
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 0.7.9
>
>         Attachments: 2777-v2.txt, 2777.txt
>
>
> The reason for this is many builtin functions like SUM won't work on longs (you can workaround using LongSum, but that's lame) because the query planner doesn't know about the types beforehand, even though we are casting to native longs.
> There is some impact to this, though.  With LoadMetadata implemented, existing scripts that specify schema will need to remove it (since LM is doing it for them) and they will need to conform to LM's terminology (key, columns, name, value) within the script.  This is trivial to change, however, and the increased functionality is worth the switch.


        

[jira] [Issue Comment Edited] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Steeve Morin (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119279#comment-13119279 ] 

Steeve Morin edited comment on CASSANDRA-2777 at 10/3/11 12:55 PM:
-------------------------------------------------------------------

Please note that this patch doesn't work for Pig 0.9; it doesn't like the {{AS ();}}.

{{2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L}}
                
      was (Author: steeve):
    Please note that this patch doesn't work for Pig 0.9, it doesn't like the {{AS ();}}.
{{2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L}}
                  

        

[jira] [Issue Comment Edited] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Steeve Morin (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116819#comment-13116819 ] 

Steeve Morin edited comment on CASSANDRA-2777 at 10/3/11 12:56 PM:
-------------------------------------------------------------------

Fixed it for me on Pig -0.9- 0.8.3 and Cassandra 0.8.6 (Brisk).

Pig 0.9 complains:
2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L
                
      was (Author: steeve):
    Fixed it for me on Pig -0.9- 0.8.3 and Cassandra 0.8.6 (Brisk).

Pig 0.9 complains:
{{2011-10-03 14:41:21,033 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file test.pig, line 8, column 78>  mismatched input ')' expecting IDENTIFIER_L}}
                  

        

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Steeve Morin (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116819#comment-13116819 ] 

Steeve Morin commented on CASSANDRA-2777:
-----------------------------------------

Fixed it for me on Pig 0.9 and Cassandra 0.8.6 (Brisk).
                


[jira] [Updated] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-2777:
----------------------------------------

    Fix Version/s: 0.7.7



[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

Posted by "Jeremy Hanna (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116824#comment-13116824 ] 

Jeremy Hanna commented on CASSANDRA-2777:
-----------------------------------------

+1 - if we find any issues with it in production, we'll submit bug reports.
                
