Posted to dev@hive.apache.org by "Raghotham Murthy (JIRA)" <ji...@apache.org> on 2008/12/02 01:55:44 UTC

[jira] Created: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

dynamic serde does not handle '_' as the first character of a column name in a DDL
----------------------------------------------------------------------------------

                 Key: HIVE-93
                 URL: https://issues.apache.org/jira/browse/HIVE-93
             Project: Hadoop Hive
          Issue Type: Bug
    Affects Versions: 0.20.0
            Reporter: Raghotham Murthy
            Priority: Minor
             Fix For: 0.20.0


For example, a dynamic serde cannot be initialized with: struct result { string _c0}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652577#action_12652577 ] 

Ashish Thusoo commented on HIVE-93:
-----------------------------------

It seems that these rules only allow an identifier to start with a letter or an '_', while Hive.g also allows digits. Can we change it to allow identifiers that start with a digit as well?

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Updated: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-93:
------------------------------

    Component/s: Query Processor

categorizing..

I am not sure this is a bug. All our identifiers are restricted so that they cannot start with _, which lets us handle the character set names. So any column name or table name beginning with an _ would lead to a parse error. Is that what you are getting? Can you upload the error string and then mark this as an improvement rather than a bug?

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Priority: Minor
>             Fix For: 0.20.0
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652260#action_12652260 ] 

Pete Wyckoff commented on HIVE-93:
----------------------------------

Incidentally, this is the offending code:

{code}
--- src/contrib/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/thrift_grammar.jjt       (revision 132500)
+++ src/contrib/hive/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/thrift_grammar.jjt       (working copy)
@@ -163,7 +163,7 @@
 |
 <tok_double_constant:   ["+","-"](<DIGIT>)*"."(<DIGIT>)+(["e","E"](["+","-"])?(<DIGIT>)+)?>
 |
-<IDENTIFIER: <LETTER>(<LETTER>|<DIGIT>|"."|"_")*>
+<IDENTIFIER: (<LETTER>|<DIGIT>|"."|"_")*>
 |
 <#LETTER: (["a"-"z", "A"-"Z" ]) >
 |

{code}
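
A rough way to see what the change does, as a sketch only: the Java snippet below models the token rules from the diff as regular expressions and checks a few names against them. It assumes <DIGIT> means [0-9] (its definition is not part of the hunk above), and the class and field names are made up for illustration. A regex full match looks at one rule in isolation, so it ignores how the generated lexer picks between competing tokens. The third pattern is the "letter or '_' first" variant mentioned elsewhere in this thread, not a line taken from the diff.

```java
import java.util.regex.Pattern;

/** Regex models of the IDENTIFIER token rules discussed in this thread (illustrative only). */
final class ThriftIdentifierRules {
    // '-' line of the diff: <LETTER>(<LETTER>|<DIGIT>|"."|"_")*
    static final Pattern ORIGINAL = Pattern.compile("[a-zA-Z][a-zA-Z0-9._]*");

    // '+' line of the diff as posted: (<LETTER>|<DIGIT>|"."|"_")*
    // Note that this form also matches the empty string and names starting with a digit or '.'.
    static final Pattern DIFF_AS_POSTED = Pattern.compile("[a-zA-Z0-9._]*");

    // Variant where the first character must be a letter or '_' (assumed form).
    static final Pattern LETTER_OR_UNDERSCORE_FIRST = Pattern.compile("[a-zA-Z_][a-zA-Z0-9._]*");

    /** Returns true when the whole name matches the given rule. */
    static boolean accepts(Pattern rule, String name) {
        return rule.matcher(name).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"c0", "_c0", "0c"}) {
            System.out.printf("%s: original=%b, posted=%b, letterOrUnderscoreFirst=%b%n",
                name,
                accepts(ORIGINAL, name),
                accepts(DIFF_AS_POSTED, name),
                accepts(LETTER_OR_UNDERSCORE_FIRST, name));
        }
    }
}
```

Under these assumptions, _c0 is rejected only by the original rule, which matches the failure described in the issue.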

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Priority: Minor
>             Fix For: 0.20.0
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653017#action_12653017 ] 

Zheng Shao commented on HIVE-93:
--------------------------------

Ashish: Hive.g allows Letter/Digit but not '_'. Is that what we want? Let's decide what we want here (so we know which patch to take). We can change Hive.g in another jira if needed.

Identifier
    :
    (Letter | Digit) (Letter | Digit | '_')*
    | '`' (Letter | Digit) (Letter | Digit | '_')* '`'
    ;
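
For comparison, the Hive.g rule quoted above can be approximated the same way. This is only a sketch: it assumes Letter and Digit are the ASCII ranges a-z, A-Z and 0-9 (their definitions are not quoted here), and the class name is hypothetical.

```java
import java.util.regex.Pattern;

/** Regex approximation of the Hive.g Identifier rule quoted above (illustrative only). */
final class HiveGIdentifierRule {
    // (Letter | Digit) (Letter | Digit | '_')*, either bare or wrapped in backquotes.
    static final Pattern IDENTIFIER =
        Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_]*|`[a-zA-Z0-9][a-zA-Z0-9_]*`");

    /** Returns true when the whole string matches the approximated rule. */
    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"c0", "0c", "_c0", "`c0`", "`_c0`"}) {
            System.out.println(name + " -> " + isIdentifier(name));
        }
    }
}
```

With this approximation a leading digit is accepted but a leading '_' is not, even inside backquotes, which is the mismatch with the thrift grammar being discussed.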



> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652598#action_12652598 ] 

Ashish Thusoo commented on HIVE-93:
-----------------------------------

+1

 Looks good.

(Raghu pointed out that in most SQL dialects identifiers start with a non-numeric character, but we should be fine with this.)

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652565#action_12652565 ] 

Raghotham Murthy commented on HIVE-93:
--------------------------------------

+1

Looks good. I also made sure that the failing test in service passes with this fix.

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652908#action_12652908 ] 

Pete Wyckoff commented on HIVE-93:
----------------------------------

Agreed, but then we should open another JIRA to change Hive.g, no?

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Updated: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-93:
---------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

HIVE-93. Dynamic serde to handle _-prefixed column names in DDL. (Pete Wyckoff through zshao)

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654649#action_12654649 ] 

Zheng Shao commented on HIVE-93:
--------------------------------

Committed revision 724555.
HIVE-93.txt 2008-12-02 03:14 PM Pete Wyckoff 33 kb 

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653023#action_12653023 ] 

Ashish Thusoo commented on HIVE-93:
-----------------------------------

Let's do that in a separate JIRA.


> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Updated: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-93:
-------------------------------

    Fix Version/s: 0.3.0
                       (was: 0.6.0)

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.3.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652671#action_12652671 ] 

Zheng Shao commented on HIVE-93:
--------------------------------

Allowing identifiers to start with a digit will introduce ambiguity into the language if we later allow 1e10 to be a number.

I think the old patch that does not allow numbers is better. What do you think?
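
To make the concern concrete, here is a small sketch of the overlap. The exponent-number pattern is hypothetical, since the thread only says such literals might be allowed later, and both identifier patterns and all names are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

/** Shows how a digit-leading identifier rule overlaps with a literal like 1e10 (illustrative only). */
final class DigitLeadingAmbiguity {
    // Hypothetical future numeric literal such as 1e10; not part of any grammar quoted in this thread.
    static final Pattern EXPONENT_NUMBER = Pattern.compile("[0-9]+[eE][+-]?[0-9]+");

    // Identifier rule that allows a digit as the first character (assumed form).
    static final Pattern DIGIT_FIRST_ALLOWED = Pattern.compile("[a-zA-Z0-9_][a-zA-Z0-9._]*");

    // Identifier rule that allows only a letter or '_' as the first character (assumed form).
    static final Pattern DIGIT_FIRST_FORBIDDEN = Pattern.compile("[a-zA-Z_][a-zA-Z0-9._]*");

    /** Returns true when the text is both a valid identifier and a valid exponent number. */
    static boolean isAmbiguous(Pattern identifierRule, String text) {
        return identifierRule.matcher(text).matches()
            && EXPONENT_NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println("digit-first allowed:   " + isAmbiguous(DIGIT_FIRST_ALLOWED, "1e10"));
        System.out.println("digit-first forbidden: " + isAmbiguous(DIGIT_FIRST_FORBIDDEN, "1e10"));
    }
}
```

When a digit may start an identifier, 1e10 fits both patterns and the lexer would need a tie-breaking rule. When identifiers must start with a letter or '_', the overlap does not arise for this input.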


> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Commented: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652259#action_12652259 ] 

Pete Wyckoff commented on HIVE-93:
----------------------------------

From the point of view of being able to handle any Thrift DDL, we should probably fix this, since Thrift allows it. But, as Ashish points out, we may not need to expose it.


> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Priority: Minor
>             Fix For: 0.20.0
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Updated: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HIVE-93:
-----------------------------

    Attachment: HIVE-93.txt

This is the change I put in a previous comment, plus a change to one of the test cases to name a variable _hello. Everything else is generated code.


> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Updated: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HIVE-93:
-----------------------------

    Attachment: HIVE-93.txt

The new patch also allows identifiers to start with a digit.


> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt, HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Updated: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HIVE-93:
-----------------------------

    Component/s:     (was: Query Processor)
                 Serializers/Deserializers
       Assignee: Pete Wyckoff

> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}



[jira] Updated: (HIVE-93) dynamic serde does not handle '_' as the first character of a column name in a DDL

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pete Wyckoff updated HIVE-93:
-----------------------------

    Status: Patch Available  (was: Open)

ant test passes.


> dynamic serde does not handle '_' as the first character of a column name in a DDL
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-93
>                 URL: https://issues.apache.org/jira/browse/HIVE-93
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.20.0
>            Reporter: Raghotham Murthy
>            Assignee: Pete Wyckoff
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: HIVE-93.txt
>
>
> For example cannot initialize a dynamic serde with : struct result { string _c0}
