Posted to issues@orc.apache.org by "noirello (Jira)" <ji...@apache.org> on 2021/11/05 09:48:00 UTC

[jira] [Updated] (ORC-1047) [C++] Handle quoted field names during string schema parsing

     [ https://issues.apache.org/jira/browse/ORC-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

noirello updated ORC-1047:
--------------------------
    Description: 
The current implementation of _Type::buildTypeFromString_ cannot parse string schemas that contain quoted field names. The following call raises a logic error with the message "Unrecognized character.":
{code:cpp}
auto schema = Type::buildTypeFromString("struct<`quoted.field`:string>"); // Fails
{code}
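A minimal sketch of how a backtick-quoted field name could be read during parsing. It is not the ORC C++ code or the actual fix: _parseFieldName_ is a hypothetical helper, and the escaping rule (a doubled backtick stands for a literal backtick) and the set of characters allowed in unquoted names are assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the ORC C++ API. Reads one struct field
// name from `input` starting at `pos` and advances `pos` past the name.
// Assumptions: quoted names are wrapped in backticks, and a doubled backtick
// inside a quoted name stands for one literal backtick.
std::string parseFieldName(const std::string& input, std::size_t& pos) {
  std::string name;
  if (pos < input.size() && input[pos] == '`') {
    ++pos;  // skip the opening backtick
    bool closed = false;
    while (pos < input.size()) {
      const char c = input[pos++];
      if (c != '`') {
        name.push_back(c);  // ordinary character, including '.'
      } else if (pos < input.size() && input[pos] == '`') {
        name.push_back('`');  // doubled backtick: keep one, skip the other
        ++pos;
      } else {
        closed = true;  // single backtick ends the quoted name
        break;
      }
    }
    if (!closed) {
      throw std::logic_error("Unterminated quoted field name.");
    }
  } else {
    // Unquoted names: letters, digits and underscores (an assumption).
    while (pos < input.size() &&
           (std::isalnum(static_cast<unsigned char>(input[pos])) ||
            input[pos] == '_')) {
      name.push_back(input[pos++]);
    }
  }
  if (name.empty()) {
    throw std::logic_error("Missing field name.");
  }
  return name;
}
```

With this helper, the caller would find the ':' separator at the returned position and continue with the field's type. An empty name is rejected, so a field written without a name is reported as an error.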
Besides that, two other limitations have been encountered:
 * A string schema whose root is only a _timestamp with local time zone_ type cannot be parsed.
{code:cpp}
Type::buildTypeFromString("timestamp with local time zone"); // Fails
{code}

 * Struct types can be created without field names, which (based on the Java implementation) should not be a valid ORC schema.
{code:cpp}
Type::buildTypeFromString("struct<struct<bigint>>"); // Works
{code}
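
For the multi-word type name, one possible approach is to match type keywords longest first, so that the shorter "timestamp" never shadows "timestamp with local time zone". The sketch below only illustrates that idea: _matchTypeKeyword_ and its short keyword list are hypothetical, and it does not describe how the ORC C++ parser is actually structured.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the ORC C++ API. Returns the type keyword
// found at input[pos...], or an empty string if none of the keywords match.
// The list is deliberately incomplete and ordered longest first.
std::string matchTypeKeyword(const std::string& input, std::size_t pos) {
  static const std::array<const char*, 3> keywords = {
      "timestamp with local time zone", "timestamp", "string"};
  if (pos > input.size()) {
    return "";  // avoid std::out_of_range from compare()
  }
  for (const char* candidate : keywords) {
    const std::string keyword(candidate);
    // compare() clamps the length, so a short remainder simply fails to match.
    if (input.compare(pos, keyword.size(), keyword) == 0) {
      return keyword;
    }
  }
  return "";
}
```

Because the full input is compared against the keyword, including the spaces, the same lookup works whether the type appears at the root or inside a struct.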

  was:
The current implementation of _Type::buildTypeFromString_ cannot handle string schemas with quoted field names. The following code will raise a logic error of "Unrecognized character.":
{code:java}
auto schema = Type::buildTypeFromString("struct<`quoted.field`:string>");
{code}
Besides that, two other limitations have been encountered:
 * Cannot parse a string schema that only has a _timestamp with local time zone_ type in root.
{code:java}
Type::buildTypeFromString("timestamp with local time zone");
{code}

 * It allows to create struct types without setting a field name, which (based on the Java implementation) should not be a valid ORC schema.
{code:java}
Type::buildTypeFromString("struct<struct<bigint>>");
{code}


> [C++] Handle quoted field names during string schema parsing
> ------------------------------------------------------------
>
>                 Key: ORC-1047
>                 URL: https://issues.apache.org/jira/browse/ORC-1047
>             Project: ORC
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 1.7.0, 1.8.0, 1.6.11
>            Reporter: noirello
>            Assignee: noirello
>            Priority: Major


