Posted to dev@orc.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2016/12/12 22:24:58 UTC

[jira] [Created] (ORC-120) Create a backwards compatibility mode of ignoring names for evolution

Owen O'Malley created ORC-120:
---------------------------------

             Summary: Create a backwards compatibility mode of ignoring names for evolution
                 Key: ORC-120
                 URL: https://issues.apache.org/jira/browse/ORC-120
             Project: Orc
          Issue Type: Task
            Reporter: Owen O'Malley


ORC's schema evolution uses the column names when they are available. Hive 2.1 uses a positional schema, so ORC should support a backward compatibility mode for Hive users during the transition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: [jira] [Created] (ORC-120) Create a backwards compatibility mode of ignoring names for evolution

Posted by Dain Sundstrom <da...@iq80.com>.
Oh, I see this is an ORC feature like the Parquet schema evolution stuff.  We implemented support for ordering by the top level struct names in Presto a while back.

-dain

> On Dec 12, 2016, at 4:44 PM, Owen O'Malley <om...@apache.org> wrote:
> 
> On Mon, Dec 12, 2016 at 4:42 PM, Dain Sundstrom <da...@iq80.com> wrote:
> 
>> So, rename column is not expected to work anymore?
> 
> 
> ORC-120 will add an option to force positional mapping.
> 
> .. Owen


Re: [jira] [Created] (ORC-120) Create a backwards compatibility mode of ignoring names for evolution

Posted by Owen O'Malley <om...@apache.org>.
On Mon, Dec 12, 2016 at 4:42 PM, Dain Sundstrom <da...@iq80.com> wrote:

> So, rename column is not expected to work anymore?


ORC-120 will add an option to force positional mapping.

.. Owen

Re: [jira] [Created] (ORC-120) Create a backwards compatibility mode of ignoring names for evolution

Posted by Dain Sundstrom <da...@iq80.com>.
So, rename column is not expected to work anymore?

-dain

> On Dec 12, 2016, at 4:14 PM, Owen O'Malley <om...@apache.org> wrote:
> 
> No, unfortunately, but it needs to be. The basic rules from
> SchemaEvolution.java look like:
> 
> structs (including the top row):
>   if field names are available (post HIVE-4243), use name matching
>   otherwise use positional matching
> 
> lists, maps, unions:
>  children must match
> 
> Many primitives can convert to each other, but this list needs to be
> cleaned up:
>  boolean, byte, short, int, long, float, double, decimal -> boolean, byte,
> short, int, long, float, double, decimal, string, char, varchar, timestamp
>  string, char, varchar -> all
>  timestamp -> boolean, byte, short, int, long, float, double, decimal,
> string, char, varchar, date
>  date -> string, char, varchar, timestamp
>  binary -> string, char, varchar, date
> 
> .. Owen


Re: [jira] [Created] (ORC-120) Create a backwards compatibility mode of ignoring names for evolution

Posted by Owen O'Malley <om...@apache.org>.
No, unfortunately it is not documented, but it needs to be. The basic rules
from SchemaEvolution.java look like:

structs (including the top row):
   if field names are available (post HIVE-4243), use name matching
   otherwise use positional matching

lists, maps, unions:
  children must match

Many primitives can convert to each other, but this list needs to be
cleaned up:
  boolean, byte, short, int, long, float, double, decimal ->
      boolean, byte, short, int, long, float, double, decimal,
      string, char, varchar, timestamp
  string, char, varchar -> all
  timestamp ->
      boolean, byte, short, int, long, float, double, decimal,
      string, char, varchar, date
  date -> string, char, varchar, timestamp
  binary -> string, char, varchar, date
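
The conversion list above can be transcribed into a small lookup table. The following Java sketch is only an illustration under stated assumptions: the class ConversionSketch, the Kind enum, and canConvert are hypothetical names, "all" is taken to mean every primitive named in the list, and a type is assumed to always convert to itself. The entries are copied as given, including those the list itself says still need cleaning up.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Hypothetical transcription of the conversion list; not the ORC implementation.
final class ConversionSketch {

    enum Kind { BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DECIMAL,
                STRING, CHAR, VARCHAR, TIMESTAMP, DATE, BINARY }

    // boolean .. decimal, in declaration order
    static final EnumSet<Kind> NUMERIC = EnumSet.range(Kind.BOOLEAN, Kind.DECIMAL);
    static final EnumSet<Kind> TEXT = EnumSet.of(Kind.STRING, Kind.CHAR, Kind.VARCHAR);
    static final Map<Kind, EnumSet<Kind>> ALLOWED = new EnumMap<>(Kind.class);

    static {
        EnumSet<Kind> fromNumeric = union(NUMERIC, TEXT, EnumSet.of(Kind.TIMESTAMP));
        for (Kind k : NUMERIC) {
            ALLOWED.put(k, fromNumeric);
        }
        for (Kind k : TEXT) {
            ALLOWED.put(k, EnumSet.allOf(Kind.class)); // assumption: "all" = every Kind
        }
        ALLOWED.put(Kind.TIMESTAMP, union(NUMERIC, TEXT, EnumSet.of(Kind.DATE)));
        ALLOWED.put(Kind.DATE, union(TEXT, EnumSet.of(Kind.TIMESTAMP)));
        ALLOWED.put(Kind.BINARY, union(TEXT, EnumSet.of(Kind.DATE)));
    }

    @SafeVarargs
    static EnumSet<Kind> union(EnumSet<Kind>... sets) {
        EnumSet<Kind> out = EnumSet.noneOf(Kind.class);
        for (EnumSet<Kind> s : sets) {
            out.addAll(s);
        }
        return out;
    }

    /** True if a column stored as fileType may be read as readerType. */
    static boolean canConvert(Kind fileType, Kind readerType) {
        // Assumption: identical types are always compatible.
        return fileType == readerType || ALLOWED.get(fileType).contains(readerType);
    }

    public static void main(String[] args) {
        System.out.println(canConvert(Kind.INT, Kind.STRING));    // true
        System.out.println(canConvert(Kind.DATE, Kind.INT));      // false
        System.out.println(canConvert(Kind.STRING, Kind.BINARY)); // true
    }
}
```

Keeping the rules in one table like this makes gaps easy to spot, for example that the numeric types list timestamp but not date as a target.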

.. Owen

On Mon, Dec 12, 2016 at 2:55 PM, Dain Sundstrom <da...@iq80.com> wrote:

> Is "ORC's schema evolution uses the column names when they are available"
> documented somewhere?
>
> -dain

Re: [jira] [Created] (ORC-120) Create a backwards compatibility mode of ignoring names for evolution

Posted by Dain Sundstrom <da...@iq80.com>.
Is "ORC's schema evolution uses the column names when they are available" documented somewhere?

-dain

> On Dec 12, 2016, at 2:24 PM, Owen O'Malley (JIRA) <ji...@apache.org> wrote:
> 
> Owen O'Malley created ORC-120:
> ---------------------------------
> 
>             Summary: Create a backwards compatibility mode of ignoring names for evolution
>                 Key: ORC-120
>                 URL: https://issues.apache.org/jira/browse/ORC-120
>             Project: Orc
>          Issue Type: Task
>            Reporter: Owen O'Malley
> 
> 
> ORC's schema evolution uses the column names when they are available. Hive 2.1 uses a positional schema, so ORC should support a backward compatibility mode for Hive users during the transition.