Posted to dev@tajo.apache.org by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2016/03/29 00:37:59 UTC

Re: Tajo - hive ORC issue

Hi Sahir,

Copying both dev lists.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: Sahir Ahmed <sa...@truecaller.com>
Date: Thursday, March 24, 2016 at 2:18 AM
To: "charsyam@naver.com" <ch...@naver.com>, jpluser
<ch...@jpl.nasa.gov>, "hyunsik@apache.org"
<hy...@apache.org>, "jhkim@apache.org" <jh...@apache.org>
Subject: Tajo - hive ORC issue

>Hi Guys,
>
>
>I tried the issues mailing list but got an error mail back, so I am
>writing to you directly.
>
>
>
>
>A colleague and I tried integrating Tajo with Hive, which works fine on
>some tables. But when we query a table stored as ORC, we get the
>following error message:
>
>
>ERROR: internal error:
>org.apache.tajo.exception.UnknownDataFormatException: unknown data
>format: 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
>
>
>Could a JIRA ticket be opened for this issue?
>
>
>Kind regards,
>Sahir
>


Re: Tajo - hive ORC issue

Posted by Jihoon Son <ji...@apache.org>.
Sounds good! I will ping you as soon as we are ready.

Cheers,
Jihoon

On Wed, Mar 30, 2016 at 12:00 AM, Sahir Ahmed <sa...@truecaller.com> wrote:

> Thanks Jihoon! I have already subscribed. I'm really looking forward to that
> version, because part of my thesis project is to find a suitable low-latency
> data warehouse product to support Hive, and maybe Tajo can be Truecaller's
> answer ;)
>
> Best,
>
> Sahir
>
> ------------------------------
> *From:* Jihoon Son <ji...@apache.org>
> *Sent:* March 29, 2016 12:53
> *To:* Sahir Ahmed; dev@tajo.apache.org; dev@hive.apache.org
>
> *Subject:* Re: Tajo - hive ORC issue
> Sahir, we don't support array and struct types yet, but this is one of our
> highest-priority issues for the 0.12.0 release.
> Hyunsik has already started the implementation, and I expect it to be finished
> in April.
>
> Also, please subscribe to the mailing lists of Tajo (
> http://tajo.apache.org/mailing-lists.html) and Hive (
> http://hive.apache.org/mailing_lists.html). Other contributors in both
> communities cannot receive your mail if you don't subscribe.
>
> Regards,
> Jihoon
>
> On Tue, Mar 29, 2016 at 5:16 PM, Sahir Ahmed <sa...@truecaller.com> wrote:
>
>> Thanks for the reply, guys. I'll have a look. Also, does Tajo support arrays
>> and structs in tables?
>>
>>
>> Kind regards,
>>
>> Sahir
>>
>>
>> ------------------------------
>> *From:* Jihoon Son <ji...@apache.org>
>> *Sent:* March 29, 2016 09:51
>> *To:* dev@tajo.apache.org; dev@hive.apache.org; Sahir Ahmed
>> *Subject:* Re: Tajo - hive ORC issue
>>
>> Gopal, thanks for your kind reply.
>>
>> Sahir, as Hyunsik and Gopal said, it was a dependency problem and has already
>> been fixed in the Tajo trunk and 0.11.2 branches.
>>
>> I expect 0.11.2 to be released in the next few weeks.
>> However, if you want to test before our release, you can build from source
>> yourself, or simply download our nightly build (
>> https://builds.apache.org/job/Tajo-0.11.2-nightly/lastSuccessfulBuild/artifact/tajo-dist/target/
>> ).
>>
>> I hope this will be helpful to you.
>>
>> Regards,
>> Jihoon
>>
>> On Tue, Mar 29, 2016 at 2:29 PM, Gopal Vijayaraghavan <go...@apache.org> wrote:
>>
>>>
>>> >>A colleague and I tried integrating Tajo with Hive, which works fine on
>>> >>some tables. But when we query a table stored as ORC, we get the
>>> >>following error message:
>>> >>
>>> >>
>>> >>ERROR: internal error:
>>> >>org.apache.tajo.exception.UnknownDataFormatException: unknown data
>>> >>format: 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
>>>
>>> You might want to check whether the Tajo build you're on is using the Facebook
>>> DWRF fork or Apache ORC.
>>>
>>> I'm guessing this is the boundary ticket -
>>> https://issues.apache.org/jira/browse/TAJO-2102
>>>
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>

Re: Tajo - hive ORC issue

Posted by Jihoon Son <ji...@apache.org>.
Sahir, we don't support array and struct types yet, but this is one of our
highest-priority issues for the 0.12.0 release.
Hyunsik has already started the implementation, and I expect it to be finished in
April.

Also, please subscribe to the mailing lists of Tajo (
http://tajo.apache.org/mailing-lists.html) and Hive (
http://hive.apache.org/mailing_lists.html). Other contributors in both
communities cannot receive your mail if you don't subscribe.

Regards,
Jihoon

On Tue, Mar 29, 2016 at 5:16 PM, Sahir Ahmed <sa...@truecaller.com> wrote:

> Thanks for the reply, guys. I'll have a look. Also, does Tajo support arrays
> and structs in tables?
>
>
> Kind regards,
>
> Sahir
>
>
> ------------------------------
> *From:* Jihoon Son <ji...@apache.org>
> *Sent:* March 29, 2016 09:51
> *To:* dev@tajo.apache.org; dev@hive.apache.org; Sahir Ahmed
> *Subject:* Re: Tajo - hive ORC issue
>
> Gopal, thanks for your kind reply.
>
> Sahir, as Hyunsik and Gopal said, it was a dependency problem and has already
> been fixed in the Tajo trunk and 0.11.2 branches.
>
> I expect 0.11.2 to be released in the next few weeks.
> However, if you want to test before our release, you can build from source
> yourself, or simply download our nightly build (
> https://builds.apache.org/job/Tajo-0.11.2-nightly/lastSuccessfulBuild/artifact/tajo-dist/target/
> ).
>
> I hope this will be helpful to you.
>
> Regards,
> Jihoon
>
> On Tue, Mar 29, 2016 at 2:29 PM, Gopal Vijayaraghavan <go...@apache.org> wrote:
>
>>
>> >>A colleague and I tried integrating Tajo with Hive, which works fine on
>> >>some tables. But when we query a table stored as ORC, we get the
>> >>following error message:
>> >>
>> >>
>> >>ERROR: internal error:
>> >>org.apache.tajo.exception.UnknownDataFormatException: unknown data
>> >>format: 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
>>
>> You might want to check whether the Tajo build you're on is using the Facebook
>> DWRF fork or Apache ORC.
>>
>> I'm guessing this is the boundary ticket -
>> https://issues.apache.org/jira/browse/TAJO-2102
>>
>>
>> Cheers,
>> Gopal
>>
>>
>>

Re: Tajo - hive ORC issue

Posted by Sahir Ahmed <sa...@truecaller.com>.
Thanks for the reply, guys. I'll have a look. Also, does Tajo support arrays and structs in tables?


Kind regards,

Sahir


________________________________
From: Jihoon Son <ji...@apache.org>
Sent: March 29, 2016 09:51
To: dev@tajo.apache.org; dev@hive.apache.org; Sahir Ahmed
Subject: Re: Tajo - hive ORC issue

Gopal, thanks for your kind reply.

Sahir, as Hyunsik and Gopal said, it was a dependency problem and has already been fixed in the Tajo trunk and 0.11.2 branches.

I expect 0.11.2 to be released in the next few weeks.
However, if you want to test before our release, you can build from source yourself, or simply download our nightly build (https://builds.apache.org/job/Tajo-0.11.2-nightly/lastSuccessfulBuild/artifact/tajo-dist/target/).

I hope this will be helpful to you.

Regards,
Jihoon

On Tue, Mar 29, 2016 at 2:29 PM, Gopal Vijayaraghavan <go...@apache.org> wrote:

>>A colleague and I tried integrating Tajo with Hive, which works fine on
>>some tables. But when we query a table stored as ORC, we get the
>>following error message:
>>
>>
>>ERROR: internal error:
>>org.apache.tajo.exception.UnknownDataFormatException: unknown data
>>format: 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'

You might want to check whether the Tajo build you're on is using the Facebook
DWRF fork or Apache ORC.

I'm guessing this is the boundary ticket -
https://issues.apache.org/jira/browse/TAJO-2102


Cheers,
Gopal



Re: Tajo - hive ORC issue

Posted by Jihoon Son <ji...@apache.org>.
Gopal, thanks for your kind reply.

Sahir, as Hyunsik and Gopal said, it was a dependency problem and has already
been fixed in the Tajo trunk and 0.11.2 branches.

I expect 0.11.2 to be released in the next few weeks.
However, if you want to test before our release, you can build from source
yourself, or simply download our nightly build (
https://builds.apache.org/job/Tajo-0.11.2-nightly/lastSuccessfulBuild/artifact/tajo-dist/target/
).

I hope this will be helpful to you.

Regards,
Jihoon

On Tue, Mar 29, 2016 at 2:29 PM, Gopal Vijayaraghavan <go...@apache.org> wrote:

>
> >>A colleague and I tried integrating Tajo with Hive, which works fine on
> >>some tables. But when we query a table stored as ORC, we get the
> >>following error message:
> >>
> >>
> >>ERROR: internal error:
> >>org.apache.tajo.exception.UnknownDataFormatException: unknown data
> >>format: 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
>
> You might want to check whether the Tajo build you're on is using the Facebook
> DWRF fork or Apache ORC.
>
> I'm guessing this is the boundary ticket -
> https://issues.apache.org/jira/browse/TAJO-2102
>
>
> Cheers,
> Gopal
>
>
>

Re: Tajo - hive ORC issue

Posted by Gopal Vijayaraghavan <go...@apache.org>.
>>A colleague and I tried integrating Tajo with Hive, which works fine on
>>some tables. But when we query a table stored as ORC, we get the
>>following error message:
>>
>>
>>ERROR: internal error:
>>org.apache.tajo.exception.UnknownDataFormatException: unknown data
>>format: 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'

You might want to check whether the Tajo build you're on is using the Facebook
DWRF fork or Apache ORC.

I'm guessing this is the boundary ticket -
https://issues.apache.org/jira/browse/TAJO-2102


Cheers,
Gopal
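
For readers wondering how an error of this shape can arise: the thread does not show where the exception is thrown, and it only says the root cause was a dependency problem. The following is a rough sketch, not Tajo's actual code, of a catalog bridge that translates Hive InputFormat class names into an engine's own storage format names and fails when a class is not registered. The mapping contents, the format names, and the names HIVE_INPUT_FORMATS, resolve_data_format and UnknownDataFormatError are all hypothetical.

```python
# Hypothetical sketch, NOT Tajo's implementation: it only illustrates how a
# lookup from Hive InputFormat class names to engine format names can fail.

# Assumed mapping; a real engine's table would differ.
HIVE_INPUT_FORMATS = {
    "org.apache.hadoop.mapred.TextInputFormat": "TEXT",
    "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat": "ORC",
}


class UnknownDataFormatError(Exception):
    """Raised when an InputFormat class has no registered format name."""


def resolve_data_format(input_format_class, known_formats=None):
    """Return the engine-side format name for a Hive InputFormat class."""
    if known_formats is None:
        known_formats = HIVE_INPUT_FORMATS
    try:
        return known_formats[input_format_class]
    except KeyError:
        raise UnknownDataFormatError(
            f"unknown data format: '{input_format_class}'"
        ) from None


if __name__ == "__main__":
    orc = "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"
    # With the ORC class registered, the lookup succeeds.
    print(resolve_data_format(orc))
    # A build whose mapping lacks the ORC class reports the error instead.
    try:
        resolve_data_format(orc, known_formats={})
    except UnknownDataFormatError as err:
        print("ERROR:", err)
```

In a setup like this, the same table metadata works or fails depending only on which class names the running build recognizes, which is consistent with Gopal's suggestion to check which ORC implementation the build uses.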


