Posted to user@hive.apache.org by Biswajit Nayak <bi...@altiscale.com> on 2016/03/01 05:50:44 UTC

Hive Cli ORC table read error with limit option

Hi All,

I am trying to run a simple SELECT query with a LIMIT option, and it fails.
Below are the details.

Versions:-

Hadoop : 2.7.1
Hive   : 1.2.0
Sqoop  : 1.4.5


Query:-

The table table_orc is partitioned by year, month, and day, and is stored
as ORC.

hive> select date from testdb.table_orc where year = 2016 and month = 1
and day = 29 limit 10;
OK
Failed with exception java.io.IOException:java.lang.RuntimeException:
serious problem
Time taken: 0.32 seconds
hive>


The same query without the LIMIT clause (select date from testdb.table_orc
where year = 2016 and month = 1 and day = 29;) works perfectly. Even
count(*) or select * works perfectly fine.


Has anyone faced this issue?

Regards
Biswa

Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Thanks Prasanth for the update. I will test it and post the outcome here.

Thanks
Biswa


Re: Hive Cli ORC table read error with limit option

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Hi Biswajit

You might need patch from https://issues.apache.org/jira/browse/HIVE-11546

Can you apply this patch to your hive build and see if it solves the issue? (recommended)
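
Roughly, applying it against a source checkout of your Hive version looks
like the following. This is only a sketch; the patch file comes from the
JIRA attachment, and the exact checkout and build flags depend on your
distribution:

$ git apply HIVE-11546.patch
$ mvn clean package -DskipTests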

Alternatively, you can use “hive.exec.orc.split.strategy”=“BI” as a workaround.
Using this config is highly discouraged, as it disables split elimination
and may generate sub-optimal splits, resulting in less map-side parallelism.
It is provided only as a workaround and is suitable when all ORC files
are small (less than the stripe size or block size).
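
For example, scoping the workaround to a single session (rather than
putting it in hive-site.xml) would look like this, with the query taken
from the start of this thread:

hive> set hive.exec.orc.split.strategy=BI;
hive> select date from testdb.table_orc where year = 2016 and month = 1 and day = 29 limit 10;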

Thanks
Prasanth


Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Hi All,

I seriously need help on this issue. Any reference or pointer to
troubleshoot or fix it would be helpful.

Regards
Biswa


Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Prasanth,

Apologies for the delay in response.

Below is the orcfiledump of the empty orc file from a broken partition.

$ hive --orcfiledump /hive/testdb.db/table_orc/year=2016/month=1/day=29/000000_0
Structure for /hive/testdb.db/table_orc/year=2016/month=1/day=29/000000_0
File Version: 0.12 with HIVE_8732
16/03/25 17:49:09 INFO orc.ReaderImpl: Reading ORC rows from /hive/testdb.db/table_orc/year=2016/month=1/day=29/000000_0 with {include: null, offset: 0, length: 9223372036854775807}
16/03/25 17:49:09 INFO orc.RecordReaderFactory: Schema is not specified on read. Using file schema.
Rows: 0
Compression: SNAPPY
Compression size: 262144
Type: struct<>

Stripe Statistics:

File Statistics:
  Column 0: count: 0 hasNull: false

Stripes:

File length: 49 bytes
Padding length: 0 bytes
Padding ratio: 0%
$


I am still not able to figure out what's causing this odd behaviour.


Regards
Biswa


Re: Hive Cli ORC table read error with limit option

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Alternatively, you can send the orcfiledump output for the empty ORC file from a broken partition.
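
For example (the path is illustrative, based on your partition spec):

$ hive --orcfiledump /hive/testdb.db/table_orc/year=2016/month=1/day=29/000000_0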

Thanks
Prasanth

Re: Hive Cli ORC table read error with limit option

Posted by Prasanth Jayachandran <pj...@hortonworks.com>.
Could you attach the empty ORC files from one of the broken partitions somewhere? I can run some tests on them to see why it's happening.

Thanks
Prasanth


Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Both the parameters are set to false by default.

hive> set hive.optimize.index.filter;
hive.optimize.index.filter=false
hive> set hive.orc.splits.include.file.footer;
hive.orc.splits.include.file.footer=false
hive>

>>> I suspect this might be related to having 0 row files in the buckets
>>> not having any recorded schema.

Yes, there are a few files with 0 rows, but the query works with other
partitions that also have 0-row files. Out of 30 partitions (for a month),
3-4 partitions have this issue. Even reloading the data does not change
anything. The query works fine in MR now, but has the issue in Tez.




Re: Hive Cli ORC table read error with limit option

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> c                varchar(2)
...
> Num Buckets:         7

I suspect this might be related to having 0 row files in the buckets not
having any recorded schema.

You can also experiment with hive.optimize.index.filter=false, to see if
the zero row case is artificially produced via predicate push-down.
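
For example, for the current session only (query taken from the start of
the thread):

hive> set hive.optimize.index.filter=false;
hive> select date from testdb.table_orc where year = 2016 and month = 1 and day = 29 limit 10;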


That shouldn't be a problem unless you've turned on
hive.orc.splits.include.file.footer=true (recommended to be false).

Your row-locations don't actually match any Apache source jar in my
builds, are there any other patches to consider?

Cheers,
Gopal



Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Hi Gopal,


I have already pasted the table format in this thread, but will repeat it here.


hive> desc formatted testdb.table_orc;
OK
# col_name             data_type            comment

row_id                 bigint
a                      int
b                      int
c                      varchar(2)
d                      bigint
e                      int
f                      bigint
g                      float
h                      int
i                      int

# Partition Information
# col_name             data_type            comment

year                   int
month                  int
day                    int

# Detailed Table Information
Database:              testdb
Owner:                 *************
CreateTime:            Mon Jan 25 22:32:22 UTC 2016
LastAccessTime:        UNKNOWN
Protect Mode:          None
Retention:             0
Location:              hdfs://***************:8020/hive/testdb.db/table_orc
Table Type:            MANAGED_TABLE
Table Parameters:
	last_modified_by      **************
	last_modified_time    **************
	orc.compress          SNAPPY
	transient_lastDdlTime 1454104669

# Storage Information
SerDe Library:         org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:          org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:            No
Num Buckets:           7
Bucket Columns:        [f]
Sort Columns:          []
Storage Desc Params:
	field.delim           \t
	serialization.format  \t
Time taken: 0.105 seconds, Fetched: 46 row(s)
hive>


>>> Depends on whether any of those columns are partition columns or not
>>> & whether the table is marked transactional.

Yes, those columns are partition columns, and the table is not marked as
transactional.


>>> Usually that and a copy of --orcfiledump output to check the
>>> offsets/types.

There are around 10 files, so copying every orcfiledump would be a mess
here. Is there any way to find the defective file, so that I can isolate
it and copy only its orcfiledump here?
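
The only approach I can think of is a brute-force scan like this (a
sketch; it assumes each dump prints a "Rows:" line, as in the output
I pasted earlier):

for f in $(hdfs dfs -ls /hive/testdb.db/table_orc/year=2016/month=1/day=29/ | grep '^-' | awk '{print $NF}'); do
  echo "$f -> $(hive --orcfiledump "$f" 2>/dev/null | grep 'Rows:')"
done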

Thanks
Biswa


On Sat, Mar 5, 2016 at 12:21 AM, Gopal Vijayaraghavan <go...@apache.org>
wrote:

>
> > Any one has any idea about this.. Really stuck with this.
> ...
> > hive> select h from testdb.table_orc where year = 2016 and month =1 and
> >day >29 limit 10;
>
> Depends on whether any of those columns are paritition columns or not &
> whether the table is marked transactional.
>
> > Caused by: java.lang.IndexOutOfBoundsException: Index: 0
> > at java.util.Collections$EmptyList.get(Collections.java:3212)
> > at
> >org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:1
> >2240)
>
> If you need answers to rare problems, these emails need at least the table
> format ("desc formatted").
>
>
> Usually that and a copy of --orcfiledump output to check the offsets/types.
>
> Cheers,
> Gopal
>
>
>

Re: Hive Cli ORC table read error with limit option

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Anyone have any idea about this? I'm really stuck.
...
> hive> select h from testdb.table_orc where year = 2016 and month = 1 and
> day > 29 limit 10;

Depends on whether any of those columns are partition columns or not &
whether the table is marked transactional.

> Caused by: java.lang.IndexOutOfBoundsException: Index: 0
>   at java.util.Collections$EmptyList.get(Collections.java:3212)
>   at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)

If you need answers to rare problems, these emails need at least the table
format ("desc formatted").


Usually that and a copy of --orcfiledump output to check the offsets/types.
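
That is, something like this (the file path is a placeholder for any one
file in the affected partition):

hive> desc formatted testdb.table_orc;
$ hive --orcfiledump <one-orc-file-from-the-partition>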

Cheers,
Gopal



Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Anyone have any idea about this? I'm really stuck.


Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Hi,

It works with the MR engine, while with Tez it fails.

hive> set hive.execution.engine=tez;
hive> set hive.fetch.task.conversion=none;
hive> select h from testdb.table_orc where year = 2016 and month = 1 and day > 29 limit 10;
Query ID = 26f9a510-c10c-475c-9988-081998b66b0c
Total jobs = 1
Launching Job 1 out of 1

Status: Running (Executing on YARN cluster with App id application_1456379707708_1135)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1                 FAILED     -1          0        0       -1       0       0
--------------------------------------------------------------------------------
VERTICES: 00/01  [>>--------------------------] 0%    ELAPSED TIME: 0.37 s
--------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1456379707708_1135_1_00,
diagnostics=[Vertex vertex_1456379707708_1135_1_00 [Map 1] killed/failed
due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: table_orc initializer
failed, vertex=vertex_1456379707708_1135_1_00 [Map 1],
java.lang.RuntimeException: serious problem
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
	at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:131)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 0
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)
	... 15 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0
	at java.util.Collections$EmptyList.get(Collections.java:3212)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)
	... 4 more
]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map
1, vertexId=vertex_1456379707708_1135_1_00, diagnostics=[Vertex
vertex_1456379707708_1135_1_00 [Map 1] killed/failed due
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: table_orc initializer failed,
vertex=vertex_1456379707708_1135_1_00 [Map 1], java.lang.RuntimeException:
serious problem
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
	at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:131)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 0
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)
	... 15 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0
	at java.util.Collections$EmptyList.get(Collections.java:3212)
	at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)
	at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)
	... 4 more
]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
hive>
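
For now, switching back to the MR engine makes the same query pass:

hive> set hive.execution.engine=mr;
hive> select h from testdb.table_orc where year = 2016 and month = 1 and day > 29 limit 10;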



Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Gopal,

Is there any plan to provide the fix for Hive 1.x versions, or to backport it?

Regards
Biswa


Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Thanks Gopal for the details. Happy to know it has been caught and
fixed.

Biswa



Re: Hive Cli ORC table read error with limit option

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Yes, it is a Kerberos cluster.
...
> After disabling the optimization in the Hive CLI, it works with the
> LIMIT option.

Alright, then it is fixed in -
https://issues.apache.org/jira/browse/HIVE-13120


Cheers,
Gopal







Re: Hive Cli ORC table read error with limit option

Posted by Biswajit Nayak <bi...@altiscale.com>.
Thanks Gopal for the response.

Yes, it is a Kerberos cluster.

After disabling the optimization in the Hive CLI, it works with the LIMIT
option. Below are the DESC details of the table that you asked for.


hive> desc formatted testdb.table_orc;
OK
# col_name             data_type            comment

row_id                 bigint
a                      int
b                      int
c                      varchar(2)
d                      bigint
e                      int
f                      bigint
g                      float
h                      int
i                      int

# Partition Information
# col_name             data_type            comment

year                   int
month                  int
day                    int

# Detailed Table Information
Database:              testdb
Owner:                 *************
CreateTime:            Mon Jan 25 22:32:22 UTC 2016
LastAccessTime:        UNKNOWN
Protect Mode:          None
Retention:             0
Location:              hdfs://***************:8020/hive/testdb.db/table_orc
Table Type:            MANAGED_TABLE
Table Parameters:
	last_modified_by      **************
	last_modified_time    **************
	orc.compress          SNAPPY
	transient_lastDdlTime 1454104669

# Storage Information
SerDe Library:         org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:          org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:            No
Num Buckets:           7
Bucket Columns:        [f]
Sort Columns:          []
Storage Desc Params:
	field.delim           \t
	serialization.format  \t
Time taken: 0.105 seconds, Fetched: 46 row(s)
hive>




Re: Hive Cli ORC table read error with limit option

Posted by Gopal Vijayaraghavan <go...@apache.org>.
> Failed with exception java.io.IOException:java.lang.RuntimeException:
> serious problem
> Time taken: 0.32 seconds
...
> Has anyone faced this issue?

No, but that sounds like one of the codepaths I put in - is this a
Kerberos secure cluster?

Try disabling the optimization and see if it works.

set hive.fetch.task.conversion=none;

If it does, reply back with "desc formatted <table>" & I can help you
debug deeper.

Cheers,
Gopal