Posted to user@hive.apache.org by Suresh Krishnappa <su...@gmail.com> on 2013/03/07 16:20:05 UTC

hive issue with sub-directories

Hi All,
I have the following directory structure in hdfs

/test/a/
/test/a/1.avro
/test/a/2.avro
/test/a/b/
/test/a/b/3.avro

I created an external HIVE table using Avro Serde and added /test/a as a
partition to this table.

I am not able to run a select query. Always getting the error 'not a file'
on '/test/a/b'

Is this by design, a bug or am I missing some configuration?
I am using HIVE 0.10

Thanks
Suresh

Re: hive issue with sub-directories

Posted by Ramki Palle <ra...@gmail.com>.
One way a user solved this earlier was by subclassing the InputFormat
class and overriding the listStatus method so that subdirectories are
ignored. This was done in version 0.7.1. I am not sure if there is a
better way in later versions, but you can use this approach until
someone comes up with one.

Please see the message at

http://mail-archives.apache.org/mod_mbox/hive-user/201108.mbox/%3CCAC80dVSwRz3GpTSAzpS1eQw_P79ko+-=jmQVkOE2VMvMam=0iQ@mail.gmail.com%3E
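
As a possible alternative to a custom InputFormat, some setups allow Hive to
read nested directories by enabling two session settings. This is only a
sketch and is not confirmed anywhere in this thread: it assumes the Hadoop
version in use supports recursive input directories, and it has not been
verified here against the Avro SerDe on Hive 0.10. The table name mytable
stands in for the <mytable> placeholder used in this thread.

```sql
-- Assumption: the underlying Hadoop version honors recursive input directories.
SET mapred.input.dir.recursive=true;
-- Assumption: this Hive build recognizes the sub-directory support flag.
SET hive.mapred.supports.subdirectories=true;

-- With both settings on, files under /test/a/b/ would be read as part of pt=1.
SELECT COUNT(*) FROM mytable WHERE pt = 1;
```

Note that with this approach the rows from /test/a/b/3.avro are treated as
belonging to the same partition as 1.avro and 2.avro, which may or may not be
what you want.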


-Ramki.



On Sun, Mar 10, 2013 at 11:36 PM, <be...@yahoo.com> wrote:

> Hi Suresh
>
> AFAIK as of now a partition cannot contain sub directories, it can contain
> only files.
>
> You may have to move the sub dirs out of the parent dir 'a' and create
> separate partitions for those.
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos

Re: hive issue with sub-directories

Posted by be...@yahoo.com.
Hi Suresh

AFAIK, as of now a partition cannot contain subdirectories; it can contain only files.

You may have to move the subdirectories out of the parent directory 'a' and create separate partitions for them.
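
A minimal sketch of that workaround, run from the Hive CLI. The target path
/test/b and the partition value pt=2 are made up for illustration, and mytable
stands in for the <mytable> placeholder used in this thread; pt=1 is assumed to
be already registered at '/test/a/'.

```sql
-- Move the nested directory out of the partition directory.
-- The Hive CLI passes "dfs" commands through to the Hadoop file system shell.
dfs -mv /test/a/b /test/b;

-- Register the moved directory as its own partition (hypothetical value pt=2).
ALTER TABLE mytable ADD PARTITION (pt=2) LOCATION '/test/b/';

-- Both partitions now contain only files and can be queried together.
SELECT COUNT(*) FROM mytable WHERE pt IN (1, 2);
```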

Regards 
Bejoy KS

Sent from remote device, Please excuse typos

-----Original Message-----
From: Suresh Krishnappa <su...@gmail.com>
Date: Mon, 11 Mar 2013 10:58:05 
To: <us...@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Re: hive issue with sub-directories

Hi Mark,
I am using external table in HIVE.

This is how I am adding the partition

> alter table <mytable> add partition (pt=1) location '/test/a/';

I am able to run HIVE queries only if '/test/a/b' folder is deleted.

How can I retain this folder structure and still issue queries?

Thanks
Suresh


Re: hive issue with sub-directories

Posted by Suresh Krishnappa <su...@gmail.com>.
Hi Mark,
I am using an external table in HIVE.

This is how I am adding the partition:

> alter table <mytable> add partition (pt=1) location '/test/a/';

I am able to run HIVE queries only if '/test/a/b' folder is deleted.

How can I retain this folder structure and still issue queries?

Thanks
Suresh

On Sun, Mar 10, 2013 at 12:48 AM, Mark Grover
<gr...@gmail.com> wrote:

> Suresh,
> By default, the partition column name has to appear in the HDFS
> directory structure.
>
> e.g.
> /user/hive/warehouse/<table name>/<partition col name>=<partition col
> value>/data1.txt
> /user/hive/warehouse/<table name>/<partition col name>=<partition col
> value>/data2.txt

Re: hive issue with sub-directories

Posted by Mark Grover <gr...@gmail.com>.
Suresh,
By default, the partition column name has to appear in the HDFS
directory structure.

e.g.
/user/hive/warehouse/<table name>/<partition col name>=<partition col
value>/data1.txt
/user/hive/warehouse/<table name>/<partition col name>=<partition col
value>/data2.txt
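
To illustrate the layout above with concrete names, here is a small sketch.
The table name mytable and the partition column pt are assumptions for the
example, as is a table stored under the default warehouse directory.

```sql
-- With no LOCATION clause, the partition directory is derived from the
-- table directory plus <partition col name>=<partition col value>, e.g.
--   /user/hive/warehouse/mytable/pt=1/
ALTER TABLE mytable ADD PARTITION (pt=1);

-- An explicit LOCATION points the partition at any directory instead,
-- so the pt=<value> naming is not required in that case.
ALTER TABLE mytable ADD PARTITION (pt=2) LOCATION '/test/a/';
```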


On Thu, Mar 7, 2013 at 7:20 AM, Suresh Krishnappa
<su...@gmail.com> wrote:
> Hi All,
> I have the following directory structure in hdfs
>
> /test/a/
> /test/a/1.avro
> /test/a/2.avro
> /test/a/b/
> /test/a/b/3.avro
>
> I created an external HIVE table using Avro Serde and added /test/a as a
> partition to this table.
>
> I am not able to run a select query. Always getting the error 'not a file'
> on '/test/a/b'
>
> Is this by design, a bug or am I missing some configuration?
> I am using HIVE 0.10
>
> Thanks
> Suresh
>