Posted to user@tez.apache.org by Rohini Palaniswamy <ro...@gmail.com> on 2017/01/10 22:54:43 UTC

Re: tez + union stmt

The implementation in Hive does look wrong. The concept of VertexGroups was
added in Tez specifically for the union case, to support writing to the same
directory from different vertices. Sub-directories should not be required
as a workaround.

Regards,
Rohini

On Sun, Dec 25, 2016 at 10:58 AM, Stephen Sprague <sp...@gmail.com>
wrote:

> Thanks Elliot. Nice Christmas present. The settings in that Stack Overflow
> link look like exactly what I need to set so that MR jobs can pick up the
> data that Tez created.
>
> Cheers,
> Stephen.
>
> On Sun, Dec 25, 2016 at 2:45 AM, Elliot West <te...@gmail.com> wrote:
>
>> I believe that tez will generate subfolders for unioned data. As far as I
>> know, this is the expected behaviour and there is no alternative.
>> Presumably this is to prevent multiple tasks from attempting to write the
>> same file?
>>
>> We've experienced issues when switching from mr to tez; downstream jobs
>> weren't expecting subfolders and had trouble reading previously accessible
>> datasets.
>>
>> Apparently there are workarounds within Hive:
>> http://stackoverflow.com/questions/39511585/hive-create-table-not-insert-data
>>
>> Merry Christmas,
>>
>> Elliot.
>>
>> On Sun, 25 Dec 2016 at 03:11, Rajesh Balamohan <rb...@apache.org>
>> wrote:
>>
>>> Are there any exceptions in hive.log? Is the tmp_pv_v4* table part of the
>>> select query?
>>>
>>> Assuming you are creating the table in staging.db, it would have created
>>> the table location as staging.db/foo (as you have not specified the
>>> location).
>>>
>>> Adding user@hive.apache.org as this is hive related.
>>>
>>>
>>> ~Rajesh.B
>>>
>>> On Sun, Dec 25, 2016 at 12:08 AM, Stephen Sprague <sp...@gmail.com>
>>> wrote:
>>>
>>> all,
>>>
>>> i'm running tez with the sql pattern:
>>>
>>>     * create table foo as select * from (select... UNION select... UNION select...)
>>>
>>> in the logs the final step is this:
>>>
>>>     * Moving data to directory hdfs://dwrnn1.sv2.trulia.com:8020/user/hive/warehouse/staging.db/tmp_pv_v4c__loc_4 from hdfs://dwrnn1.sv2.trulia.com:8020/user/hive/warehouse/staging.db/.hive-staging_hive_2016-12-24_10-05-40_048_4896412314807355668-899/-ext-10002
>>>
>>>
>>> when querying the table i got zero rows returned which made me curious.
>>> so i queried the hdfs location and see this:
>>>
>>>   $ hdfs dfs -ls hdfs://dwrnn1.sv2.trulia.com:8020/user/hive/warehouse/staging.db/tmp_pv_v4c__loc_4
>>>
>>>   Found 3 items
>>>   drwxrwxrwx   - dwr supergroup          0 2016-12-24 10:05 hdfs://dwrnn1.sv2.trulia.com:8020/user/hive/warehouse/staging.db/tmp_pv_v4c__loc_4/1
>>>   drwxrwxrwx   - dwr supergroup          0 2016-12-24 10:06 hdfs://dwrnn1.sv2.trulia.com:8020/user/hive/warehouse/staging.db/tmp_pv_v4c__loc_4/2
>>>   drwxrwxrwx   - dwr supergroup          0 2016-12-24 10:06 hdfs://dwrnn1.sv2.trulia.com:8020/user/hive/warehouse/staging.db/tmp_pv_v4c__loc_4/3
>>>
>>> and yes the data files are under these three dirs.
>>>
>>> so i ask... i'm not used to seeing sub-directories under the tablename
>>> unless the table is partitioned. is this legit? might there be some config
>>> settings i need to set to see this data via sql?
>>>
>>> thanks,
>>> Stephen.
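
The symptom in the original question (zero rows returned while the data files sit under the sub-directories 1, 2 and 3) is consistent with a reader that only looks at files directly under the table directory. The Python sketch below reproduces that layout in a local temporary directory rather than on HDFS. The data file name `000000_0` and the helpers `build_table_dir` and `list_data_files` are assumptions made for illustration; they do not come from the thread or from Hive.

```python
import tempfile
from pathlib import Path
from typing import List


def build_table_dir(root: Path) -> Path:
    """Create a local stand-in for the table directory with sub-directories 1, 2, 3."""
    table_dir = root / "tmp_pv_v4c__loc_4"
    for branch in ("1", "2", "3"):
        branch_dir = table_dir / branch
        branch_dir.mkdir(parents=True)
        # Hypothetical data file name; the thread does not show the real ones.
        (branch_dir / "000000_0").write_text(f"row from union branch {branch}\n")
    return table_dir


def list_data_files(table_dir: Path, recursive: bool) -> List[Path]:
    """Return files directly under table_dir, or at any depth when recursive is True."""
    entries = table_dir.rglob("*") if recursive else table_dir.iterdir()
    return sorted(p for p in entries if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    table_dir = build_table_dir(Path(tmp))
    flat = list_data_files(table_dir, recursive=False)
    deep = list_data_files(table_dir, recursive=True)
    # prints: 0 files at top level; 3 files when recursing
    print(len(flat), "files at top level;", len(deep), "files when recursing")
```

A reader that does not recurse finds no data files, which matches the empty query result, while a recursive listing finds all three. Whether a particular Hive or MapReduce reader recurses depends on its configuration.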

Re: tez + union stmt

Posted by Elliot West <te...@gmail.com>.
Thank you.

On Wed, 11 Jan 2017 at 07:21, Chris Drome <cd...@yahoo-inc.com> wrote:

> Elliot,
>
> Mithun already created the following ticket to track the issue:
>
> https://issues.apache.org/jira/browse/HIVE-15575
>
> chris
>
>
> On Tuesday, January 10, 2017 11:05 PM, Elliot West <te...@gmail.com>
> wrote:
>
>
> Thanks Rohini,
>
> This is good to know. Could you perhaps raise an issue in the Hive JIRA?
>
> Thanks,
>
> Elliot.

Re: tez + union stmt

Posted by Chris Drome <cd...@yahoo-inc.com>.
Elliot,

Mithun already created the following ticket to track the issue:

https://issues.apache.org/jira/browse/HIVE-15575

chris

On Tuesday, January 10, 2017 11:05 PM, Elliot West <te...@gmail.com> wrote:

> Thanks Rohini,
>
> This is good to know. Could you perhaps raise an issue in the Hive JIRA?
>
> Thanks,
>
> Elliot.

Re: tez + union stmt

Posted by Elliot West <te...@gmail.com>.
Thanks Rohini,

This is good to know. Could you perhaps raise an issue in the Hive JIRA?

Thanks,

Elliot.

On Tue, 10 Jan 2017 at 22:55, Rohini Palaniswamy <ro...@gmail.com>
wrote:

> The implementation in Hive does look wrong. The concept of VertexGroups
> was added in Tez specifically for the union case, to support writing to
> the same directory from different vertices. Sub-directories should not be
> required as a workaround.
>
> Regards,
> Rohini

RE: tez + union stmt

Posted by Bikas Saha <bi...@apache.org>.
IIRC, the output files for each vertex have the vertex id encoded in their names, which prevents them from overwriting output files from other vertices. Thus the files for different union member vertices can be written safely under the same output dir.

Hive might be doing this to maintain uniformity between hive-tez and hive-mr.

Bikas
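
The point about vertex ids in file names can be sketched in Python as follows. This is only an illustration on a local temporary directory: the naming pattern `part-v<vertex>-<task>` and the helpers `output_name` and `write_union_outputs` are hypothetical, since the thread does not show the actual file names that Tez or Hive produce.

```python
import tempfile
from pathlib import Path
from typing import List


def output_name(vertex_id: int, task_id: int) -> str:
    """Hypothetical naming scheme in which the vertex id is part of the file name."""
    return f"part-v{vertex_id:03d}-{task_id:05d}"


def write_union_outputs(out_dir: Path, vertex_ids: List[int], tasks_per_vertex: int) -> List[str]:
    """Let every task of every union member vertex write into the same directory."""
    for vertex_id in vertex_ids:
        for task_id in range(tasks_per_vertex):
            target = out_dir / output_name(vertex_id, task_id)
            # Mode "x" raises FileExistsError if the name is already taken,
            # so any collision between vertices would surface immediately.
            with target.open("x") as handle:
                handle.write(f"vertex {vertex_id}, task {task_id}\n")
    return sorted(p.name for p in out_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    names = write_union_outputs(Path(tmp), vertex_ids=[1, 2, 3], tasks_per_vertex=2)
    # prints: 6 distinct files in one directory
    print(len(names), "distinct files in one directory")
```

Because task 0 of vertex 2 and task 0 of vertex 3 get different names, all union branches can share one flat output directory without per-branch sub-directories.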

 

From: Rohini Palaniswamy [mailto:rohini.aditya@gmail.com] 
Sent: Tuesday, January 10, 2017 2:55 PM
To: user@hive.apache.org; user@tez.apache.org
Subject: Re: tez + union stmt

The implementation in Hive does look wrong. The concept of VertexGroups was added in Tez specifically for the union case, to support writing to the same directory from different vertices. Sub-directories should not be required as a workaround.

Regards,
Rohini
