Posted to user@hive.apache.org by Edward Capriolo <ed...@gmail.com> on 2009/09/16 20:16:24 UTC
Feature request: WHERE filename='x' or filename='y'
I am dumping files into a Hive partition at five-minute intervals, using
LOAD DATA into the partition.
weblogs
web1.00
web1.05
web1.10
...
web2.00
web2.05
web2.10
....
Things that would be useful:
Select files from the folder with a regex or exact name:
SELECT * FROM logs WHERE FILENAME LIKE 'web1*'
SELECT * FROM logs WHERE FILENAME = 'web2.00'
It would also be nice to be able to select offsets within a file; this
would make sense with appends:
SELECT * FROM logs WHERE FILENAME = 'web2.00' FROMOFFSET=454644 [TOOFFSET=]
Do these make sense to anyone?
Edward
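For comparison, later Hive releases added virtual columns that cover much of this request. The following is a minimal sketch, not a definitive answer: it assumes a Hive version that provides INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE, and a table named logs as in the examples above. INPUT__FILE__NAME holds the full path of the file, so the sketch matches on the trailing path component.

```sql
-- Assumes a Hive version with virtual columns; the table name `logs`,
-- file name and offset are taken from the request above.

-- Rows that came from any file whose name starts with web1
SELECT * FROM logs
WHERE INPUT__FILE__NAME LIKE '%/web1%';

-- Rows from one file, starting at a given offset within that file
SELECT * FROM logs
WHERE INPUT__FILE__NAME LIKE '%/web2.00'
  AND BLOCK__OFFSET__INSIDE__FILE >= 454644;
```

These predicates filter rows as they are read; they should not be assumed to limit which files Hive opens.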
Re: Feature request: WHERE filename='x' or filename='y'
Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Sep 16, 2009 at 10:42 PM, 김영우 <wa...@gmail.com> wrote:
> Hi Edward,
>
> It would be nice and very useful. Sometimes I want to select my own
> 'partition' or 'datafile' explicitly, something like the following:
>
> SELECT * FROM weblogs PARTITION ('2009-09-17', '2009-09-18') WHERE
> col1='..' and col2= ...
>
> Or users could select data files from a directory:
>
> SELECT * FROM weblogs DATAFILE ('log1.txt', 'log2.txt') WHERE col1='..' and
> col2= ...
>
> Anyway, your idea is very cool!
>
> Youngwoo
I added your comments to
https://issues.apache.org/jira/browse/HIVE-837
Depending on how you are set up, you can already do this with a WHERE clause
on the partition columns, instead of
SELECT * FROM weblogs PARTITION ('2009-09-17'
For example, I partition by date and by hour:
PARTITIONED BY (log_date_part STRING, log_hour_part STRING)
SELECT * FROM weblogs WHERE log_date_part LIKE '2009%'
or
SELECT * FROM weblogs WHERE log_date_part = '2009-05-05' OR
log_date_part = '2009-05-06'
So you should be able to do that already.
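A minimal end-to-end sketch of that setup follows. Only the partition column names come from this message; the single `line` column, the `/incoming/web1.00` path and the partition values are hypothetical and only there to make the example complete.

```sql
-- Hypothetical table: one raw text column, partitioned by date and hour.
CREATE TABLE weblogs (line STRING)
PARTITIONED BY (log_date_part STRING, log_hour_part STRING);

-- Load one five-minute file into its date/hour partition.
-- The HDFS path is illustrative only.
LOAD DATA INPATH '/incoming/web1.00'
INTO TABLE weblogs
PARTITION (log_date_part = '2009-09-17', log_hour_part = '00');

-- Select two specific days by filtering on the partition column.
SELECT * FROM weblogs
WHERE log_date_part = '2009-09-17'
   OR log_date_part = '2009-09-18';
```

Because log_date_part is a partition column, the WHERE clause selects whole partition directories, which is the effect the proposed PARTITION (...) syntax was after.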
Re: Feature request: WHERE filename='x' or filename='y'
Posted by 김영우 <wa...@gmail.com>.
Hi Edward,
It would be nice and very useful. Sometimes I want to select my own
'partition' or 'datafile' explicitly, something like the following:
SELECT * FROM weblogs PARTITION ('2009-09-17', '2009-09-18') WHERE
col1='..' and col2= ...
Or users could select data files from a directory:
SELECT * FROM weblogs DATAFILE ('log1.txt', 'log2.txt') WHERE col1='..' and
col2= ...
Anyway, your idea is very cool!
Youngwoo