Posted to issues@hawq.apache.org by "Goden Yao (JIRA)" <ji...@apache.org> on 2016/02/23 02:11:18 UTC
[jira] [Created] (HAWQ-437) Support time series partition filter pushdown
Goden Yao created HAWQ-437:
------------------------------
Summary: Support time series partition filter pushdown
Key: HAWQ-437
URL: https://issues.apache.org/jira/browse/HAWQ-437
Project: Apache HAWQ
Issue Type: Sub-task
Components: PXF
Reporter: Goden Yao
Assignee: Goden Yao
A time series layout is a very commonly used directory structure in HDFS, usually for logging purposes.
For example, suppose the sales data for a chain is stored in the following directory structure: /data/sales/yyyy/mm/dd/hh/*.data (year/month/day/hour)
{noformat}
/data/sales /
--- 2012 /
--- --- 01 /
--- --- --- 01 /
--- --- --- --- 00 /
--- --- --- --- 01 /
...
--- --- --- --- 23 /
--- --- --- 02 / ...
--- --- --- 31 / ...
--- --- 02 /
--- --- --- 01 / ...
--- --- --- 28 / ...
...
--- 2013 / ...
--- 2014 / ...
{noformat}
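As a rough illustration of how such a layout maps to partition values, the Python sketch below splits a file path under the base directory into its year, month, day and hour components. The function name parse_partition and the fixed yyyy/mm/dd/hh template are assumptions made for this example only; they are not part of PXF.

```python
from typing import Dict, Optional

# Assumed template: one directory level per field, in this order.
TEMPLATE = ("year", "month", "day", "hour")


def parse_partition(path: str, base: str = "/data/sales") -> Optional[Dict[str, int]]:
    """Return the numeric partition values encoded in `path`, or None if it does not match."""
    prefix = base.rstrip("/") + "/"
    if not path.startswith(prefix):
        return None
    parts = path[len(prefix):].split("/")
    # Expect one directory per template field, followed by the file name.
    if len(parts) != len(TEMPLATE) + 1:
        return None
    dirs = parts[:len(TEMPLATE)]
    # Only plain ASCII digits count as numeric directory names.
    if not all(d.isascii() and d.isdigit() for d in dirs):
        return None
    return {name: int(value) for name, value in zip(TEMPLATE, dirs)}


print(parse_partition("/data/sales/2012/01/31/23/part-0.data"))
# {'year': 2012, 'month': 1, 'day': 31, 'hour': 23}
```

Paths outside the base directory, or with non-numeric directory names, yield None in this sketch.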
Time series partitioning will support the numeric comparison operators (=, !=, <, >, <=, >=), subject to the following limitations:
* The time template must be of the form year/month/day/hour or any subset of it (e.g. year/month/day or month/day).
* Dates must be given as numeric values.
* Years should be represented by two or four digits (yy or yyyy).
* Months should be represented by two digits (mm). Valid month values are 01-12.
* Days should be represented by two digits (dd). Valid day values are 01-31.
* Hours should be represented by two or four digits (hh or hhhh). Valid hour values are 00-23 or 0000-2359.
Any other form should be represented using regular string partitions. Time series partitioning only considers the supported values; directories with unsupported values are ignored.
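The limitations above could be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the PXF implementation: the names FIELD_RULES, parse_dir and prune are hypothetical, a full yyyy/mm/dd/hh template is assumed, the two-digit year and four-digit hour variants are left out, and a filter is modeled as a list of (field, operator, constant) predicates that must all hold.

```python
import operator
import re
from typing import Dict, Iterable, List, Optional, Tuple

# (field name, digit count, min value, max value), following the rules above.
# Two-digit years and four-digit hours are omitted to keep the sketch short.
FIELD_RULES = [
    ("year", 4, 0, 9999),
    ("month", 2, 1, 12),
    ("day", 2, 1, 31),
    ("hour", 2, 0, 23),
]

OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def parse_dir(rel_dir: str) -> Optional[Dict[str, int]]:
    """Parse 'yyyy/mm/dd/hh' into field values, or return None if unsupported."""
    parts = rel_dir.strip("/").split("/")
    if len(parts) != len(FIELD_RULES):
        return None
    values = {}
    for text, (name, width, low, high) in zip(parts, FIELD_RULES):
        if not re.fullmatch(r"[0-9]{%d}" % width, text):
            return None  # wrong digit count or non-numeric value
        value = int(text)
        if not low <= value <= high:
            return None  # out of range, e.g. month 13
        values[name] = value
    return values


def prune(dirs: Iterable[str], predicates: List[Tuple[str, str, int]]) -> List[str]:
    """Keep only the directories whose values satisfy every predicate."""
    kept = []
    for d in dirs:
        values = parse_dir(d)
        if values is None:
            continue  # directories with unsupported values are ignored
        if all(OPS[op](values[field], const) for field, op, const in predicates):
            kept.append(d)
    return kept


dirs = ["2012/01/01/00", "2012/01/01/23", "2012/13/01/00",
        "2013/02/28/05", "2013/feb/28/05"]
print(prune(dirs, [("year", "=", 2012), ("hour", ">=", 12)]))
# ['2012/01/01/23']
```

In this sketch, 2012/13/01/00 and 2013/feb/28/05 are skipped before any predicate is evaluated, which mirrors the rule that directories with unsupported values are ignored.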
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)