Posted to dev@lens.apache.org by "Puneet Gupta (JIRA)" <ji...@apache.org> on 2016/09/08 06:28:20 UTC

[jira] [Created] (LENS-1309) Add capability to specify that "Future Partitions" should not be considered while answering queries

Puneet Gupta created LENS-1309:
----------------------------------

             Summary: Add capability to specify that "Future Partitions" should not be considered while answering queries
                 Key: LENS-1309
                 URL: https://issues.apache.org/jira/browse/LENS-1309
             Project: Apache Lens
          Issue Type: Improvement
            Reporter: Puneet Gupta


Use case:

Let's say we have a Fact A which has DAILY and HOURLY update periods.
We have partitioned the fact on pt (process time) and et (event arrival time).
Assume today is Sep 9th. While processing data for the 23rd (last) hour of Sep 8th (i.e., pt=2016-09-08-23), we find a few records whose event time is the 0th hour of Sep 9th (due to clock synchronization issues, fraudulent data, etc.). This leads to partitions like pt=2016-09-08-23 and et=2016-09-09-00 at the HOUR level, and pt=2016-09-08 and et=2016-09-09 at the DAY level.
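
To make the scenario concrete, here is a minimal sketch of how one late-processed record ends up in a "future" et partition. The function name partition_keys and the dictionary layout are hypothetical and only illustrate the use case; they are not Lens code.

```python
from datetime import datetime

HOUR_FMT = "%Y-%m-%d-%H"
DAY_FMT = "%Y-%m-%d"


def partition_keys(process_time, event_time):
    """Return the pt/et partition values for one record, per update period.

    Hypothetical helper for illustration only.
    """
    return {
        "HOURLY": {
            "pt": process_time.strftime(HOUR_FMT),
            "et": event_time.strftime(HOUR_FMT),
        },
        "DAILY": {
            "pt": process_time.strftime(DAY_FMT),
            "et": event_time.strftime(DAY_FMT),
        },
    }


# A record processed in the last hour of Sep 8th whose event time
# falls in the 0th hour of Sep 9th.
keys = partition_keys(
    process_time=datetime(2016, 9, 8, 23, 40),
    event_time=datetime(2016, 9, 9, 0, 5),
)
print(keys["HOURLY"])
print(keys["DAILY"])
```

The DAILY entry has et ahead of pt, which is the "future partition" this issue is about.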

This makes the system believe that DAY level data for the 9th is available for event time queries (as the timeline does not consider pt for event time queries). This leads to wrong query output, since the day partition pt=2016-09-08 and et=2016-09-09 holds only a very small part of the 9th's data. The major chunk of DAY data for the 9th will only be created on the morning of the 10th (pt=2016-09-09 and et=2016-09-09). In this case Lens answers the query from the DAY update period for Sep 9th, while it should have used HOURLY data for the 9th.

Expose a query-level config that enforces semantics ensuring Lens considers et partitions only if they are <= the most recent pt partition. Future partitions should be ignored at the higher granularity (DAY), and the query should instead be answered from lower-granularity data (HOUR). This should also apply to lookahead.
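
A minimal sketch of the proposed rule, assuming partitions are plain strings in the formats used above. The name drop_future_partitions and the sample partition lists are hypothetical; this is not how Lens implements its timelines, only an illustration of "et <= most recent pt".

```python
from datetime import datetime

DAY_FMT = "%Y-%m-%d"
HOUR_FMT = "%Y-%m-%d-%H"


def drop_future_partitions(et_partitions, pt_partitions, fmt):
    """Keep only the et partitions that are <= the most recent pt partition."""
    if not pt_partitions:
        return []
    latest_pt = max(datetime.strptime(p, fmt) for p in pt_partitions)
    return sorted(
        e for e in et_partitions if datetime.strptime(e, fmt) <= latest_pt
    )


# DAY level: the latest processed day is Sep 8th, so et=2016-09-09 is ignored.
day_pt = ["2016-09-07", "2016-09-08"]
day_et = ["2016-09-07", "2016-09-08", "2016-09-09"]
usable_days = drop_future_partitions(day_et, day_pt, DAY_FMT)

# HOUR level: assume (hypothetically) processing has reached hour 02 of Sep 9th.
hour_pt = ["2016-09-08-23", "2016-09-09-00", "2016-09-09-01", "2016-09-09-02"]
hour_et = hour_pt + ["2016-09-09-03"]
usable_hours = drop_future_partitions(hour_et, hour_pt, HOUR_FMT)

print(usable_days)
print(usable_hours)
```

With the DAY partition for Sep 9th filtered out, a query over the 9th would have to be served from the remaining HOUR partitions, which is the fallback behaviour requested here.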



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)