Posted to dev@lens.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2016/09/26 10:18:20 UTC

[jira] [Commented] (LENS-1333) Add data completeness checker

    [ https://issues.apache.org/jira/browse/LENS-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522661#comment-15522661 ] 

Amareshwari Sriramadasu commented on LENS-1333:
-----------------------------------------------

bq. Lens will check partition existence first, if it exists, then check the completeness percentage. 
Instead of doing a completeness check for every partition, the API can take start_time and end_time as parameters and return the completeness over the range.
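
A range-based call of the kind suggested above might look like the sketch below. This is only an illustration, not the Lens API: the interface name RangeCompletenessChecker, the method signature, the hourly granularity, and the choice to report the minimum completeness over the range (with missing hours counted as 0) are all assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.TreeMap;

// Hypothetical interface: name and signature are assumptions, not the Lens API.
interface RangeCompletenessChecker {
  /** Completeness (0-100) of factTable over the half-open range [startTime, endTime). */
  double getCompleteness(String factTable, Instant startTime, Instant endTime);
}

// Toy in-memory implementation that stores one completeness value per hour.
class InMemoryRangeChecker implements RangeCompletenessChecker {
  private final TreeMap<Instant, Double> hourly = new TreeMap<>();

  void record(Instant hourStart, double percent) {
    hourly.put(hourStart, percent);
  }

  @Override
  public double getCompleteness(String factTable, Instant startTime, Instant endTime) {
    // factTable is ignored in this toy version; a real checker would key on it.
    // Assumption: the range is reported as its least complete hour,
    // and an hour with no recorded value counts as 0.
    double min = 100.0;
    for (Instant t = startTime; t.isBefore(endTime); t = t.plus(Duration.ofHours(1))) {
      min = Math.min(min, hourly.getOrDefault(t, 0.0));
    }
    return min;
  }
}

class RangeCompletenessDemo {
  public static void main(String[] args) {
    InMemoryRangeChecker checker = new InMemoryRangeChecker();
    Instant start = Instant.parse("2016-09-26T00:00:00Z");
    checker.record(start, 100.0);
    checker.record(start.plus(Duration.ofHours(1)), 97.5);
    // One call covers both hours instead of one call per partition.
    System.out.println(checker.getCompleteness("fact", start, start.plus(Duration.ofHours(2))));
  }
}
```

With one call per range, the caller does not need to know how the checker buckets time internally; whether the result should be a single number or a value per time bucket is left open here.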

> Add data completeness checker
> -----------------------------
>
>                 Key: LENS-1333
>                 URL: https://issues.apache.org/jira/browse/LENS-1333
>             Project: Apache Lens
>          Issue Type: New Feature
>          Components: cube
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Narayan Periwal
>
> Although Lens registers a partition whenever data becomes available, there is no guarantee that a registered partition is complete. There are different ways to find out whether the data for a partition is complete. One option is a partition property saying whether it is complete or not. Another is an HTTP call to a separate hosted service, and there may be more.
> The proposal here is to add a DataCompletenessChecker interface and run the check while resolving partitions.
> Here are some of the capabilities we would like to add to Lens:
> # Lens will check partition existence first, if it exists, then check the completeness percentage. If the completeness percentage is less than a configured threshold (default should be 98, 99 or even 100), Lens will fail the query.
> # Lens's option to accept queries on partial data will also accept queries on incomplete data.
> # Lens will also have an option to override the completeness percentage threshold at the query level.
> # Lens will still have the look-ahead capability: if daily is incomplete, it will union with hourly.
> # If daily partitions exist (with no look-ahead required) but are incomplete, Lens can switch to hourly partitions and answer the query.
> # If the same measure is present in two different facts, Lens will pick the one with higher availability.
> # If the completeness percentage threshold is missed, Lens will respond with the available percentage.
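
The first two capabilities and the last one could be sketched roughly as follows. Only the interface name DataCompletenessChecker comes from the proposal; the method signature, the CompletenessGate and IncompleteDataException classes, and the default threshold of 99 (one of the candidate values mentioned) are hypothetical and only meant to illustrate the flow.

```java
import java.util.Collections;
import java.util.Set;

// Interface name is from the proposal; this method signature is an assumption.
interface DataCompletenessChecker {
  double getCompletenessPercent(String factTable, String partition);
}

// Hypothetical exception that carries the available percentage back to the caller.
class IncompleteDataException extends RuntimeException {
  final double availablePercent;

  IncompleteDataException(String partition, double availablePercent, double threshold) {
    super(String.format("Partition %s is %.1f%% complete, below threshold %.1f%%",
        partition, availablePercent, threshold));
    this.availablePercent = availablePercent;
  }
}

class CompletenessGate {
  // Assumed default; the proposal leaves the choice between 98, 99 and 100 open.
  static final double DEFAULT_THRESHOLD = 99.0;

  /**
   * Returns false if the partition is not registered, true if it can be used,
   * and throws if it exists but is below the threshold and partial data is not accepted.
   */
  static boolean isUsable(Set<String> registeredPartitions, DataCompletenessChecker checker,
      String factTable, String partition, double threshold, boolean acceptPartial) {
    // Step 1: existence check comes first.
    if (!registeredPartitions.contains(partition)) {
      return false;
    }
    // Step 2: completeness check against the (possibly query-level) threshold.
    double available = checker.getCompletenessPercent(factTable, partition);
    if (available >= threshold || acceptPartial) {
      return true;
    }
    // Threshold missed: fail and report the available percentage.
    throw new IncompleteDataException(partition, available, threshold);
  }
}

class CompletenessGateDemo {
  public static void main(String[] args) {
    Set<String> registered = Collections.singleton("2016-09-26");
    DataCompletenessChecker checker = (fact, partition) -> 95.0;
    try {
      CompletenessGate.isUsable(registered, checker, "fact", "2016-09-26",
          CompletenessGate.DEFAULT_THRESHOLD, false);
    } catch (IncompleteDataException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Passing the threshold as a parameter is how this sketch models the query-level override; the fallback from daily to hourly partitions and the choice between facts are not covered here.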



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)