Posted to issues@phoenix.apache.org by "Bin Shi (JIRA)" <ji...@apache.org> on 2019/01/07 19:11:00 UTC

[jira] [Comment Edited] (PHOENIX-5085) Disentangle BaseResultIterators from the backing Guidepost Data structure

    [ https://issues.apache.org/jira/browse/PHOENIX-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736138#comment-16736138 ] 

Bin Shi edited comment on PHOENIX-5085 at 1/7/19 7:10 PM:
----------------------------------------------------------

[~dbwong] & [~karanmehta93], we're almost on the same page. 

In the current phase, to address this JIRA, GuidePostsInfo can provide two create functions (Factory Method pattern): one returns List<GuidePost> and the other returns ArrayList<GuidePost> (its implementation can be left for the future), where GuidePost is a data structure that contains the data from one row of the stats table. Define a SequenceAccessor factory that provides a method to get a SequenceAccessor (interface) over the guideposts. A concrete class implements the SequenceAccessor interface and encapsulates the current implementation of the guideposts data structure, which uses prefix encoding. BaseResultIterators.getParallelScans() then uses List<GuidePost> and/or ArrayList<GuidePost>.
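
As a rough illustration of the phase-one shape, here is a minimal sketch. All names in it (the GuidePost fields, the SequenceAccessor methods, PrefixEncodedGuidePosts, GuidePostsSketch) are hypothetical and are not Phoenix's actual API, and the prefix encoding is a simplified stand-in for the real one. It only shows how a sequential accessor can hide the encoded structure from callers such as getParallelScans().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical value type: one guidepost, i.e. one row of the stats table.
final class GuidePost {
    final byte[] key;
    final long rowCount;
    final long byteCount;

    GuidePost(byte[] key, long rowCount, long byteCount) {
        this.key = key;
        this.rowCount = rowCount;
        this.byteCount = byteCount;
    }
}

// Hypothetical interface: forward-only access to the guideposts.
interface SequenceAccessor extends Iterable<GuidePost> {
    int size();
}

// Hypothetical concrete class wrapping a simplified prefix encoding:
// each key is stored as (bytes shared with the previous key, remaining suffix).
final class PrefixEncodedGuidePosts implements SequenceAccessor {
    private final int[] sharedPrefixLengths;
    private final byte[][] suffixes;
    private final long[] rowCounts;
    private final long[] byteCounts;

    PrefixEncodedGuidePosts(List<GuidePost> sorted) {
        int n = sorted.size();
        sharedPrefixLengths = new int[n];
        suffixes = new byte[n][];
        rowCounts = new long[n];
        byteCounts = new long[n];
        byte[] prev = new byte[0];
        for (int i = 0; i < n; i++) {
            GuidePost gp = sorted.get(i);
            int max = Math.min(prev.length, gp.key.length);
            int shared = 0;
            while (shared < max && prev[shared] == gp.key[shared]) {
                shared++;
            }
            sharedPrefixLengths[i] = shared;
            suffixes[i] = Arrays.copyOfRange(gp.key, shared, gp.key.length);
            rowCounts[i] = gp.rowCount;
            byteCounts[i] = gp.byteCount;
            prev = gp.key;
        }
    }

    @Override
    public int size() {
        return suffixes.length;
    }

    @Override
    public Iterator<GuidePost> iterator() {
        return new Iterator<GuidePost>() {
            private int index = 0;
            private byte[] prev = new byte[0];

            @Override
            public boolean hasNext() {
                return index < suffixes.length;
            }

            @Override
            public GuidePost next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // Rebuild the full key from the previous key's prefix plus the stored suffix.
                int shared = sharedPrefixLengths[index];
                byte[] suffix = suffixes[index];
                byte[] key = new byte[shared + suffix.length];
                System.arraycopy(prev, 0, key, 0, shared);
                System.arraycopy(suffix, 0, key, shared, suffix.length);
                GuidePost gp = new GuidePost(key, rowCounts[index], byteCounts[index]);
                prev = key;
                index++;
                return gp;
            }
        };
    }
}

// Hypothetical factory methods; the caller never sees the encoding.
final class GuidePostsSketch {
    static SequenceAccessor sequenceAccessor(List<GuidePost> sorted) {
        return new PrefixEncodedGuidePosts(sorted);
    }

    static List<GuidePost> toList(SequenceAccessor accessor) {
        List<GuidePost> out = new ArrayList<>(accessor.size());
        for (GuidePost gp : accessor) {
            out.add(gp);
        }
        return out;
    }
}
```

Because each key depends on the previous one, this encoding only supports sequential decoding, which is why the interface here is a sequence accessor rather than a random-access one.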

 

In the next phase, we'll define a RandomAccessor factory and a RandomAccessor interface, and implement a different guideposts data structure (Segment Tree). We'll also add more APIs and helper functions, such as those described in the ["Phoenix deep dive" slides|https://docs.google.com/presentation/d/1G_CcAhk2xSC09mqG3MNt1i2QbgqfWbg9OM966_ucSGQ]:
 # Use a [Segment Tree|https://www.geeksforgeeks.org/segment-tree-set-1-sum-of-given-range/] (plus some characteristics of a B+ Tree) (PHOENIX-4925)
 # Disentangle the granularity of guideposts from that of the cached guideposts (PHOENIX-4927)
 # Mount/unmount guideposts for a particular tenant or key range
 # A guideposts chunk is always encoded/decoded as a whole, so we can choose a different compression algorithm depending on the data.
 # Support range scans. Given <start key, end key>, return a decompressed List<GuidePost> on which we can perform binary search, or return the estimated number of rows and bytes.
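
The range-scan estimate in the last item could look roughly like the sketch below. It is an illustration under stated assumptions, not the proposed Phoenix design: the class names are hypothetical, keys are plain Strings instead of byte arrays, only row counts are tracked (bytes would use a second tree), and a guidepost counts toward a range when its key falls in [start, end). Binary search finds the guidepost index range, and a sum segment tree returns the total for that range.

```java
// Hypothetical sketch: iterative sum segment tree over per-guidepost row counts.
final class RowCountSegmentTree {
    private final int n;
    private final long[] tree; // leaves live at [n, 2n), internal nodes at [1, n)

    RowCountSegmentTree(long[] rowCounts) {
        n = rowCounts.length;
        tree = new long[2 * n];
        System.arraycopy(rowCounts, 0, tree, n, n);
        for (int i = n - 1; i >= 1; i--) {
            tree[i] = tree[2 * i] + tree[2 * i + 1];
        }
    }

    // Sum of row counts for guidepost indexes in [from, to).
    long sum(int from, int to) {
        long total = 0;
        for (int l = from + n, r = to + n; l < r; l >>= 1, r >>= 1) {
            if ((l & 1) == 1) {
                total += tree[l++];
            }
            if ((r & 1) == 1) {
                total += tree[--r];
            }
        }
        return total;
    }

    // Replace one guidepost's row count and refresh its ancestors.
    void update(int index, long newCount) {
        int i = index + n;
        tree[i] = newCount;
        for (i >>= 1; i >= 1; i >>= 1) {
            tree[i] = tree[2 * i] + tree[2 * i + 1];
        }
    }
}

// Hypothetical estimator: maps a key range to a guidepost index range, then sums it.
final class GuidePostRangeEstimator {
    private final String[] sortedKeys; // assumption: String keys, already sorted
    private final RowCountSegmentTree rows;

    GuidePostRangeEstimator(String[] sortedKeys, long[] rowCounts) {
        this.sortedKeys = sortedKeys;
        this.rows = new RowCountSegmentTree(rowCounts);
    }

    // Index of the first guidepost key that is >= target.
    private int lowerBound(String target) {
        int lo = 0;
        int hi = sortedKeys.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedKeys[mid].compareTo(target) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    long estimateRows(String startInclusive, String endExclusive) {
        int from = lowerBound(startInclusive);
        int to = lowerBound(endExclusive);
        return from < to ? rows.sum(from, to) : 0L;
    }

    void updateRowCount(int guidePostIndex, long newCount) {
        rows.update(guidePostIndex, newCount);
    }
}
```

Both the lookup and the sum take logarithmic time in the number of guideposts, and a single guidepost can be updated without rebuilding the whole structure.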



> Disentangle BaseResultIterators from the backing Guidepost Data structure
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-5085
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5085
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Daniel Wong
>            Assignee: Daniel Wong
>            Priority: Major
>              Labels: Statistics, StatsImprovement
>
> Disentangle BaseResultIterators.getParallelScans from the backing Guidepost Data structure.  This will provide the abstraction for possible new stats data structures in https://issues.apache.org/jira/browse/PHOENIX-4925
>  Will heavily affect changes in https://issues.apache.org/jira/browse/PHOENIX-4926 and https://issues.apache.org/jira/browse/PHOENIX-4594.  [~Bin Shi] [~karanmehta93]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)