Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2016/02/15 10:57:18 UTC

[jira] [Commented] (PHOENIX-2683) store rowCount and byteCount at guidePost level

    [ https://issues.apache.org/jira/browse/PHOENIX-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147145#comment-15147145 ] 

James Taylor commented on PHOENIX-2683:
---------------------------------------

Thanks, [~ankit.singhal]. Looks good. One minor thing we can clean up based on the new guidepost format (unrelated to this change):
- There's no need to copy the row key or create an ImmutableBytesWritable here. Instead, just modify addGuidePosts to take byte[] rowArray, int offset, and int length parameters:
{code}
diff --git a/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java b/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
index 3462f22..676ff77 100644
--- a/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
+++ b/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
@@ -190,8 +190,9 @@ public class StatisticsCollector {
             if (byteCount >= guidepostDepth) {
                 byte[] row = ByteUtil.copyKeyBytesIfNecessary(
                         new ImmutableBytesWritable(kv.getRowArray(), kv.getRowOffset(), kv.getRowLength()));
-                if (gps.getSecond().addGuidePosts(row, byteCount)) {
+                if (gps.getSecond().addGuidePosts(row, byteCount, gps.getSecond().getRowCount())) {
                     gps.setFirst(0l);
+                    gps.getSecond().resetRowCount();
                 }
             }
         }
{code}
- Also, a few tests around this would be good.
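
To make the first suggestion concrete, here is a rough, self-contained sketch of an addGuidePosts that reads the row key straight out of the backing array. GuidePostsSketch and its storage layout, the key/size/byteCount/rowCount accessors, and the rule of rejecting an empty key are all hypothetical stand-ins rather than the actual Phoenix classes. With a signature like this, the call site could presumably pass kv.getRowArray(), kv.getRowOffset(), kv.getRowLength() directly instead of building an ImmutableBytesWritable first.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the guidepost holder; NOT the real Phoenix class.
class GuidePostsSketch {
    private final ByteArrayOutputStream keys = new ByteArrayOutputStream();
    private final List<Integer> keyEnds = new ArrayList<>();   // end offset of each key in 'keys'
    private final List<Long> byteCounts = new ArrayList<>();
    private final List<Long> rowCounts = new ArrayList<>();

    // Appends the key bytes directly from (rowArray, offset, length), so the
    // caller needs neither an intermediate byte[] copy nor a wrapper object.
    // Assumption for this sketch: an empty key is rejected.
    boolean addGuidePosts(byte[] rowArray, int offset, int length, long byteCount, long rowCount) {
        if (length == 0) {
            return false;
        }
        keys.write(rowArray, offset, length);
        keyEnds.add(keys.size());
        byteCounts.add(byteCount);
        rowCounts.add(rowCount);
        return true;
    }

    int size() {
        return keyEnds.size();
    }

    // Returns a copy of the i-th guidepost key (illustration only).
    byte[] key(int i) {
        int start = (i == 0) ? 0 : keyEnds.get(i - 1);
        return Arrays.copyOfRange(keys.toByteArray(), start, keyEnds.get(i));
    }

    long byteCount(int i) {
        return byteCounts.get(i);
    }

    long rowCount(int i) {
        return rowCounts.get(i);
    }

    public static void main(String[] args) {
        // Simulates a cell buffer in which the row key occupies bytes 2..9.
        byte[] cellBuffer = "xxrow-0042yy".getBytes(StandardCharsets.US_ASCII);
        GuidePostsSketch gps = new GuidePostsSketch();
        boolean added = gps.addGuidePosts(cellBuffer, 2, 8, 1024L, 17L);
        System.out.println(added + " " + new String(gps.key(0), StandardCharsets.US_ASCII));
    }
}
```

A test along these lines (key taken from a non-zero offset, empty key rejected, counts stored per guidepost) would also cover part of the second point.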

> store rowCount and byteCount at guidePost level
> -----------------------------------------------
>
>                 Key: PHOENIX-2683
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2683
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Minor
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2683.patch
>
>
> The GUIDE_POSTS_WIDTH and GUIDE_POSTS_ROW_COUNT should contain the number
> of bytes and number of rows which were traversed since the last guidepost.
> So given some start key and stop key from a scan and knowledge that a given
> column family is used in a query, you should be able to run a query like
> this:
> SELECT SUM(GUIDE_POSTS_WIDTH) bytes_traversed,
>     SUM(GUIDE_POSTS_ROW_COUNT) rows_traversed
> FROM SYSTEM.STATS
> WHERE COLUMN_FAMILY = :1
> AND GUIDE_POST_KEY >= :2
> AND GUIDE_POST_KEY < :3
> where :1 is the column family, :2 is the start row of the scan, and :3 is
> the stop row of the scan. The result of the query should tell you the
> bytes_traversed and the rows_traversed with a granularity of the
> phoenix.stats.guidepost.width config parameter.
> Description is copied from dev mail thread.
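
As a small illustration of what the query in the description is meant to compute, the sketch below sums per-guidepost byte and row counts over a key range in memory. It is only a model under stated assumptions: GuidePostRangeSketch and traversed are hypothetical names, keys are plain strings rather than byte arrays, a single column family is assumed, and the sample values are made up.

```java
import java.util.TreeMap;

// Hypothetical in-memory model of guidepost rows for one column family.
class GuidePostRangeSketch {

    // Each value holds {bytes since previous guidepost, rows since previous guidepost}.
    // Sums both over guideposts with startKey <= key < stopKey, mirroring
    // GUIDE_POST_KEY >= :2 AND GUIDE_POST_KEY < :3 in the query.
    static long[] traversed(TreeMap<String, long[]> guidePosts, String startKey, String stopKey) {
        long bytes = 0;
        long rows = 0;
        for (long[] gp : guidePosts.subMap(startKey, true, stopKey, false).values()) {
            bytes += gp[0];
            rows += gp[1];
        }
        return new long[] {bytes, rows};
    }

    public static void main(String[] args) {
        TreeMap<String, long[]> guidePosts = new TreeMap<>();
        guidePosts.put("b", new long[] {100, 10});
        guidePosts.put("d", new long[] {120, 12});
        guidePosts.put("f", new long[] {90, 9});
        guidePosts.put("h", new long[] {110, 11});

        // Only the guideposts "d" and "f" fall inside ["c", "h").
        long[] result = traversed(guidePosts, "c", "h");
        System.out.println("bytes_traversed=" + result[0] + " rows_traversed=" + result[1]);
    }
}
```

Because each guidepost carries only the counts accumulated since the previous one, the sums are estimates at the granularity of the configured guidepost width.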



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)