Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/07/31 22:36:06 UTC

[GitHub] himanshug commented on a change in pull request #6016: Druid 'Shapeshifting' Columns

URL: https://github.com/apache/incubator-druid/pull/6016#discussion_r206667473
 
 

 ##########
 File path: processing/src/main/java/io/druid/segment/data/codecs/ints/IntFormMetrics.java
 ##########
 @@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.segment.data.codecs.ints;
+
+import io.druid.segment.IndexSpec;
+import io.druid.segment.data.codecs.FormMetrics;
+
+/**
+ * Aggregates statistics about blocks of integer values, such as total number of values processed, minimum and maximum
+ * values encountered, if the chunk is constant or all zeros, and various facts about data which is repeated more than
+ * twice ('runs') including number of distinct runs, longest run length, and total number of runs. This information is
+ * collected by {@link io.druid.segment.data.ShapeShiftingColumnarIntsSerializer} while processing row values, and is
+ * provided to {@link IntFormEncoder} implementations to do anything from estimating encoded size to influencing how
+ * {@link io.druid.segment.data.ShapeShiftingColumnarIntsSerializer} decides whether or not to employ that particular
+ * encoding.
+ */
+public class IntFormMetrics extends FormMetrics
+{
+  private int minValue = Integer.MAX_VALUE;
+  private int maxValue = Integer.MIN_VALUE;
+  private int numRunValues = 0;
+  private int numDistinctRuns = 0;
+  private int longestRun;
+  private int currentRun;
+  private int previousValue;
+  private int numValues = 0;
+  private boolean isFirstValue = true;
+
 
 Review comment:
   Could you add docs for numRunValues, numDistinctRuns, longestRun, and numValues? You could probably take the descriptions from the class-level Javadoc and put them next to the specific variables (basically the fields that are exposed and whose meaning is not clear from their names, unlike e.g. the min/max values).
   
   I could make sense of them by reading processNextRow(..) though.
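
   To illustrate the kind of field-level docs meant here, below is a rough, self-contained sketch. It is not the code from this pull request: the class name RunMetricsSketch, the finish() method, the getters, and the body of processNextRow(..) are all hypothetical, and it assumes a 'run' means the same value appearing three or more times in a row, which is only one reading of "repeated more than twice" in the class-level Javadoc.

```java
/**
 * Hypothetical, simplified stand-in used only to show per-field Javadoc.
 * This is NOT the IntFormMetrics implementation under review.
 */
class RunMetricsSketch
{
  /** Smallest value processed so far; remains Integer.MAX_VALUE until a value is seen. */
  private int minValue = Integer.MAX_VALUE;
  /** Largest value processed so far; remains Integer.MIN_VALUE until a value is seen. */
  private int maxValue = Integer.MIN_VALUE;
  /** Total count of values that are part of a run (assumed: 3 or more identical consecutive values). */
  private int numRunValues = 0;
  /** Number of separate runs encountered; two runs of the same value are counted twice. */
  private int numDistinctRuns = 0;
  /** Length of the longest run encountered, or 0 if no run has been seen. */
  private int longestRun = 0;
  /** Internal: length of the streak of identical values currently being tracked. */
  private int currentRun = 0;
  /** Internal: the most recently processed value. */
  private int previousValue;
  /** Total number of values processed, whether or not they are part of a run. */
  private int numValues = 0;
  /** Internal: true until the first value has been processed. */
  private boolean isFirstValue = true;

  /** Updates all statistics with the next value of the block. */
  void processNextRow(int val)
  {
    if (isFirstValue || val != previousValue) {
      // The streak of the previous value has ended; record it if it qualifies as a run.
      closeCurrentRun();
      previousValue = val;
      currentRun = 1;
      isFirstValue = false;
    } else {
      currentRun++;
    }
    minValue = Math.min(minValue, val);
    maxValue = Math.max(maxValue, val);
    numValues++;
  }

  /** Records a trailing streak, if any; call once after the last value (hypothetical helper). */
  void finish()
  {
    closeCurrentRun();
    currentRun = 0;
  }

  private void closeCurrentRun()
  {
    if (currentRun > 2) {
      numRunValues += currentRun;
      numDistinctRuns++;
      longestRun = Math.max(longestRun, currentRun);
    }
  }

  int getMinValue() { return minValue; }
  int getMaxValue() { return maxValue; }
  int getNumRunValues() { return numRunValues; }
  int getNumDistinctRuns() { return numDistinctRuns; }
  int getLongestRun() { return longestRun; }
  int getNumValues() { return numValues; }
}
```

   Under those assumptions, feeding the values 1, 1, 1, 2, 2, 3, 3, 3, 3 and then calling finish() gives numValues = 9, numRunValues = 7, numDistinctRuns = 2, and longestRun = 4: the pair of 2s is too short to count as a run. With comments like these next to each field, a reader would not need to go through processNextRow(..) to learn what each counter means.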

