You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/08/07 07:02:46 UTC

[GitHub] [druid] gianm commented on a change in pull request #10248: fix bug with expressions on sparse string realtime columns without explicit null valued rows

gianm commented on a change in pull request #10248:
URL: https://github.com/apache/druid/pull/10248#discussion_r466842879



##########
File path: processing/src/main/java/org/apache/druid/segment/DimensionHandlerUtils.java
##########
@@ -73,16 +73,16 @@ private DimensionHandlerUtils()
   )
   {
     if (capabilities == null) {
-      return new StringDimensionHandler(dimensionName, multiValueHandling, true);
+      return new StringDimensionHandler(dimensionName, multiValueHandling, true, false);
     }
 
     multiValueHandling = multiValueHandling == null ? MultiValueHandling.ofDefault() : multiValueHandling;
 
     if (capabilities.getType() == ValueType.STRING) {
-      if (!capabilities.isDictionaryEncoded()) {
+      if (!capabilities.isDictionaryEncoded().isMaybeTrue()) {

Review comment:
       Negative maybe makes me do a double-take when reading. Is this equivalent to `capabilities.isDictionaryEncoded.isFalse()`?
   
   Also, that has the effect of making this check lax (it will pass if the dictionary-encodedness is unknown). That doesn't seem correct. Shouldn't it be strict, if it relies on dictionary-encodedness? Or if it doesn't rely on it, then the check should be removed.

##########
File path: processing/src/main/java/org/apache/druid/segment/DimensionIndexer.java
##########
@@ -236,6 +237,7 @@ DimensionSelector makeDimensionSelector(
       IncrementalIndex.DimensionDesc desc
   );
 
+  ColumnCapabilitiesImpl getColumnCapabilities();

Review comment:
       ColumnCapabilities is a more typical return value (the Impl is an impl but we should generally return an interface).

##########
File path: processing/src/main/java/org/apache/druid/segment/StringDimensionIndexer.java
##########
@@ -400,6 +404,17 @@ public int getCardinality()
     return dimLookup.size();
   }
 
+  /**
+   * returns true if all values are encoded in {@link #dimLookup}
+   */
+  public boolean dictionaryEncodesAllValues()

Review comment:
       Can this be private? Looks like it isn't used outside of this class.

##########
File path: processing/src/main/java/org/apache/druid/segment/StringDimensionIndexer.java
##########
@@ -456,6 +471,30 @@ public int getUnsortedEncodedKeyComponentHashCode(int[] key)
     return Arrays.hashCode(key);
   }
 
+  @Override
+  public ColumnCapabilitiesImpl getColumnCapabilities()
+  {
+    ColumnCapabilitiesImpl capabilites = new ColumnCapabilitiesImpl().setType(ValueType.STRING)
+                                                                     .setHasBitmapIndexes(hasBitmapIndexes)
+                                                                     .setHasSpatialIndexes(hasSpatialIndexes)
+                                                                     .setDictionaryValuesUnique(true)
+                                                                     .setDictionaryValuesSorted(false);
+
+    // strings are only single valued, until they are not...

Review comment:
       Please add some punctuation to this multiline comment 🙂
   
   I think I parsed it correctly, but it wasn't effortless.

##########
File path: processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilities.java
##########
@@ -105,4 +104,46 @@ public String toString()
       return StringUtils.toLowerCase(super.toString());
     }
   }
+
+  interface CoercionLogic
+  {
+    boolean dictionaryEncoded();
+    boolean dictionaryValuesSorted();
+    boolean dictionaryValuesUnique();
+    boolean multipleValues();
+  }
+
+  class AllCoercionLogic implements CoercionLogic

Review comment:
       It looks like this is only used in the tests. If so, it should be moved there.

##########
File path: processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilities.java
##########
@@ -105,4 +104,46 @@ public String toString()
       return StringUtils.toLowerCase(super.toString());
     }
   }
+
+  interface CoercionLogic

Review comment:
       Please add some javadocs. The purpose of this interface isn't immediately obvious.

##########
File path: processing/src/test/java/org/apache/druid/segment/IndexMergerTestBase.java
##########
@@ -2012,32 +2011,6 @@ public void testDictIdSeeker()
     Assert.assertEquals(-1, dictIdSeeker.seek(5));
   }
 
-  @Test(expected = IllegalArgumentException.class)
-  public void testCloser() throws Exception

Review comment:
       Why does this test need to be deleted?

##########
File path: processing/src/test/java/org/apache/druid/segment/virtual/ExpressionSelectorsTest.java
##########
@@ -231,6 +252,80 @@ public void testCoerceExprToValue()
     );
   }
 
+  @Test
+  public void testIncrementIndexStringSelector() throws IndexSizeExceededException

Review comment:
       Is this the regression test? If so could you add a comment that this is a regression test for whichever the issue number is?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org