You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/23 22:27:31 UTC

[GitHub] [druid] suneet-s opened a new pull request #9760: Fix potential NPEs in joins

suneet-s opened a new pull request #9760:
URL: https://github.com/apache/druid/pull/9760


   intelliJ reported issues with potential NPEs. This was first hit in testing
   with a filter being pushed down to the left hand table when joining against
   an indexed table.
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <hr>
   
   WIP - there are over 600 issues reported within the druid-processing sub-module. This patch cleans up the issues reported under the joins sub-package. This PR still needs to add tests for these issues.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r417047809



##########
File path: processing/src/main/java/org/apache/druid/segment/virtual/MultiValueExpressionDimensionSelector.java
##########
@@ -50,11 +50,13 @@ public MultiValueExpressionDimensionSelector(ColumnValueSelector<ExprEval> baseS
     this.baseSelector = baseSelector;
   }
 
+  @Nullable

Review comment:
       These are checked on the hot loop. I've removed them for now, it's hard to reason whether or not it's possible to return null here. I think it can't, but I'm not 100% sure so I'm reverting this change and the associated null check




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r417031506



##########
File path: processing/src/main/java/org/apache/druid/segment/virtual/MultiValueExpressionDimensionSelector.java
##########
@@ -50,11 +50,13 @@ public MultiValueExpressionDimensionSelector(ColumnValueSelector<ExprEval> baseS
     this.baseSelector = baseSelector;
   }
 
+  @Nullable

Review comment:
       I'd like to do this in phases, there are about 600+ issues that intelliJ reports in the druid-processing sub module. I'll pick through more of them in my next PR. Once they are fixed, I will enable these warnings in the inspections job




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r414905804



##########
File path: processing/src/main/java/org/apache/druid/segment/join/lookup/LookupJoinMatcher.java
##########
@@ -277,7 +277,7 @@ public void matchRemainder()
     } else {
       currentIterator = Iterators.filter(
           extractor.iterable().iterator(),
-          entry -> !matchedKeys.contains(entry.getKey())
+          entry -> entry != null && !matchedKeys.contains(entry.getKey())

Review comment:
       It doesn't look like this can actually happen, I think.
   
   In theory iterator.next() can return null, but `MapLookupExtractor#iterable` (the only implementation today that does not throw an exception) returns `map.entrySet()` which should never contain a null element.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r416992856



##########
File path: processing/src/main/java/org/apache/druid/segment/join/lookup/LookupJoinMatcher.java
##########
@@ -277,7 +277,7 @@ public void matchRemainder()
     } else {
       currentIterator = Iterators.filter(
           extractor.iterable().iterator(),
-          entry -> !matchedKeys.contains(entry.getKey())
+          entry -> entry != null && !matchedKeys.contains(entry.getKey())

Review comment:
       good call. Looks like this is on a hot path too, so I've added an inspection suppression. 👍 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r414282640



##########
File path: processing/src/main/java/org/apache/druid/segment/join/lookup/LookupJoinMatcher.java
##########
@@ -277,7 +277,7 @@ public void matchRemainder()
     } else {
       currentIterator = Iterators.filter(
           extractor.iterable().iterator(),
-          entry -> !matchedKeys.contains(entry.getKey())
+          entry -> entry != null && !matchedKeys.contains(entry.getKey())

Review comment:
       When can `entry` be null?

##########
File path: processing/src/main/java/org/apache/druid/segment/join/PossiblyNullDimensionSelector.java
##########
@@ -138,7 +138,8 @@ public int lookupId(@Nullable String name)
       // id 0 is always null for this selector impl.
       return 0;
     } else {
-      return baseSelector.idLookup().lookupId(name) + nullAdjustment;
+      IdLookup idLookup = baseSelector.idLookup();
+      return (idLookup == null ? 0 : idLookup.lookupId(name)) + nullAdjustment;

Review comment:
       It should probably fail if `idLooup` = null.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r417031506



##########
File path: processing/src/main/java/org/apache/druid/segment/virtual/MultiValueExpressionDimensionSelector.java
##########
@@ -50,11 +50,13 @@ public MultiValueExpressionDimensionSelector(ColumnValueSelector<ExprEval> baseS
     this.baseSelector = baseSelector;
   }
 
+  @Nullable

Review comment:
       I'd like to do this in phases, there are about 600+ issues that intelliJ reports in the druid-processing sub module. I'll pick through more of them in my next PR




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r416986933



##########
File path: processing/src/main/java/org/apache/druid/segment/join/PossiblyNullDimensionSelector.java
##########
@@ -138,7 +138,8 @@ public int lookupId(@Nullable String name)
       // id 0 is always null for this selector impl.
       return 0;
     } else {
-      return baseSelector.idLookup().lookupId(name) + nullAdjustment;
+      IdLookup idLookup = baseSelector.idLookup();
+      return (idLookup == null ? 0 : idLookup.lookupId(name)) + nullAdjustment;

Review comment:
       Added an assert statement with an explanation of why it should never be null in here




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jon-wei commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
jon-wei commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r416975593



##########
File path: processing/src/main/java/org/apache/druid/segment/join/JoinConditionAnalysis.java
##########
@@ -79,7 +79,7 @@ private JoinConditionAnalysis(
                                                                 .allMatch(expr -> expr.isLiteral() && expr.eval(
                                                                     ExprUtils.nilBindings()).asBoolean());
     canHashJoin = nonEquiConditions.stream().allMatch(Expr::isLiteral);
-    rightKeyColumns = getEquiConditions().stream().map(Equality::getRightColumn).distinct().collect(Collectors.toSet());
+    rightKeyColumns = getEquiConditions().stream().map(Equality::getRightColumn).collect(Collectors.toSet());

Review comment:
       What's the impact of removing the `distinct()` call (is it just unnecessary since they're being collected to a set already?)

##########
File path: processing/src/main/java/org/apache/druid/segment/join/PossiblyNullDimensionSelector.java
##########
@@ -138,7 +138,8 @@ public int lookupId(@Nullable String name)
       // id 0 is always null for this selector impl.
       return 0;
     } else {
-      return baseSelector.idLookup().lookupId(name) + nullAdjustment;
+      IdLookup idLookup = baseSelector.idLookup();
+      return (idLookup == null ? 0 : idLookup.lookupId(name)) + nullAdjustment;

Review comment:
       A null `idLookup` there should be an error condition, since the caller should check if `idLookup()` returned null before calling `lookupId(name)` on the `PossiblyNullDimensionSelector`

##########
File path: processing/src/main/java/org/apache/druid/segment/join/lookup/LookupJoinable.java
##########
@@ -95,18 +96,23 @@ public JoinMatcher makeJoinMatcher(
       boolean allowNonKeyColumnSearch
   )
   {
+    if (!ALL_COLUMNS.contains(searchColumnName) || !ALL_COLUMNS.contains(retrievalColumnName)) {
+      return ImmutableSet.of();
+    }
     Set<String> correlatedValues;
     if (LookupColumnSelectorFactory.KEY_COLUMN.equals(searchColumnName)) {
       if (LookupColumnSelectorFactory.KEY_COLUMN.equals(retrievalColumnName)) {
         correlatedValues = ImmutableSet.of(searchColumnValue);
       } else {
-        correlatedValues = ImmutableSet.of(extractor.apply(searchColumnName));
+        // This should not happen in practice because the column to be joined on must be a key.
+        correlatedValues = Collections.singleton(extractor.apply(searchColumnValue));

Review comment:
       :+1:




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r416992856



##########
File path: processing/src/main/java/org/apache/druid/segment/join/lookup/LookupJoinMatcher.java
##########
@@ -277,7 +277,7 @@ public void matchRemainder()
     } else {
       currentIterator = Iterators.filter(
           extractor.iterable().iterator(),
-          entry -> !matchedKeys.contains(entry.getKey())
+          entry -> entry != null && !matchedKeys.contains(entry.getKey())

Review comment:
       good call. Looks like this is on a hot path too, so I've added an inspection suppression instead of an assert not null. 👍 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r417587632



##########
File path: processing/src/main/java/org/apache/druid/segment/virtual/MultiValueExpressionDimensionSelector.java
##########
@@ -50,11 +50,13 @@ public MultiValueExpressionDimensionSelector(ColumnValueSelector<ExprEval> baseS
     this.baseSelector = baseSelector;
   }
 
+  @Nullable

Review comment:
       Oh yeah, it doesn't seem possible to return null since `ColumnValueSelector` returns an `ExprEval`. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r414166093



##########
File path: processing/src/main/java/org/apache/druid/segment/join/PossiblyNullDimensionSelector.java
##########
@@ -138,7 +138,8 @@ public int lookupId(@Nullable String name)
       // id 0 is always null for this selector impl.
       return 0;
     } else {
-      return baseSelector.idLookup().lookupId(name) + nullAdjustment;
+      IdLookup idLookup = baseSelector.idLookup();
+      return (idLookup == null ? 0 : idLookup.lookupId(name)) + nullAdjustment;

Review comment:
       should this return 0 if the `idLookup` does not exist?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r417032580



##########
File path: processing/src/main/java/org/apache/druid/segment/virtual/MultiValueExpressionDimensionSelector.java
##########
@@ -50,11 +50,13 @@ public MultiValueExpressionDimensionSelector(ColumnValueSelector<ExprEval> baseS
     this.baseSelector = baseSelector;
   }
 
+  @Nullable

Review comment:
       Oh I meant for this method. This method is technically private and only used in this class. Some of the callers are still not checking nulls, for example [here](https://github.com/apache/druid/pull/9760/files/79eb23832004e90ed498aa1448d3bc9513cd6b50#diff-2ae123ea2c178ff9151ddbb31ed5f2b8R108).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r416984862



##########
File path: processing/src/main/java/org/apache/druid/segment/join/JoinConditionAnalysis.java
##########
@@ -79,7 +79,7 @@ private JoinConditionAnalysis(
                                                                 .allMatch(expr -> expr.isLiteral() && expr.eval(
                                                                     ExprUtils.nilBindings()).asBoolean());
     canHashJoin = nonEquiConditions.stream().allMatch(Expr::isLiteral);
-    rightKeyColumns = getEquiConditions().stream().map(Equality::getRightColumn).distinct().collect(Collectors.toSet());
+    rightKeyColumns = getEquiConditions().stream().map(Equality::getRightColumn).collect(Collectors.toSet());

Review comment:
       should be a no-op. intelliJ suggested it's un-necessary since we're collecting to a set




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9760: Fix potential NPEs in joins

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9760:
URL: https://github.com/apache/druid/pull/9760#discussion_r416985934



##########
File path: processing/src/main/java/org/apache/druid/segment/join/lookup/LookupJoinMatcher.java
##########
@@ -277,7 +277,7 @@ public void matchRemainder()
     } else {
       currentIterator = Iterators.filter(
           extractor.iterable().iterator(),
-          entry -> !matchedKeys.contains(entry.getKey())
+          entry -> entry != null && !matchedKeys.contains(entry.getKey())

Review comment:
       Yeah. It also doesn't seem reasonable that a "row" can be null. I think it's a false warning because of the `Predicate` interface of Guava. It would be better to suppress warning instead of checking null.

##########
File path: processing/src/main/java/org/apache/druid/segment/virtual/MultiValueExpressionDimensionSelector.java
##########
@@ -50,11 +50,13 @@ public MultiValueExpressionDimensionSelector(ColumnValueSelector<ExprEval> baseS
     this.baseSelector = baseSelector;
   }
 
+  @Nullable

Review comment:
       Does all callers need to check the returned eval? Seems there are a couple of missing places.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org