You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "clintropolis (via GitHub)" <gi...@apache.org> on 2023/04/16 06:59:48 UTC

[GitHub] [druid] clintropolis opened a new pull request, #14096: fix bug filtering nested columns with expression filters

clintropolis opened a new pull request, #14096:
URL: https://github.com/apache/druid/pull/14096

   ### Description
   This PR fixes a bug when trying to filter `COMPLEX<json>` columns where omission of setting the `isFilterable` flag of `ColumnCapabilities` when reading the column was incorrectly causing the column to be treated as if it were all null values. The deserializers have been updated to set `isFilterable` to true, allow `COMPLEX<json>` columns to be used in expression filters, which can often be generated by the SQL planner for queries like the added SQL test.
   
   I'm not really certain how useful this behavior is, and it doesn't seem very intuitive, so we might want to consider changing this in the future so that `COMPLEX` columns produce some form of error when used with filters which do not support the type. It might have made sense in the past when there was not really a possible way for complex types to participate in filtering, but since expressions now support complex types it becomes relatively easy to imagine using these columns in filters when used with expressions which can extract meaningful data from the complex types (such as get the estimate from an HLL sketch, or .. extract a value from a json blob as in this case).
   
   #### Release note
   Fixed a bug when using expression filters for filtering `COMPLEX<json>` columns or values extracted from them using `JSON_VALUE` or any other nested column function, where the column would be incorrectly treated as all null values.
   
   
   <hr>
   
   This PR has:
   
   - [x] been self-reviewed.
   - [x] a release note entry in the PR description.
   - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on pull request #14096: fix bug filtering nested columns with expression filters

Posted by "abhishekagarwal87 (via GitHub)" <gi...@apache.org>.
abhishekagarwal87 commented on PR #14096:
URL: https://github.com/apache/druid/pull/14096#issuecomment-1511087133

   does it also fix the queries such as `select * from table where complex_column is not null`. Those don't work as well and they don't get planned into a query with expression filters. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis merged pull request #14096: fix bug filtering nested columns with expression filters

Posted by "clintropolis (via GitHub)" <gi...@apache.org>.
clintropolis merged PR #14096:
URL: https://github.com/apache/druid/pull/14096


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on pull request #14096: fix bug filtering nested columns with expression filters

Posted by "clintropolis (via GitHub)" <gi...@apache.org>.
clintropolis commented on PR #14096:
URL: https://github.com/apache/druid/pull/14096#issuecomment-1512102484

   >does it also fix the queries such as select * from table where complex_column is not null. Those don't work as well and they don't get planned into a query with expression filters.
   
   No, not quite, there is a bit more to fixing that because of things like https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/filter/DruidPredicateFactory.java#L37, which are treating all inputs to predicates for complex and array types as null in the non-vectorized engine. I will be looking into fixing this in the near future.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #14096: fix bug filtering nested columns with expression filters

Posted by "gianm (via GitHub)" <gi...@apache.org>.
gianm commented on PR #14096:
URL: https://github.com/apache/druid/pull/14096#issuecomment-1510380267

   The CI failure is due to https://github.com/apache/druid/pull/14099


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org