You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/06/03 11:19:01 UTC

[GitHub] [druid] clintropolis opened a new pull request #11327: fix a bugs related to SQL type inference return type nullability

clintropolis opened a new pull request #11327:
URL: https://github.com/apache/druid/pull/11327


   ### Description
   This PR fixes the return type inference of several SQL operators which were incorrectly reporting their return type as not nullable, when in fact this was dependent on the arguments to the operator. A new method, `returnTypeCascadeNullable`, has been added to `OperatorConversions` builder type, which allows defining operators with a return type that is only nullable if any of the operands are nullable, which is many of them. I only evaluated all callers of `returnTypeNonNull`, and didn't look closely if some of the `returnTypeNullable` should be switched to this new method.
   
   This fixes bugs like in the test case added to `CalciteQueryTests`, which covers some of the functions which have been fixed and illustrates the fix (since these operators were previously marked as non-null, the count aggregator would be effectively translated to `count(*)`). I am unsure of what other bugs might have been happening besides this, but it is quite possible this PR fixes a handful of other issues with null handing in SQL compatible mode since some inappropriate optimizations might have been happening.
   
   Along the way, I also fixed null handling bugs in `repeat` which would've had a null pointer exception, and `timestamp_shift` which would ignore numeric nulls in SQL compatible mode and just shift.
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #11327: fix a bugs related to SQL type inference return type nullability

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #11327:
URL: https://github.com/apache/druid/pull/11327#issuecomment-861557741


   > As far as the docs go, I think it would be worth trying to fill out the null handling of all functions, but I would maybe rather do this as a follow-up since there are a lot of them if that is ok. I imagine something similar to what #11188 added for aggregator functions, but I'm not yet sure if that format makes the most sense for the other functions, or if there is a more concise way to describe general behavior unless otherwise specified, need to think about it a bit.
   
   I think the way PostgreSQL does it is reasonable for a single-page doc style: https://www.postgresql.org/docs/13/functions-string.html. It describes null handling behavior only for functions where it's non-obvious (like concat) or where there's something special going on (like quote_literal and quote_nullable). It keeps the noise down.
   
   One day, we might want to consider a multi-page doc style, though, with a whole page for each function. Then we can go into all the details, give usage examples, etc. Microsoft docs are like that: https://docs.microsoft.com/en-us/sql/t-sql/functions/concat-transact-sql?view=sql-server-ver15. But I wouldn't worry about that as part of the null handling documentation stuff.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on pull request #11327: fix a bugs related to SQL type inference return type nullability

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #11327:
URL: https://github.com/apache/druid/pull/11327#issuecomment-861073716


   >I think in light of what other DBs do, this is fine, so I'd just suggest updating the ConcatOperatorConversion and the docs appropriately.
   
   I've updated `ConcatOperatorConversion`, missed that one because it wasn't using the builder.
   
   As far as the docs go, I think it would be worth trying to fill out the null handling of all functions, but I would maybe rather do this as a follow-up since there are a lot of them if that is ok. I imagine something similar to what #11188 added for aggregator functions, but I'm not yet sure if that format makes the most sense for the other functions, or if there is a more concise way to describe general behavior unless otherwise specified, need to think about it a bit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm edited a comment on pull request #11327: fix a bugs related to SQL type inference return type nullability

Posted by GitBox <gi...@apache.org>.
gianm edited a comment on pull request #11327:
URL: https://github.com/apache/druid/pull/11327#issuecomment-860951542


   > Generally LGTM. Could you take a look at ConcatOperatorConversion too? I think it should also be "cascade nullable".
   
   Btw, surveying what else is out there:
   
   - PostgreSQL CONCAT ignores nulls, and `||` returns null if any argument is null: https://www.postgresql.org/docs/13/functions-string.html, http://sqlfiddle.com/#!17/5821c5/2
   - MySQL CONCAT returns null if any argument is null, and doesn't seem to support the `||` operator: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_concat, http://sqlfiddle.com/#!9/180f8e/1
   - BigQuery CONCAT returns null if any argument is null; it mentions support for `||` but doesn't say how it will handle nulls: https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions
   
   Today, our ConcatFunc is implemented to return null if any argument is null, and we use that ConcatFunc for both CONCAT and `||`. I think in light of what other DBs do, this is fine, so I'd just suggest updating the ConcatOperatorConversion and the docs appropriately.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #11327: fix a bugs related to SQL type inference return type nullability

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #11327:
URL: https://github.com/apache/druid/pull/11327#issuecomment-860951542


   > Generally LGTM. Could you take a look at ConcatOperatorConversion too? I think it should also be "cascade nullable".
   
   Or maybe not:
   
   - PostgreSQL CONCAT ignores nulls, and `||` returns null if any argument is null: https://www.postgresql.org/docs/13/functions-string.html, http://sqlfiddle.com/#!17/5821c5/2
   - MySQL CONCAT returns null if any argument is null, and doesn't seem to support the `||` operator: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_concat, http://sqlfiddle.com/#!9/180f8e/1
   - BigQuery CONCAT returns NULL if any input argument is NULL; it mentions support for `||` but doesn't say how it will handle nulls: https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm merged pull request #11327: fix a bugs related to SQL type inference return type nullability

Posted by GitBox <gi...@apache.org>.
gianm merged pull request #11327:
URL: https://github.com/apache/druid/pull/11327


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm edited a comment on pull request #11327: fix a bugs related to SQL type inference return type nullability

Posted by GitBox <gi...@apache.org>.
gianm edited a comment on pull request #11327:
URL: https://github.com/apache/druid/pull/11327#issuecomment-860951542


   > Generally LGTM. Could you take a look at ConcatOperatorConversion too? I think it should also be "cascade nullable".
   
   Btw, surveying what else is out there:
   
   - PostgreSQL CONCAT ignores nulls, and `||` returns null if any argument is null: https://www.postgresql.org/docs/13/functions-string.html, http://sqlfiddle.com/#!17/5821c5/2
   - MySQL CONCAT returns null if any argument is null, and seems to use the `||` operator for something other than concatenation: https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_concat, http://sqlfiddle.com/#!9/180f8e/1
   - BigQuery CONCAT returns null if any argument is null; it mentions support for `||` but doesn't say how it will handle nulls: https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions
   
   Today, our ConcatFunc is implemented to return null if any argument is null, and we use that ConcatFunc for both CONCAT and `||`. I think in light of what other DBs do, this is fine, so I'd just suggest updating the ConcatOperatorConversion and the docs appropriately.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org