You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/08/25 19:43:32 UTC

[GitHub] [druid] clintropolis commented on pull request #9638: refactor internal type system

clintropolis commented on pull request #9638:
URL: https://github.com/apache/druid/pull/9638#issuecomment-680232328


   >In addition to the line comments, could you discuss a bit about how you approached testing this change to make sure nothing funky is going to happen? What risks did you see and how do the existing/new tests speak to them?
   
   
   Despite touching a lot of files, I feel like this PR currently has a relatively light impact; besides the refactoring, the primary differences are that array types produced by expressions and some post aggregators are now acknowledged by `ValueType`, and that row signatures in various places have been enriched to include type information for post aggregators as well as definitely complex types for aggregators. Its more like setting the stage for future cool stuff than introducing anything dramatic at this time. 
   
   Are you more curious/worried/hesitant/etc about the inclusion of array types, or the changes of including more type information in the `RowSignature`?
   
   If it is the former, it feels safe to me after my investigation and testing and I haven't _yet_ found issue with the array types existing. Grouping will effectively be treated the same as a complex type, and so invalid. ValueType handling is most often part of either a switch or if/else chain where the caller handles explicit types, and explode for things it does not understand. In fact the only way these array types can appear right now are from the few post aggregators that can produce them, since expression selectors and `ExprType` are not currently integrated with this, but probably will be in a future patch.
   
   If it is the latter, it would be trivial to add an escape hatch to at least the row signature changes, in the form of config `druid.something.enableAdvancedTypingInformation` or something, that would allow excluding all of this new stuff from the `RowSignature` and retain the current `null` type behavior.
   
   The primary tests added are around confirming that row signatures computed for a query matches the updated expectations, since that seems like the most noticeable change. On top of this, I've done some examining of how `RowSignature` type information and `ValueType` itself is used, as well as some manual prodding with the debugger, and haven't run into any issues yet with either the new value types or the additional signature information, though I'm sure it is possible there are things I missed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org