Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/28 12:38:18 UTC

[GitHub] [druid] Synforge commented on issue #9460: Issue with result from CONCAT expression when using Kafka streaming ingestion.

Synforge commented on issue #9460: Issue with result from CONCAT expression when using Kafka streaming ingestion.
URL: https://github.com/apache/druid/issues/9460#issuecomment-605441813
 
 
   I've done a little digging on this, and the bug appears to apply to all string dimension columns in the IncrementalIndexStorageAdapter. Regardless of whether a multi-value was ever inserted into a column, this storage adapter reports every string column as multi-value.
   
   For the example above, a segment metadata query run before the data has been persisted returns this:
   
   `"currency": {
       "cardinality": 2,
       "errorMessage": null,
       "hasMultipleValues": true,
       "maxValue": "GBP",
       "minValue": "EUR",
       "size": 0,
       "type": "STRING"
   }`
   
   The persisted data, by contrast, correctly returns hasMultipleValues as false. This seems to cause inconsistent results when any string function is applied to a dimension column, depending on whether the data has been persisted yet. So I think the problem is bigger than just the report above.
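
   To make the mismatch easier to spot, here is a minimal Python sketch (not part of Druid) that builds a segmentMetadata query body and lists the columns whose hasMultipleValues flag differs between segments. The helper names and the datasource and interval in the usage note are hypothetical; the query fields used (queryType, dataSource, intervals, merge) are the standard segmentMetadata ones.

```python
def segment_metadata_query(datasource, interval):
    """Build a segmentMetadata query body, to be POSTed to the Broker at /druid/v2."""
    return {
        "queryType": "segmentMetadata",
        "dataSource": datasource,
        "intervals": [interval],
        # Keep one result per segment so the flags can be compared side by side.
        "merge": False,
    }


def inconsistent_multi_value_columns(results):
    """Return the sorted names of columns whose hasMultipleValues differs across segments.

    `results` is the decoded JSON response: a list of per-segment objects,
    each carrying a "columns" map keyed by column name.
    """
    seen = {}
    for segment in results:
        for name, column in segment.get("columns", {}).items():
            seen.setdefault(name, set()).add(bool(column.get("hasMultipleValues")))
    # A column is inconsistent if both True and False were reported for it.
    return sorted(name for name, flags in seen.items() if len(flags) > 1)
```

   Usage would look something like `inconsistent_multi_value_columns(response)` where `response` is the decoded result of `segment_metadata_query("my_datasource", "2020-03-01/2020-04-01")`. For the case described here I would expect `currency` to be listed while the unpersisted and persisted segments coexist.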
   
   I verified this by changing the following line to return false, after which the query correctly returns a single string value instead of an array. However, I'm aware this may break multi-value columns on ingestion.
   
   https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndexStorageAdapter.java#L166
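
   Rather than hard-coding either value, one possible direction is to track whether a column has actually seen a multi-value row. The following is only a toy Python model of that idea, not Druid code: the class and method names are hypothetical, and I have not checked whether the incremental index can support this cheaply.

```python
class StringColumnCapabilities:
    """Toy stand-in for per-column metadata; this is not Druid's actual class."""

    def __init__(self):
        # Start out single-valued instead of assuming multi-value up front.
        self.has_multiple_values = False

    def observe(self, row_value):
        """Record one ingested value for this column."""
        # A list or tuple with more than one element marks the column as
        # multi-valued. The flag never flips back to False afterwards.
        if isinstance(row_value, (list, tuple)) and len(row_value) > 1:
            self.has_multiple_values = True
```

   With this model, a column that only ever receives values such as "GBP" or "EUR" stays single-valued, and it becomes multi-valued the first time a row such as ["GBP", "EUR"] arrives.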
   
   I'm happy to look into this further if anyone can offer advice on how to tackle the problem. @gianm, I believe you wrote some of this code, so I'm hoping you might be able to offer some pointers.
   
   Thanks
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
