Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/09/17 14:46:54 UTC

[GitHub] [druid] repnop opened a new issue #10403: Kafka ingestion sometimes produces rows that aren't aggregated

repnop opened a new issue #10403:
URL: https://github.com/apache/druid/issues/10403


   ### Affected Version
   
   0.18.1
   
   ### Description
   
   Hi, this problem seems similar to what I saw in #8276, but unfortunately I couldn't work out what the solution was in that case, and the problem is affecting my ability to produce reports that are quick and accurate. We have a Kafka ingestion task (data is pushed onto the topic every 30s) that occasionally produces more than one row for the same dimension value. In my case the values aren't duplicated; they fail to be aggregated and instead appear as two separate rows. That throws off any aggregations in queries unless I use a `longMax`, which in turn hinders my ability to do other types of aggregation, because `longSum` produces extremely large values, but only some of the time.
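
As a minimal sketch of the symptom (all rows and values below are hypothetical, not taken from the affected datasource, and this is plain Python rather than Druid's query API), the following models two rows that share a timestamp and dimension value and shows why a sum-style and a max-style aggregation disagree:

```python
from collections import defaultdict

# Hypothetical rows as (timestamp, dimension, value); illustrative only.
rows = [
    ("2020-09-17T14:00:00Z", "a", 100),
    ("2020-09-17T14:00:00Z", "a", 120),  # same key, stored as a separate row
    ("2020-09-17T14:00:00Z", "b", 50),
]


def aggregate(rows, combine):
    """Group rows by (timestamp, dimension) and fold each group's values with `combine`."""
    grouped = defaultdict(list)
    for ts, dim, value in rows:
        grouped[(ts, dim)].append(value)
    return {key: combine(values) for key, values in grouped.items()}


sum_result = aggregate(rows, sum)  # analogous to a sum-style aggregation
max_result = aggregate(rows, max)  # analogous to a max-style aggregation

for key in sorted(sum_result):
    print(key, "sum:", sum_result[key], "max:", max_result[key])
```

Only the key that has two stored rows differs between the two results, which matches the "only some of the time" behaviour described above.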
   
   Example screenshot of issue (dimension value excluded here):
   ![image](https://user-images.githubusercontent.com/24203105/93486466-3852ea00-f8d2-11ea-94e8-71f48276acd9.png)
   
   In the referenced issue I did see that compacting segments should aggregate these values, and I tried that, but I don't believe it's a workable solution here: this is a real-time data set that I need for visualization, so waiting for segment compaction to happen is a non-starter. I also can't remember whether it even solved the issue when I tried it, though it's quite possible I wasn't doing it correctly.
   
   Any help would be greatly appreciated, thanks!
   




