Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/06/30 13:03:55 UTC

[GitHub] [incubator-druid] acdn-mpreston opened a new issue #8000: Datasource availability mis-represented due to 0 row segments not cleaned up after compaction

acdn-mpreston opened a new issue #8000: Datasource availability mis-represented due to 0 row segments not cleaned up after compaction
URL: https://github.com/apache/incubator-druid/issues/8000
 
 
   
   
   ### Affected Version
   
   0.14.0, 0.14.2
   
   ### Description
   
   After tracking the 'druid/coordinator/v1/loadstatus' API over several weeks, I noticed long periods when our datasources were reported as less than 100% available, even though I could query all of their data successfully. Here is an example response:
   
   {
       "npav-ts-metrics-15m": 100,
       "npav-ts-metrics": 97.31543624161074
   }
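
A minimal sketch of how a response like this could be checked in a monitoring script. The function name `under_loaded` and the `threshold` parameter are hypothetical and not part of Druid; the response is hard-coded here rather than fetched from the coordinator.

```python
import json


def under_loaded(load_status, threshold=100.0):
    """Return the datasources whose reported load percentage is below threshold.

    `load_status` is the parsed JSON object returned by the loadstatus API:
    a mapping of datasource name to percentage of segments loaded.
    """
    return {name: pct for name, pct in load_status.items() if pct < threshold}


# The example response shown above, parsed from its JSON text.
response_body = '{"npav-ts-metrics-15m": 100, "npav-ts-metrics": 97.31543624161074}'
status = json.loads(response_body)

# Only "npav-ts-metrics" is reported below 100%.
flagged = under_loaded(status)
print(flagged)
```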
   
   When I look at the "missing" segments, I see the following in the unified-console:
   
   ![image](https://user-images.githubusercontent.com/33030320/60396911-483bae00-9b15-11e9-859b-c311b79fec48.png)
   
   This shows that the data is there and has been compacted, but there are also some '0-row' segments for the same time period that have not been cleaned up. These segments count against the reported datasource availability even though the data is queryable.
   
   Eventually these 0-row segments get cleaned up, but new ones take their place, so the datasource is very rarely 100% available from the POV of the 'druid/coordinator/v1/loadstatus' API. Maybe these segments could be filtered out when handling the API so that it represents the status of the datasource more accurately?
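
To illustrate the suggestion, here is a rough sketch of the proposed filtering. It assumes availability is simply loaded segments divided by total segments, which may not match how the coordinator actually computes it. The segment records, their `num_rows` and `loaded` fields, and the counts (145 loaded segments plus 4 unloaded 0-row segments) are made up for illustration; they were chosen only because they give a percentage close to the one in the example response.

```python
def availability(segments, ignore_empty=False):
    """Percentage of segments that are loaded.

    Each segment is a dict with hypothetical keys:
      "num_rows": number of rows in the segment
      "loaded":   True if the segment is loaded and queryable
    With ignore_empty=True, 0-row segments are excluded before counting.
    """
    if ignore_empty:
        segments = [s for s in segments if s["num_rows"] > 0]
    if not segments:
        # Nothing left to load, so treat the datasource as fully available.
        return 100.0
    loaded = sum(1 for s in segments if s["loaded"])
    return 100.0 * loaded / len(segments)


# Illustrative data: compacted segments that are loaded, plus leftover
# 0-row segments for the same intervals that are not loaded.
segments = (
    [{"num_rows": 1000, "loaded": True}] * 145
    + [{"num_rows": 0, "loaded": False}] * 4
)

print(availability(segments))                     # 0-row segments counted
print(availability(segments, ignore_empty=True))  # 0-row segments filtered out
```

With the 0-row segments counted, the result is roughly 97.3%; with them filtered out, every remaining segment is loaded and the result is 100%.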
