Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/05/26 08:42:06 UTC

[GitHub] [druid] sasounda opened a new issue #5577: historical miscalculating remaining disk capacity - segment too large exception causing segments to not load

sasounda opened a new issue #5577:
URL: https://github.com/apache/druid/issues/5577


   We saw this error on a Druid historical:
   ```
   Caused by: com.metamx.common.ISE: Segment[timeseries_dogstatsd_counter_2018-04-04T16:00:00.000Z_2018-04-04T17:00:00.000Z_2018-04-04T16:00:00.000Z_1528210995:152,770,889] too large for storage[/var/tmp/druid/indexCache:22,010].
   ```
   After it appeared, the historical node stopped loading new segments from realtime, and the realtime nodes started accumulating segments.
   
   Our `maxSize` settings are as follows, and we had enough free disk space.
   ```
   druid.server.maxSize=882159184076
   druid.segmentCache.locations=[{"path":"/var/tmp/druid/indexCache","maxSize":882159184076}]
   ```
   
   Restarting the Druid historical fixes the issue. We suspect that something is going wrong with how Druid calculates the available size, i.e., the `22,010` in the error above.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] tanisdlj commented on issue #5577: historical miscalculating remaining disk capacity - segment too large exception causing segments to not load

Posted by GitBox <gi...@apache.org>.
tanisdlj commented on issue #5577:
URL: https://github.com/apache/druid/issues/5577#issuecomment-949774997


   This is happening in 0.20.1 too.




[GitHub] [druid] egor-ryashin commented on issue #5577: historical miscalculating remaining disk capacity - segment too large exception causing segments to not load

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on issue #5577:
URL: https://github.com/apache/druid/issues/5577#issuecomment-848585038


   I can confirm that this is still present in Druid 0.21.
   
   I see logs like:
   ```
   [WARN ] 2021-05-25 14:39:20.718 [SimpleDataSegmentChangeHandler-2] StorageLocation - Segment[<segment_name>:260,453,366] too large for storage/opt/druid/var/segmentCache:7,621,518. Check your druid.segmentCache.locations maxSize param
   ``` 
   To avoid confusion: 7,621,518 is the number of bytes Druid believes is available in the storage location.
   You can check this at `org/apache/druid/segment/loading/StorageLocation.java:155`.
   Meanwhile, the disk has at least several GB available:
   ```
   df -h
   ...
   Avail Use% Mounted on
   17G  94% /opt/druid/var
   ```
   Metrics agree with `df` (they show the correct used size). Note that the metrics are reported from a different in-memory variable; see `org/apache/druid/server/metrics/HistoricalMetricsMonitor.java:88`.
   
   Right now I cannot find a bug in the source code. I can only speculate that some exception goes uncaught and the available size is not decreased properly, but the logs do not show any uncaught exceptions.
   
   I propose logging the variable `org.apache.druid.segment.loading.StorageLocation#availableSizeBytes`, or emitting it as a metric, so we can track when the issue starts. Other debugging ideas are welcome.
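   
   As a rough illustration of that proposal, here is a minimal sketch of an in-memory size counter that logs its available bytes on every change. It is not Druid's `StorageLocation`: the class `AvailableSizeSketch`, its methods `reserve` and `release`, and the log format are all hypothetical, and only the name `availableSizeBytes` is borrowed from the variable mentioned above.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   
   // Hypothetical stand-in for a segment cache location; NOT Druid's StorageLocation.
   class AvailableSizeSketch {
       private final long maxSizeBytes;
       private final Map<String, Long> reserved = new HashMap<>();
       private long usedBytes = 0;
   
       AvailableSizeSketch(long maxSizeBytes) {
           this.maxSizeBytes = maxSizeBytes;
       }
   
       // The in-memory view of free space, independent of what the disk reports.
       synchronized long availableSizeBytes() {
           return maxSizeBytes - usedBytes;
       }
   
       // Reserve space for a segment and log the counter so drift shows up in the logs.
       synchronized boolean reserve(String segmentId, long sizeBytes) {
           if (reserved.containsKey(segmentId)) {
               return true; // already accounted for, do not count it twice
           }
           if (sizeBytes > availableSizeBytes()) {
               log("REJECT", segmentId, sizeBytes);
               return false;
           }
           reserved.put(segmentId, sizeBytes);
           usedBytes += sizeBytes;
           log("RESERVE", segmentId, sizeBytes);
           return true;
       }
   
       // Give the space back; a code path that skips this call leaves the counter too low.
       synchronized void release(String segmentId) {
           Long sizeBytes = reserved.remove(segmentId);
           if (sizeBytes != null) {
               usedBytes -= sizeBytes;
               log("RELEASE", segmentId, sizeBytes);
           }
       }
   
       private void log(String event, String segmentId, long sizeBytes) {
           System.out.printf("%s segment=%s size=%,d available=%,d%n",
                   event, segmentId, sizeBytes, availableSizeBytes());
       }
   
       public static void main(String[] args) {
           AvailableSizeSketch location = new AvailableSizeSketch(1_000L);
           location.reserve("segment-a", 600L);
           // If a failed load skipped this release, "available" would stay at 400
           // even though the bytes are free on disk.
           location.release("segment-a");
           location.reserve("segment-b", 900L);
       }
   }
   ```
   
   With a log line like this on every reserve and release, the first point where the logged value diverges from what `df` reports would narrow down which code path stops giving space back.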
   




[GitHub] [druid] abhishekagarwal87 commented on issue #5577: historical miscalculating remaining disk capacity - segment too large exception causing segments to not load

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on issue #5577:
URL: https://github.com/apache/druid/issues/5577#issuecomment-950754165


   This sounds like a different problem from the one my patch fixed. My patch fixes the case where the historical over-estimates available disk capacity, causing disks to overflow. The problem here sounds like Druid is under-estimating available capacity.
   @tanisdlj what do your maxSize and segment cache configuration look like? They may offer a clue.




[GitHub] [druid] tanisdlj commented on issue #5577: historical miscalculating remaining disk capacity - segment too large exception causing segments to not load

Posted by GitBox <gi...@apache.org>.
tanisdlj commented on issue #5577:
URL: https://github.com/apache/druid/issues/5577#issuecomment-950681580


   Thanks! I will try and report back :)




[GitHub] [druid] a2l007 commented on issue #5577: historical miscalculating remaining disk capacity - segment too large exception causing segments to not load

Posted by GitBox <gi...@apache.org>.
a2l007 commented on issue #5577:
URL: https://github.com/apache/druid/issues/5577#issuecomment-949891663


   @egor-ryashin @tanisdlj Would it be possible for you to test this on 0.22? We had something similar in one of our clusters, and patch #10884 worked for us.




[GitHub] [druid] tanisdlj commented on issue #5577: historical miscalculating remaining disk capacity - segment too large exception causing segments to not load

Posted by GitBox <gi...@apache.org>.
tanisdlj commented on issue #5577:
URL: https://github.com/apache/druid/issues/5577#issuecomment-950757669


   @abhishekagarwal87 you can check my config here; I just posted it:
   https://github.com/apache/druid/issues/11840

