Posted to users@qpid.apache.org by Pavel Moravec <pm...@redhat.com> on 2014/04/18 15:39:16 UTC

Purging a big queue backed by linear store postpones journal files to be returned to EFP

Hi all,
we identified some sub-optimal behaviour in the way the linear store returns files to the Empty File Pool (EFP) in one use case. I would like feedback from the community on whether it would be valuable to implement a better way of returning files to the EFP.

Currently, the only way a journal file is returned to the EFP during broker runtime is a check made after every 100th dequeue on that queue/journal: while the _oldest_ journal file has no valid enqueue record, it is returned to the EFP and removed from the journal. This has one surprising consequence when purging a queue with many messages:

- you enqueue e.g. 1000099 messages to journal files numbered from 1 to, let's say, 100
- you dequeue all of them:
  - dequeueing a message means writing a dequeue record to the journal and discarding the enqueue record. So dequeueing 1000000 messages creates, let's say, 10 new journal files (numbered 101 to 110) filled with dequeue records only
  - every 100th dequeue checks whether the oldest journal file can be returned to the EFP. The check after the millionth dequeue sees that the 100th journal file still has 99 remaining enqueues, so the file is not returned to the EFP. The last 99 dequeues do not trigger another check, so the next 9 or 10 journal files holding just dequeue records are not checked at all
- at the end, we have an empty queue backed by 10 journal files, while one file would be sufficient.
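
The steps above can be reproduced with a small simulation. This is only a sketch of the behaviour as described in this mail, not the linear store code: the class `ToyJournal`, its fields, and the record and file sizes (an enqueue record costing 10 units, a dequeue record 1 unit, a file holding 1000 units) are all made up for illustration, and the numbers are scaled down to 9999 messages. The model also assumes the file currently being written is never returned.

```python
from collections import deque


class ToyJournal:
    """Hypothetical model of one journal; not the real linear store."""

    def __init__(self, file_capacity=1000, enq_size=10, deq_size=1,
                 check_interval=100):
        self.file_capacity = file_capacity
        self.enq_size = enq_size
        self.deq_size = deq_size
        self.check_interval = check_interval
        self.files = deque()      # oldest file on the left
        self.next_file_id = 1
        self.location = {}        # message id -> file holding its enqueue
        self.dequeues = 0
        self.returned_to_efp = 0

    def _write(self, size):
        # Start a new journal file when the record does not fit.
        if not self.files or self.files[-1]["used"] + size > self.file_capacity:
            self.files.append({"id": self.next_file_id, "used": 0, "live": 0})
            self.next_file_id += 1
        current = self.files[-1]
        current["used"] += size
        return current

    def enqueue(self, msg_id):
        f = self._write(self.enq_size)
        f["live"] += 1
        self.location[msg_id] = f

    def dequeue(self, msg_id):
        # A dequeue writes its own record and invalidates the enqueue record.
        self._write(self.deq_size)
        self.location.pop(msg_id)["live"] -= 1
        self.dequeues += 1
        # The check runs only after every check_interval-th dequeue.
        if self.dequeues % self.check_interval == 0:
            self._return_oldest_empty_files()

    def _return_oldest_empty_files(self):
        # Assumption: the file being written (the last one) is always kept.
        while len(self.files) > 1 and self.files[0]["live"] == 0:
            self.files.popleft()
            self.returned_to_efp += 1


def simulate_purge(messages=9999):
    journal = ToyJournal()
    for m in range(messages):
        journal.enqueue(m)
    for m in range(messages):
        journal.dequeue(m)
    files_after_purge = len(journal.files)
    # One more enqueue+dequeue makes the dequeue count a multiple of 100.
    journal.enqueue(messages)
    journal.dequeue(messages)
    return files_after_purge, len(journal.files)


if __name__ == "__main__":
    after_purge, after_next_check = simulate_purge()
    print("files left after purge:", after_purge)
    print("files left after the next check:", after_next_check)
```

With these made-up sizes the empty queue keeps 11 files after the purge (the last enqueue file plus 10 dequeue-only files), and a single further enqueue+dequeue brings it down to 1. The exact counts depend entirely on the assumed record sizes.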

Note that the above is _not_ a journal file leak; the store only postpones moving the files to the EFP until the next 100th dequeue occurs. So it only increases disk space utilization somewhat.

Also note that the use case assumes the journal gets many dequeue events in a row (with no or very few enqueues in between) and that no enqueue+dequeue activity follows for a longer time (in the scenario above, the next dequeue would move 10 journal files to the EFP). In other cases, the return of no journal file (or at most one) to the EFP is postponed for some time.

It would be possible to implement a time-based trigger that, for example, checks all journal files in all journals to see whether they can be returned to the EFP. The question is how valuable it would be (compared to the complexity it adds to the code). My own attitude is "disk space is cheap, don't implement it", but if somebody has a solid use case where such a feature would be much appreciated, please respond.
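
To make the idea concrete, here is a rough sketch of what such a time-based trigger could look like. It is not a proposal for the actual broker code: `sweep`, `PeriodicSweeper`, the dict-of-deques layout, and the `live_enqueues` field are hypothetical names for illustration. The sketch assumes the sweep applies the same oldest-file-first rule as the existing check, keeps the last file of each journal because it is presumed to be the one being written, and relies on a lock shared with the enqueue/dequeue path.

```python
import threading
from collections import deque


def sweep(journals, efp):
    """Apply the oldest-file rule to every journal.

    journals: dict mapping a queue name to a deque of file dicts
              (oldest first), each with a "live_enqueues" count.
    efp:      list collecting the files handed back to the pool.
    Returns the number of files moved.
    """
    moved = 0
    for files in journals.values():
        # Assumption: the last file is the current write file and stays.
        while len(files) > 1 and files[0]["live_enqueues"] == 0:
            efp.append(files.popleft())
            moved += 1
    return moved


class PeriodicSweeper(threading.Thread):
    """Runs sweep() every interval_seconds until stop() is called."""

    def __init__(self, journals, efp, lock, interval_seconds):
        super().__init__(daemon=True)
        self.journals = journals
        self.efp = efp
        self.lock = lock
        self.interval_seconds = interval_seconds
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop_event.wait(self.interval_seconds):
            with self.lock:
                sweep(self.journals, self.efp)

    def stop(self):
        self._stop_event.set()
```

A broker-like caller would create one `PeriodicSweeper` with a lock it also holds while writing to the journals, call `start()` at startup, and call `stop()` at shutdown. The extra thread and locking are the kind of added complexity mentioned above.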


Kind regards,
Pavel Moravec


