Posted to users@activemq.apache.org by Kevin Burton <bu...@spinn3r.com> on 2014/12/18 21:45:04 UTC

massive data loss after 5.10 + leveldb restart?

I’m trying to track down a bunch of concerning production bugs in ActiveMQ
that might be related.

The biggest problem I’m seeing is that on restart, it seems that we’re
losing a LARGE percentage of our messages.

We had about 7k in a queue, and on restart, it went down to almost zero.

I’m getting LevelDB messages like:

2014-12-18 14:29:36,024 | WARN  | No reader available for position:
eac7b1cad, log_infos:
{81371951320=LogInfo(/var/lib/apache-activemq/leveldb/00000012f22570d8.log,81371951320,104858168),
81686526928=LogInfo(/var/lib/apache- … } |
org.apache.activemq.leveldb.RecordLog | ActiveMQ BrokerService[
util0041.wdc.sl.spinn3r.com] Task-3
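
One rough way to read the warning above is to check whether the requested position falls inside any of the listed log files. The Python sketch below does that under several assumptions that the log line itself does not confirm: that the position is printed in hexadecimal, that the second LogInfo field is the file's start offset, and that the third is its length. The file name 00000012f22570d8 is the hexadecimal form of 81371951320, which at least supports the offset reading. The helper names parse_log_infos and find_reader are made up for illustration and are not part of ActiveMQ.

```python
import re

# Matches only complete entries of the form LogInfo(path,start,length).
# Treating the two numbers as (start offset, length) is an assumption.
LOG_INFO = re.compile(r"LogInfo\(([^,()]+),(\d+),(\d+)\)")


def parse_log_infos(text):
    """Return sorted (start, length, path) tuples for each complete LogInfo entry."""
    return sorted(
        (int(start), int(length), path)
        for path, start, length in LOG_INFO.findall(text)
    )


def find_reader(position, infos):
    """Return the path of the log whose range covers position, or None."""
    for start, length, path in infos:
        if start <= position < start + length:
            return path
    return None


# The warning from the broker log, joined onto one line. The second
# entry is truncated in the original message and is left truncated here,
# so the parser skips it.
warning = (
    "No reader available for position: eac7b1cad, log_infos: "
    "{81371951320=LogInfo(/var/lib/apache-activemq/leveldb/"
    "00000012f22570d8.log,81371951320,104858168), "
    "81686526928=LogInfo(/var/lib/apache- … }"
)

position = int("eac7b1cad", 16)  # assumption: the position is printed in hex
infos = parse_log_infos(warning)

print("position:", position)
print("complete log entries:", infos)
print("covering log:", find_reader(position, infos))
```

With only the one complete entry visible, the position lands below that file's start offset, which would fit a record living in a log file that no longer exists. The log_infos list is truncated in the message, though, so this is a hint rather than proof.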

The other main problem I’m having is that JMX is telling me that I have
a LARGE number of messages in some of our queues, but they’re not being
processed.

It also appears that AMQ is reporting corrupt JMX values, because some of
my queues have *negative* sizes, which obviously makes no sense.

For example, right now it’s saying there are 15k messages in our dead
letter queue.  However, when I try to browse it, nothing is returned.

I’ve had this problem before, and the only resolution has been to
completely scrap our full LevelDB database by stopping AMQ, removing the
directory, then starting it again.

Then I have to re-enqueue all of our messages.

This obviously isn’t scalable, and I need to track down why AMQ keeps
corrupting itself.

Kevin


—
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>

Re: massive data loss after 5.10 + leveldb restart?

Posted by Kevin Burton <bu...@spinn3r.com>.
I’m worried this is a severe LevelDB bug that’s been around for more than a
year without any fix…

https://issues.apache.org/jira/browse/AMQ-5300

It seems to be related to log rolling and a restart. LevelDB ends up
corrupted if you restart it after the logs have rolled over.  At least that
would explain what I’m seeing.

