Posted to commits@activemq.apache.org by "Lionel Cons (JIRA)" <ji...@apache.org> on 2013/02/05 08:26:16 UTC

[jira] [Reopened] (APLO-293) Apollo should try to recover messages from a corrupted store

     [ https://issues.apache.org/jira/browse/APLO-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lionel Cons reopened APLO-293:
------------------------------


I'm sorry but this does not seem to work.

Using apollo-99-trunk-20130202.135855-180, with an Apollo that had to be killed the hard way, we get:

2013-02-05 07:59:29,109 | INFO  | virtual host startup is waiting on store startup | 
2013-02-05 07:59:29,958 | INFO  | Opening the log file took: 1638.60 ms | 
2013-02-05 07:59:30,133 | WARN  | DB operation failed. (entering recovery mode): org.iq80.leveldb.DBException: IO error: /var/lib/apollo/data/dirty.index/000094.sst: No such file or directory | 13ca926eba9
[...]
2013-02-05 07:59:33,127 | INFO  | broker startup is waiting on start virtual-host: apollo | 

And Apollo is stuck there. The web interface reports that the host is "STARTING", but it has been in that state for more than 25 minutes. The store was only 160 MB (as reported by du).
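
For anyone triaging the same symptom, here is a minimal sketch of scanning a broker log for the warning above and listing the .sst files it reports as missing. It is not part of Apollo. The function name find_missing_sst is hypothetical, and the log layout (timestamp | LEVEL | message | id) is assumed from the excerpt in this comment only.

```python
import re

# Matches the "entering recovery mode" warning shown in the excerpt.
# The field layout is an assumption based on those log lines alone.
MISSING_SST = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \| WARN\s+\| "
    r"DB operation failed\. \(entering recovery mode\): .*?"
    r"IO error: (?P<path>\S+\.sst): No such file or directory"
)


def find_missing_sst(lines):
    """Return (timestamp, path) pairs, one per missing-.sst warning."""
    hits = []
    for line in lines:
        match = MISSING_SST.search(line)
        if match:
            hits.append((match.group("ts"), match.group("path")))
    return hits


# Two lines copied from the log excerpt above.
sample = [
    "2013-02-05 07:59:29,958 | INFO  | Opening the log file took: 1638.60 ms | ",
    "2013-02-05 07:59:30,133 | WARN  | DB operation failed. (entering recovery mode): "
    "org.iq80.leveldb.DBException: IO error: "
    "/var/lib/apollo/data/dirty.index/000094.sst: No such file or directory | 13ca926eba9",
]

for timestamp, path in find_missing_sst(sample):
    print(timestamp, path)
```

This only reports which index files the store complained about; it does not attempt any repair.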
                
> Apollo should try to recover messages from a corrupted store
> ------------------------------------------------------------
>
>                 Key: APLO-293
>                 URL: https://issues.apache.org/jira/browse/APLO-293
>             Project: ActiveMQ Apollo
>          Issue Type: Bug
>         Environment: apollo-99-trunk-20130202.135855-180
>            Reporter: Lionel Cons
>            Assignee: Hiram Chirino
>             Fix For: 1.6
>
>
> Due to other bugs (mainly APLO-257, but not only that one), we sometimes have to kill Apollo the hard way because it will not stop gracefully.
> This almost always leaves the LevelDB store corrupted. When (re)starting, we see messages like:
> 2013-02-03 12:47:13,099 | WARN  | DB operation failed. (entering recovery mode): org.iq80.leveldb.DBException: IO error: /var/lib/apollo/data/dirty.index/001869.sst: No such file or directory | 13c9fe18242
> (see also APLO-282)
> At this point Apollo hangs. The only solution is to kill it once more and completely destroy the message store, losing all messages :-(
> Could Apollo try to recover at least some messages in these situations instead of hanging during startup?
