Posted to dev@flume.apache.org by "Juhani Connolly (JIRA)" <ji...@apache.org> on 2013/02/27 09:59:12 UTC
[jira] [Commented] (FLUME-1929) CheckpointRebuilder main method does not work
[ https://issues.apache.org/jira/browse/FLUME-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588132#comment-13588132 ]
Juhani Connolly commented on FLUME-1929:
----------------------------------------
This appears to hang.
Steps followed:
- start up flume, feed some data
- kill -9 to try to force an inconsistent checkpoint
- delete in-use.lock, checkpoint and checkpoint.meta
- run the checkpoint rebuilder; the final command through our script is (note that I patched -c to become -h):
+ exec /usr/local/java/bin/java -server -XX:OnOutOfMemoryError=/tmp/stop.sh -XX:MaxPermSize=24m -XX:PermSize=24m -XX:SurvivorRatio=8 -Xmn96m -Xmx512m -Xms128m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=12345 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=172.28.202.76 -Dflume.monitoring.type=GANGLIA -Dflume.monitoring.hosts=pat-log-om01:8649 -cp '/etc/flume/conf:/usr/lib/flume/lib/*' -Djava.library.path= org.apache.flume.channel.file.CheckpointRebuilder -h /tmp/flume-check -l /tmp/flume-data -t 5000000
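For anyone repeating the file-removal step above, a minimal sketch follows. The helper name clear_checkpoint_files is hypothetical, and it assumes all three files sit in the single directory passed as the argument, which the steps above do not state. It is demonstrated on a scratch directory rather than a live Flume install.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): remove the three files named in
# the steps above from the directory given as $1.
# Assumption: all three files live in that one directory.
clear_checkpoint_files() {
    dir="$1"
    for name in in-use.lock checkpoint checkpoint.meta; do
        # -f: do not fail if a file is already absent
        rm -f "$dir/$name"
    done
}

# Demonstration on a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
touch "$scratch/in-use.lock" "$scratch/checkpoint" \
      "$scratch/checkpoint.meta" "$scratch/other"
clear_checkpoint_files "$scratch"

# Only the unrelated file should remain.
ls "$scratch"
```

Against a real setup the argument would be the checkpoint directory (here /tmp/flume-check), with the agent stopped first.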
Full logs are as below:
27 Feb 2013 17:51:35,995 INFO [main] (org.apache.flume.channel.file.EventQueueBackingStoreFile.<init>:71) - Preallocated /tmp/flume-check/checkpoint to 40008232 for capacity 5000000
27 Feb 2013 17:51:36,004 INFO [main] (org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:47) - Starting up with /tmp/flume-check/checkpoint and /tmp/flume-check/checkpoint.meta
27 Feb 2013 17:51:36,078 INFO [main] (org.apache.flume.channel.file.CheckpointRebuilder.rebuild:64) - Attempting to fast replay the log files.
27 Feb 2013 17:51:36,112 INFO [main] (org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize:113) - Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
27 Feb 2013 17:51:36,117 INFO [main] (org.apache.flume.tools.DirectMemoryUtils.allocate:47) - Direct Memory Allocation: Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 526843904, Remaining = 526843904
27 Feb 2013 17:51:36,866 INFO [main] (org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - Encountered EOF at 150457 in /tmp/flume-data/log-3
27 Feb 2013 17:51:36,884 INFO [main] (org.apache.flume.channel.file.LogFile$SequentialReader.next:491) - Encountered EOF at 4095 in /tmp/flume-data/log-4
27 Feb 2013 17:51:36,887 INFO [main] (org.apache.flume.channel.file.CheckpointRebuilder.rebuild:151) - Replayed 0 events using fast replay logic.
27 Feb 2013 17:51:36,889 INFO [main] (org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:108) - Start checkpoint for /tmp/flume-check/checkpoint, elements to sync = 0
27 Feb 2013 17:51:36,896 INFO [main] (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:120) - Updating checkpoint metadata: logWriteOrderID: 1361955096886, queueSize: 0, queueHead: 0
27 Feb 2013 17:51:36,906 INFO [main] (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - Updating log-3.meta currentPosition = 0, logWriteOrderID = 1361955096886
27 Feb 2013 17:51:36,908 INFO [main] (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85) - Updating log-4.meta currentPosition = 4095, logWriteOrderID = 1361955096886
Some diagnostics:
# lsof +d /tmp/flume-data
COMMAND   PID USER            FD  TYPE DEVICE SIZE/OFF   NODE NAME
bash    15144 juhani_connolly cwd DIR   252,0     4096 132605 /tmp/flume-data
sudo    16392 root            cwd DIR   252,0     4096 132605 /tmp/flume-data
lsof    16394 root            cwd DIR   252,0     4096 132605 /tmp/flume-data
lsof    16395 root            cwd DIR   252,0     4096 132605 /tmp/flume-data
Attaching thread dump
> CheckpointRebuilder main method does not work
> ---------------------------------------------
>
> Key: FLUME-1929
> URL: https://issues.apache.org/jira/browse/FLUME-1929
> Project: Flume
> Issue Type: Bug
> Reporter: Hari Shreedharan
> Assignee: Hari Shreedharan
> Priority: Minor
> Attachments: FLUME-1929.patch
>
>
> Based on the discussion in this thread: http://apache.markmail.org/thread/567cshrmz35okrq3 - the main method in CheckpointRebuilder was not updated for the new data file format.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira