Posted to user@flume.apache.org by Aitor Osa Larrabe <ai...@alumni.mondragon.edu> on 2017/07/07 10:20:21 UTC

HDFS Batch size not working correctly

I have a Flume agent composed of a Kafka Source, a Memory Channel and an
HDFS Sink. What I want to do is batch 4 messages from the Kafka Source and
then write them to HDFS in a single transaction.

agent.sinks.HDFS.hdfs.batchSize = 4
agent.sinks.HDFS.hdfs.path = hdfs://127.0.0.1:54310/flume/events/%y-%m-%d/%H%M/%S
agent.sinks.HDFS.hdfs.fileType = DataStream
agent.sinks.HDFS.hdfs.writeFormat = Text
agent.sinks.HDFS.hdfs.rollCount = 2
agent.sinks.HDFS.hdfs.rollInterval = 0
agent.sinks.HDFS.hdfs.rollSize = 0

I don't know why, but if I send one event every 10 minutes, it should take
40 minutes (4 events) before the data is written to HDFS. Instead, the data
is written right after the first event.
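
To make the expectation concrete, here is a small Python sketch of the
arithmetic behind the 40 minutes. The function name is made up for
illustration, and the model assumes that hdfs.batchSize makes the sink wait
until a full batch has arrived. That assumption is exactly what is in
question here, so this shows what I expected, not what Flume does.

```python
# Hypothetical model of the expected timing (an assumption, not confirmed
# Flume behavior): the sink holds events until batch_size of them have
# arrived, then writes them to HDFS in one transaction.

def expected_write_minute(batch_size: int, interval_minutes: int) -> int:
    """Return the minute at which the event that fills the batch arrives,
    given events arriving at interval_minutes, 2 * interval_minutes, ..."""
    return batch_size * interval_minutes


# batchSize = 4 and one event every 10 minutes
print(expected_write_minute(4, 10))  # 40
```

With batchSize = 4 and one event every 10 minutes, this model gives a first
write at minute 40, while the agent writes at minute 10.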