Posted to user@beam.apache.org by amir bahmanyari <am...@yahoo.com> on 2016/10/31 17:53:05 UTC
Beam App: MEM<-->Disk Space Issue
Hi Colleagues,

Perhaps some of these questions should be asked in the Flink forum. Please let me know if that's the case so I can post there as well.
I am facing something new when running my Beam app on a 4-node Flink cluster. The behavior:

1- The dashboard shows all nodes actively running.
2- All slots are being consumed.
3- A Flink daemon is running on every node.
4- I submit the app fat jar from one of the servers to the cluster where the JobManager is running; submission succeeds.
5- All TaskManagers log progress reports.
6- Only one of the nodes, chosen at random, writes anything to its *.out log (output emitted by the app, not by Flink). The *.out files on the other nodes stay at zero size.
7- The app runs under heavy load for a reasonable time before it starts crunching, i.e. slowing down, which is expected.
8- The node whose *.out is being written runs out of space: "/" reaches 100%, while on the other nodes "/" stays at 10%.
9- I get the exception below in the *.out of the only node that writes one.
Questions:

1- Why do the other nodes not write *.out at runtime, while a single random node does?
2- What should I do in my app to avoid this exception?
3- What can I configure in Flink, in my environment, or on the servers to avoid this issue?
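For reference, here is a sketch of the kind of size-capped log4j 1.x appender I am asking about in question 3 (file names and limits are illustrative, assuming Flink's default log4j setup; this is not my actual config). Note that if the app writes through System.out rather than a logger, its output lands in *.out unrotated, and a cap like this would only help once the app logs via slf4j instead:

```properties
# Illustrative only: a RollingFileAppender that bounds log growth on disk.
# MaxFileSize x MaxBackupIndex caps total usage at roughly 10 x 100 MB per TaskManager.
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=${log.file}
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %c - %m%n
```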
I really appreciate your valuable time and help. Getting past this issue is crucial to our benchmarking efforts and product selection process.
Cheers,
Amir

Here is the exception:
log4j:ERROR Failed to flush writer,
java.io.IOException: No space left on device
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:326)
	at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
	at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
	at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
	at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
	at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
	at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
	at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
	at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
	at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
	at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
	at org.apache.log4j.Category.callAppenders(Category.java:206)
	at org.apache.log4j.Category.forcedLog(Category.java:391)
	at org.apache.log4j.Category.log(Category.java:856)
	at org.slf4j.impl.Log4jLoggerAdapter.warn(Log4jLoggerAdapter.java:420)
	at org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaReader.getWatermark(KafkaIO.java:1078)
	at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.trigger(UnboundedSourceWrapper.java:366)
	at org.apache.flink.streaming.runtime.tasks.StreamTask$TriggerTask.run(StreamTask.java:710)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)