You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@griffin.apache.org by "William Guo (Jira)" <ji...@apache.org> on 2019/09/13 14:18:00 UTC

[jira] [Resolved] (GRIFFIN-288) Optimize hdfs sink

     [ https://issues.apache.org/jira/browse/GRIFFIN-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Guo resolved GRIFFIN-288.
---------------------------------
    Fix Version/s: 0.6.0
       Resolution: Fixed

Issue resolved by pull request 533
[https://github.com/apache/griffin/pull/533]

> Optimize hdfs sink
> ------------------
>
>                 Key: GRIFFIN-288
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-288
>             Project: Griffin
>          Issue Type: Task
>            Reporter: wan kun
>            Priority: Major
>             Fix For: 0.6.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When we sink records to hdfs , it  may be OOM if the result is huge. 
> {code:java}
> 19/09/06 18:52:39 INFO LineBufferedStream: 19/09/06 18:52:39 ERROR sink.HdfsSink: Java heap space
> 19/09/06 18:52:39 INFO LineBufferedStream: java.lang.OutOfMemoryError: Java heap space
> 19/09/06 18:52:39 INFO LineBufferedStream:      at java.util.Arrays.copyOf(Arrays.java:3332)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at java.lang.StringBuilder.append(StringBuilder.java:136)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:200)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.TraversableOnce$$anonfun$addString$1.apply(TraversableOnce.scala:364)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.TraversableOnce$class.addString(TraversableOnce.scala:357)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.AbstractTraversable.addString(Traversable.scala:104)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.TraversableOnce$class.mkString(TraversableOnce.scala:323)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.AbstractTraversable.mkString(Traversable.scala:104)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.TraversableOnce$class.mkString(TraversableOnce.scala:325)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.AbstractTraversable.mkString(Traversable.scala:104)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.sink.HdfsSink.org$apache$griffin$measure$sink$HdfsSink$$sinkRecords2Hdfs(HdfsSink.scala:191)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.sink.HdfsSink.sinkRecords(HdfsSink.scala:133)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.sink.MultiSinks$$anonfun$sinkRecords$1.apply(MultiSinks.scala:63)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.sink.MultiSinks$$anonfun$sinkRecords$1.apply(MultiSinks.scala:61)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.collection.immutable.List.foreach(List.scala:392)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.sink.MultiSinks.sinkRecords(MultiSinks.scala:61)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.step.write.RecordWriteStep.execute(RecordWriteStep.scala:49)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.step.transform.SparkSqlTransformStep.doExecute(SparkSqlTransformStep.scala:40)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.step.transform.TransformStep$class.execute(TransformStep.scala:72)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.step.transform.SparkSqlTransformStep.execute(SparkSqlTransformStep.scala:27)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.step.transform.TransformStep$$anonfun$2$$anonfun$apply$1.apply$mcV$sp(TransformStep.scala:51)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.step.transform.TransformStep$$anonfun$2$$anonfun$apply$1.apply(TransformStep.scala:50)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at org.apache.griffin.measure.step.transform.TransformStep$$anonfun$2$$anonfun$apply$1.apply(TransformStep.scala:50)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 19/09/06 18:52:39 INFO LineBufferedStream:      at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)