Posted to user@spark.apache.org by AR...@cognizant.com on 2016/02/04 18:48:29 UTC

Memory tuning in spark sql

Hi Sir/madam,
Greetings of the day.

I am working with Spark 1.6.0 on AWS EMR (Elastic MapReduce). I'm facing issues reading a large (500 MB) file in spark-sql.
Sometimes I get a heap space error and sometimes the executors fail.
I have increased the driver memory, executor memory, Kryo serializer buffer size, etc., but nothing helps.

I kindly request your help in resolving this issue.

Thanks
Arun.
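For reference, settings like the ones mentioned above are usually passed at submit time or set in spark-defaults.conf. A minimal sketch follows (the values are illustrative and your-app.jar is a placeholder, not something from this thread):

    spark-submit \
      --driver-memory 2g \
      --executor-memory 4g \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      --conf spark.kryoserializer.buffer.max=512m \
      your-app.jar

Raising these values only helps while the working set still fits in one JVM; a query that materializes its entire result on the driver will eventually exhaust any heap.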

Re: Memory tuning in spark sql

Posted by Ted Yu <yu...@gmail.com>.
Please take a look at SPARK-1867.

The discussion on that ticket is long; the short version is that you may want to look for missing classes.

Also see https://bugs.openjdk.java.net/browse/JDK-7172206
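For anyone hitting the same "java.lang.IllegalStateException: unread block data" (see the logs quoted below), it typically points to a driver/executor classpath or version mismatch rather than pure memory pressure. A minimal sanity check, sketched here under the assumption of a spark-shell session where sc is the usual SparkContext, is to compare the Spark build loaded on the driver with the one loaded on the executors:

    // Run a trivial job so each executor reports the Spark version it loaded,
    // then compare against the version the driver sees.
    val driverVersion = sc.version
    val executorVersions = sc.parallelize(1 to 100, sc.defaultParallelism)
      .map(_ => org.apache.spark.SPARK_VERSION)
      .distinct()
      .collect()
    println(s"driver: $driverVersion, executors: ${executorVersions.mkString(", ")}")

If the two disagree, aligning the Spark jars across the cluster is the first thing to fix.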

On Thu, Feb 4, 2016 at 10:31 AM, <AR...@cognizant.com> wrote:

> Hi Ted. Thanks for the response.
>
> I'm just trying to do a SELECT *; the table has 1+ million rows.
>
> I have set the parameters below.
>
> export SPARK_EXECUTOR_MEMORY=4G
> export SPARK_DRIVER_MEMORY=2G
> spark.kryoserializer.buffer.max 2000m
>
>
> I have started the Thrift server on port 10001 and am trying to access these
> Spark tables from the QlikView BI tool.
> I have been stuck on this. Kindly help.
>
> Please find the logs below.
>
> 16/02/04 18:11:21 INFO TaskSetManager: Finished task 9.0 in stage 6.0 (TID 57) in 4305 ms on ip-xx-xx-xx-xx.ec2.internal (8/10)
> 16/02/04 18:11:26 INFO TaskSetManager: Finished task 7.0 in stage 6.0 (TID 55) in 14711 ms on ip-xx-xx-xx-xx.ec2.internal (9/10)
> #
> # java.lang.OutOfMemoryError: Java heap space
> # -XX:OnOutOfMemoryError="kill -9 %p
> kill -9 %p"
> #   Executing /bin/sh -c "kill -9 17242
> kill -9 17242"...
> 16/02/04 18:11:39 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
> java.lang.IllegalStateException: unread block data
>         at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2428)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>         at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
>         at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:109)
>         at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:258)
>         at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>         at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:310)
>         at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:257)
>         at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>         at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:256)
>         at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:588)
>         at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:577)
>         at org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:170)
>         at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:104)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
> Thanks
> Arun
>
> ------------------------------
> From: Ted Yu [yuzhihong@gmail.com]
> Sent: 04 February 2016 23:37
> To: BONGALE, ARUN (Cognizant)
> Cc: user
> Subject: Re: Memory tuning in spark sql
>
> Can you provide a bit more detail?
>
> values of the parameters you have tuned
> log snippets from executors
> snippet of your code
>
> Thanks
>
> On Thu, Feb 4, 2016 at 9:48 AM, <ARUN.BONGALE@cognizant.com> wrote:
>
>> Hi Sir/madam,
>> Greetings of the day.
>>
>> I am working with Spark 1.6.0 on AWS EMR (Elastic MapReduce). I'm facing
>> issues reading a large (500 MB) file in spark-sql.
>> Sometimes I get a heap space error and sometimes the executors fail.
>> I have increased the driver memory, executor memory, Kryo serializer
>> buffer size, etc., but nothing helps.
>>
>> I kindly request your help in resolving this issue.
>>
>> Thanks
>> Arun.
>>
>
>
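The quoted exchange suggests the heap pressure comes from SELECT * over 1+ million rows being buffered in the Thrift server JVM before the BI tool consumes it. A sketch of two things worth trying (my_table is a placeholder, and spark.sql.thriftServer.incrementalCollect should be verified against your Spark build before relying on it):

    # Start the Thrift server so result rows are streamed to the client
    # instead of collected on the heap first (verify the property in your release).
    sbin/start-thriftserver.sh \
      --hiveconf hive.server2.thrift.port=10001 \
      --conf spark.sql.thriftServer.incrementalCollect=true

    # Smoke-test with a bounded query before pointing QlikView at the server:
    bin/beeline -u jdbc:hive2://localhost:10001 -e "SELECT * FROM my_table LIMIT 100"

If the LIMIT query succeeds but the unbounded SELECT * still dies, the fix belongs on the query side rather than in the memory flags.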

Re: Memory tuning in spark sql

Posted by Ted Yu <yu...@gmail.com>.
Can you provide a bit more detail?

values of the parameters you have tuned
log snippets from executors
snippet of your code

Thanks

On Thu, Feb 4, 2016 at 9:48 AM, <AR...@cognizant.com> wrote:

> Hi Sir/madam,
> Greetings of the day.
>
> I am working with Spark 1.6.0 on AWS EMR (Elastic MapReduce). I'm facing
> issues reading a large (500 MB) file in spark-sql.
> Sometimes I get a heap space error and sometimes the executors fail.
> I have increased the driver memory, executor memory, Kryo serializer
> buffer size, etc., but nothing helps.
>
> I kindly request your help in resolving this issue.
>
> Thanks
> Arun.
>