Posted to user@spark.apache.org by Everett Anderson <ev...@nuna.com.INVALID> on 2017/04/14 20:39:05 UTC

Driver spins hours in query plan optimization

Hi,

We keep hitting a situation on Spark 2.0.2 (haven't tested later versions,
yet) where the driver spins forever seemingly in query plan optimization
for moderate queries, such as the union of a few (~5) other DataFrames.
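
For illustration, here's a minimal sketch (Scala, assuming a SparkSession
named spark as in spark-shell; the names are hypothetical) of roughly the
shape of query that hangs:

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.lit

    // A handful of DataFrames derived from the same source, unioned together.
    // In our real jobs each input is itself the result of several transformations.
    val base: DataFrame = spark.range(0L, 1000000L).toDF("id")
    val parts: Seq[DataFrame] = (1 to 5).map(i => base.withColumn("part", lit(i)))

    // The driver spins in plan optimization once an action runs on this.
    val unioned = parts.reduce(_ union _)
    unioned.count()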

We can see the driver spinning with one core in the nioEventLoopGroup-2-2
thread in a deep trace like the attached.

Throwing in a MEMORY_AND_DISK persist() so the query plan is collapsed works
around this, but it's a little surprising how often we encounter the
problem, since it forces us to manage persisting/unpersisting tables and
potentially suffer unnecessary disk I/O.
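
Concretely, the workaround looks something like this (a sketch only, reusing
the hypothetical parts from the example above):

    import org.apache.spark.storage.StorageLevel

    // Persist and materialize each input so downstream plans reference the
    // cached relation instead of the full upstream lineage.
    val persisted = parts.map(_.persist(StorageLevel.MEMORY_AND_DISK))
    persisted.foreach(_.count())

    val result = persisted.reduce(_ union _)
    // ... run the real work against result ...
    persisted.foreach(_.unpersist())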

I've looked through JIRA but don't see any open issues about this -- I
might've just not found them.

Anyone else encounter this?

Re: Driver spins hours in query plan optimization

Posted by Everett Anderson <ev...@nuna.com.INVALID>.
Seems like

https://issues.apache.org/jira/browse/SPARK-13346

is likely the same issue.

It also seems that for some people persist() doesn't help, and they have to
convert to RDDs and back.
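
For reference, the RDD round-trip people describe is roughly this (a sketch,
not something we've tested; bigPlanDF stands in for whatever DataFrame has
the huge plan):

    // Going through an RDD discards the accumulated logical plan, so the
    // rebuilt DataFrame starts from a trivial plan over the RDD.
    val collapsed = spark.createDataFrame(bigPlanDF.rdd, bigPlanDF.schema)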

