
Posted to users@zeppelin.apache.org by Zhong Wang <wa...@gmail.com> on 2016/02/10 12:48:24 UTC

%spark is significantly slower than %sql

Hi,

I am building a responsive UI to visualize reports on top of Zeppelin. I
found that running the same query with %spark is sometimes significantly
slower than with %sql.

For the query I am running, %sql returns the result within 1 second, but
%spark sometimes takes more than 2 seconds. %spark is also very responsive
after a clean restart but becomes slower after it has been running for a
while, whereas %sql stays fast.
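
One way to put numbers on the slowdown is a small timing wrapper run in a
%spark paragraph. This is only a minimal sketch: `timed` is a hypothetical
helper, not part of Zeppelin or Spark, and the summed range is a stand-in
for the real query so that the snippet runs anywhere.

```scala
// Hypothetical helper (not a Zeppelin or Spark API): runs a block once and
// returns its result together with the elapsed wall-clock time in ms.
def timed[A](block: => A): (A, Double) = {
  val start = System.nanoTime()
  val result = block
  val elapsedMs = (System.nanoTime() - start) / 1e6
  (result, elapsedMs)
}

// Stand-in workload. In a %spark paragraph the block would instead be the
// report query, e.g. sqlContext.sql("...").collect(), with the query text
// being whatever the report actually uses.
val (total, ms) = timed { (1 to 1000).sum }
println(f"result=$total, elapsed=$ms%.2f ms")
```

Note that this only times the block itself. Any time the interpreter spends
compiling the paragraph before it runs is not included, so comparing this
figure with the paragraph's total run time may help show where the extra
time goes.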

Does anyone have the same issue? What is the difference between running
using %spark and running using %sql?

Thanks!

Zhong

Re: %spark is significantly slower than %sql

Posted by Zhong Wang <wa...@gmail.com>.

Thanks for your answer, moon! Sorry that it is difficult to reproduce the
issue, but I think you are right.

Best,
Zhong

On Fri, Feb 12, 2016 at 8:56 PM, moon soo Lee <mo...@apache.org> wrote:

> Hi Zhong,
>
> Thanks for sharing the problem.
> I think it is related to the Scala compiler embedded in SparkInterpreter,
> which does not release any resources.
>
> For now, a restart seems to be the only practical solution.
> Would you mind creating an issue for it?
>
> Thanks,
> moon

Re: %spark is significantly slower than %sql

Posted by moon soo Lee <mo...@apache.org>.

Hi Zhong,

Thanks for sharing the problem.
I think it is related to the Scala compiler embedded in SparkInterpreter,
which does not release any resources.

For now, a restart seems to be the only practical solution.
Would you mind creating an issue for it?

Thanks,
moon
