Posted to user@spark.apache.org by Jeroen Miller <bl...@gmail.com> on 2017/10/01 17:17:41 UTC

Re: More instances = slower Spark job

On Fri, Sep 29, 2017 at 12:20 AM, Gourav Sengupta
<go...@gmail.com> wrote:
> Why are you not using JSON reader of SPARK?

Since the filter I want to perform is so simple, I do not want to
spend time and memory to deserialise the JSON lines.
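A minimal sketch of that idea in PySpark, assuming the filter can be expressed as a plain substring test on each raw line. The names `line_matches`, `filter_raw_lines`, the marker `'"type":"click"'` and the input path are hypothetical and do not come from this thread.

```python
# Hypothetical marker: the thread does not say what the filter looks for.
DEFAULT_NEEDLE = '"type":"click"'


def line_matches(line, needle=DEFAULT_NEEDLE):
    """Return True if the raw JSON line contains the marker substring.

    No JSON parsing happens here; the line is treated as an opaque string.
    """
    return needle in line


def filter_raw_lines(spark, path, needle=DEFAULT_NEEDLE):
    """Read a text file as an RDD of lines and keep only the matching ones.

    `spark` is an existing SparkSession; `path` is a hypothetical input path.
    """
    lines = spark.sparkContext.textFile(path)
    return lines.filter(lambda line: line_matches(line, needle))
```

Note that a substring test is sensitive to whitespace and key order, so it only works if the lines are written in a consistent format; it can also match the marker inside an unrelated string value.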

Jeroen

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: More instances = slower Spark job

Posted by Gourav Sengupta <go...@gmail.com>.
Hi Jeroen,

I am not sure I agree that you would spend more time and memory that
way.

But even if that were the case, why are you not using DataFrames and a UDF?
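For illustration, the DataFrame and UDF approach could look roughly like the sketch below. It assumes the input is read as plain text, one JSON document per line, and that the filter is a substring test; `contains_marker`, `filter_with_udf` and the marker value are hypothetical names, not something given in this thread.

```python
def contains_marker(line, marker):
    """Plain Python predicate: True if the raw line contains the marker."""
    return line is not None and marker in line


def filter_with_udf(spark, path, marker):
    """Filter the lines of a text file with a DataFrame and a boolean UDF.

    `spark` is an existing SparkSession; `path` and `marker` are
    hypothetical inputs chosen by the caller.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import BooleanType

    # spark.read.text yields a DataFrame with a single string column "value".
    df = spark.read.text(path)
    matches = udf(lambda line: contains_marker(line, marker), BooleanType())
    return df.where(matches(col("value")))
```

The lines are still not deserialised as JSON here; the UDF only sees each line as a string. A Python UDF does add serialisation overhead between the JVM and the Python workers, so whether this is faster than the RDD filter would need to be measured.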


Regards,
Gourav

On Sun, Oct 1, 2017 at 6:17 PM, Jeroen Miller <bl...@gmail.com>
wrote:

> On Fri, Sep 29, 2017 at 12:20 AM, Gourav Sengupta
> <go...@gmail.com> wrote:
> > Why are you not using JSON reader of SPARK?
>
> Since the filter I want to perform is so simple, I do not want to
> spend time and memory to deserialise the JSON lines.
>
> Jeroen