Posted to user@spark.apache.org by Hitesh Goyal <hi...@nlpcaptcha.com> on 2016/11/28 12:41:06 UTC

time to run Spark SQL query

Hi team, I am using Spark SQL to access data in an Amazon S3 bucket.
If I run a SQL query using normal SQL syntax, like below:

1)      DataFrame d = sqlContext.sql("SELECT * FROM tablename WHERE column_condition");

Second, if I use DataFrame functions for the same query, like below:

2)      dataframe.select(column_name).where(column_condition);

My question is: which of the two would take longer to execute if I run both on the same dataset?
Or would both take the same amount of time? Please share your thoughts.
Regards,
Hitesh Goyal
Simpli5d Technologies
Cont No.: 9996588220


Re: time to run Spark SQL query

Posted by ayan guha <gu...@gmail.com>.
They should take the same time if everything else is constant.
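
One way to check this yourself is to build the same query both ways and compare the plans Spark prints. Both the SQL string and the DataFrame calls go through the same Catalyst optimizer, so the plans are normally equivalent. The snippet below is only a minimal sketch under some assumptions: it uses the Spark 2.x+ `SparkSession` API rather than the 1.x `sqlContext` from the question, it runs in local mode, and the `id` column and the generated range are hypothetical stand-ins for the S3-backed table. It selects `id` in both forms so that the two queries project the same column.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;

public class PlanComparison {

    // Runs the same filter via a SQL string and via DataFrame calls,
    // prints both plans, and returns the two row counts.
    public static long[] runBoth(SparkSession spark) {
        // Hypothetical stand-in data: ids 0..999 (not the S3 data from the question).
        Dataset<Row> df = spark.range(0, 1000).toDF();
        df.createOrReplaceTempView("tablename");

        // 1) SQL syntax
        Dataset<Row> viaSql = spark.sql("SELECT id FROM tablename WHERE id > 500");

        // 2) DataFrame functions
        Dataset<Row> viaApi = df.select(col("id")).where(col("id").gt(500));

        // Print parsed, analyzed, optimized and physical plans for comparison.
        viaSql.explain(true);
        viaApi.explain(true);

        return new long[] { viaSql.count(), viaApi.count() };
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("plan-comparison")
                .master("local[*]") // assumption: local test run, not a cluster
                .getOrCreate();

        long[] counts = runBoth(spark);
        System.out.println("SQL count = " + counts[0] + ", DataFrame count = " + counts[1]);

        spark.stop();
    }
}
```

If the physical plans printed by `explain(true)` match, any timing difference between the two forms comes from outside the query itself (caching, cluster load, S3 latency), not from the choice of syntax.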