Posted to user@spark.apache.org by Ladle <la...@tcs.com> on 2015/08/12 12:28:41 UTC

Is there any tool I can use to prove to a customer that Spark is faster than Hive?

Hi,

I have built the machine learning features and model using Apache Spark.

I have also built the same features using Hive and Java, and used Mahout to
run the model.

Now, how can I show the customer that Apache Spark is faster than Hive?

Is there any tool that shows the run time?

Regards,
Ladle



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Is-there-any-tool-that-i-can-prove-to-customer-that-spark-is-faster-then-hive-tp24224.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Is there any tool I can use to prove to a customer that Spark is faster than Hive?

Posted by Gourav Sengupta <go...@gmail.com>.
You might also need to consider the maturity of Spark SQL vs. HiveQL.

Besides that, please read the following (which will soon be available as
part of the standard Amazon stack, if it is not already):
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started


All you need to do is run the following (assuming the environment is
already set up in AWS), and Hive will use Spark to execute the
queries:
hive> set hive.execution.engine=spark;


Hive can use MapReduce, Tez, and now Spark to execute its queries.

Please do read the details in the above link.
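
One way to compare the engines is to run the same query under each setting
and record the wall-clock time. The following is only a minimal sketch, not
a definitive benchmark: it assumes the hive CLI is on the PATH and passes the
engine with --hiveconf. The helper names (hive_command, time_query) and the
sample table name are hypothetical, chosen for illustration.

```python
import subprocess
import time

# Values accepted by hive.execution.engine: MapReduce, Tez, Spark.
ENGINES = ("mr", "tez", "spark")


def hive_command(engine, query):
    """Build the hive CLI invocation for one query on one engine."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    return ["hive", "--hiveconf", f"hive.execution.engine={engine}", "-e", query]


def time_query(engine, query, runner=subprocess.run):
    """Run the query once and return the elapsed wall-clock seconds.

    `runner` defaults to subprocess.run; it can be swapped for a stub
    when no Hive installation is available.
    """
    start = time.perf_counter()
    runner(hive_command(engine, query), check=True)
    return time.perf_counter() - start
```

For example, time_query("mr", "SELECT COUNT(*) FROM my_table") followed by
time_query("spark", "SELECT COUNT(*) FROM my_table") gives two numbers to
compare (my_table is a placeholder). The measured time includes CLI startup,
so several runs per engine on the same cluster and data give a fairer picture.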


Regards,
Gourav Sengupta

On Wed, Aug 12, 2015 at 1:01 PM, Nick Pentreath <ni...@gmail.com>
wrote:

> Perhaps you could time the end-to-end runtime for each pipeline, and each
> stage?
>
> Though I'd be fairly confident that Spark will outperform Hive/Mahout on
> MR, that's not the only consideration: having everything on a single
> platform, along with the Spark DataFrame API, is a huge win by itself.

Re: Is there any tool I can use to prove to a customer that Spark is faster than Hive?

Posted by Nick Pentreath <ni...@gmail.com>.
Perhaps you could time the end-to-end runtime for each pipeline, and each stage?

Though I'd be fairly confident that Spark will outperform Hive/Mahout on MR, that's not the only consideration: having everything on a single platform, along with the Spark DataFrame API, is a huge win by itself.
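
The timing suggested above could be sketched roughly as follows. This is an
illustration under stated assumptions, not a specific tool: the stage names
and the stand-in workloads are hypothetical, and the real feature-building
and model-training calls for each pipeline would go inside the timed blocks.

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> accumulated elapsed seconds


@contextmanager
def timed(stage):
    """Record the wall-clock time spent inside the with-block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed


def report(stage_timings):
    """Format per-stage times plus the end-to-end total."""
    lines = [f"{name}: {secs:.3f}s" for name, secs in stage_timings.items()]
    lines.append(f"total: {sum(stage_timings.values()):.3f}s")
    return "\n".join(lines)


# Hypothetical stages; replace the bodies with the real pipeline calls.
with timed("feature_build"):
    sum(range(100_000))
with timed("model_train"):
    sorted(range(100_000), reverse=True)

print(report(timings))
```

Because Spark evaluates transformations lazily, each timed block on the Spark
side should end with an action (for example a count or a write), otherwise the
work is attributed to a later stage. Running both pipelines on the same
cluster and the same input keeps the comparison fair.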








