Posted to dev@spark.apache.org by Pranay Tonpay <pt...@gmail.com> on 2015/10/25 18:05:15 UTC

spark-sql / apache-drill / jboss-teiid

Hi,
Regarding federated queries, has anyone compared Spark SQL, Apache Drill,
and JBoss Teiid?
I have an urgent requirement to build a virtualization layer (sitting atop
several databases) and am evaluating these three as options. Any help
would be appreciated.
I know Spark SQL has the benefit that I can invoke MLlib algorithms on the
data fetched, but apart from that, are there any other considerations?
Drill does not seem to support many data sources.

Any input?

thx
pranay

RE: spark-sql / apache-drill / jboss-teiid

Posted by pr...@wipro.com.
Hi,

Though it is not the comparison you wanted, I ran a Spark SQL vs. Hive performance comparison on a cluster with one master and two worker instances. The data was stored in HDFS. Spark SQL showed promise. I used Spark 1.4 and Hadoop 2.6.

https://hivevssparksql.wordpress.com/

The tables used for the performance comparison ranged from 100,000 to 100 million rows. The master and worker instances ran on EC2 m3.xlarge (4 cores / 15 GB RAM).

In the graph at that link, you can see the consistent response behavior of Spark SQL.

Regards,
Prajod
