Posted to user@spark.apache.org by 李铖 <li...@gmail.com> on 2015/03/17 11:39:12 UTC

Should I do spark-sql query on HDFS or hive?

Hi, everybody.

I am new to Spark. I want to run interactive SQL queries using Spark SQL.
Spark SQL can query tables defined in Hive or load files directly from HDFS.

Which is better or faster?

Thanks.

Re: Should I do spark-sql query on HDFS or hive?

Posted by Denny Lee <de...@gmail.com>.
From the standpoint of Spark SQL accessing the files - when it is hitting
Hive, it is in effect hitting HDFS as well. Hive provides a great
framework where the table structure is already well defined. But
underneath it, Hive is just accessing files from HDFS, so you are hitting
HDFS either way. HTH!
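
To make the two options concrete, here is a minimal PySpark sketch. It is
not from the thread itself, and it assumes Spark 2.x or later (the
SparkSession API is newer than this thread). The table name "events", the
view name "events_files", the path hdfs:///data/events, the Parquet format,
and the helper names are all hypothetical.

```python
# Sketch only: table names, paths, and helper names are hypothetical.
# Assumes Spark 2.x+ with PySpark installed and a reachable Hive metastore.


def count_query(table: str) -> str:
    """Build a row-count query; the SQL text is the same for either source."""
    return f"SELECT COUNT(*) AS n FROM {table}"


def make_session():
    """Create a SparkSession that can resolve tables in the Hive metastore."""
    from pyspark.sql import SparkSession  # imported lazily; needs pyspark

    return (
        SparkSession.builder
        .appName("hive-vs-hdfs-sketch")
        .enableHiveSupport()
        .getOrCreate()
    )


def compare_sources(spark):
    """Run the same count against a Hive table and against raw HDFS files."""
    # Option 1: the table structure is already defined in Hive.
    hive_count = spark.sql(count_query("events")).collect()[0]["n"]

    # Option 2: read the files from HDFS directly and name the view yourself.
    df = spark.read.parquet("hdfs:///data/events")
    df.createOrReplaceTempView("events_files")
    file_count = spark.sql(count_query("events_files")).collect()[0]["n"]

    return hive_count, file_count
```

From a pyspark shell or a spark-submit script you would call
compare_sources(make_session()). In both branches the data is read from
files on HDFS; the difference is only whether Hive supplies the table
definition or you register it yourself.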
