Posted to user@spark.apache.org by xweb <as...@gmail.com> on 2019/10/14 16:24:01 UTC

Use our own metastore with Spark SQL

Is it possible to use our own metastore instead of Hive Metastore with Spark SQL?

Can you please point me to some docs or code I can look at to get it done?

We are moving away from everything Hadoop. 




--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Use our own metastore with Spark SQL

Posted by "Zhu, Luke" <lu...@brown.edu>.
I had a similar issue this summer while prototyping Spark on K8s. I ended
up sticking with Hive Metastore 2 to meet time goals. Not sure if I was
using it correctly, but I only needed Hadoop + Hive JARs; I did not need to
run HDFS, YARN, etc. Using the metastore with an s3a warehouse.dir path
seemed to work fine.
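
A rough sketch of the Spark side, in case it helps. The metastore host, bucket,
and credentials setup below are placeholders rather than my exact configuration,
and you still need the hadoop-aws / AWS SDK JARs on the classpath:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("metastore-without-hdfs")
    // Remote Hive Metastore over Thrift -- no HDFS or YARN required.
    // (spark.hadoop.* entries are copied into the Hadoop/Hive configuration.)
    .config("spark.hadoop.hive.metastore.uris", "thrift://metastore-host:9083")
    // Keep managed-table data on S3 via s3a instead of HDFS.
    .config("spark.sql.warehouse.dir", "s3a://my-warehouse-bucket/warehouse")
    // Pick up AWS credentials from the environment or an instance profile.
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
      "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
    .enableHiveSupport()
    .getOrCreate()

  // Tables created through Spark SQL are now registered in the remote
  // metastore, with their data stored under the s3a warehouse path.
  spark.sql("SHOW DATABASES").show()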

When Spark supports Metastore 3.0, things should be a bit easier as HMS 3
will have clearer instructions for standalone deployments.
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration

If you have more time and truly need to move away from everything Hadoop,
you can also implement ExternalCatalog:
https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala

See https://jira.apache.org/jira/browse/SPARK-23443 for ongoing progress on
a Glue ExternalCatalog implementation. If you are using EMR, you can also
check out
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-glue.html
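
For what it's worth, the EMR integration boils down to pointing the Hive
metastore client factory at Glue. On EMR you would normally set this
cluster-wide through the spark-hive-site configuration classification rather
than in application code; the snippet below is only a sketch of which property
is involved, and outside EMR you would also need the Glue catalog client JARs
on the classpath:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("glue-data-catalog")
    // Factory class AWS documents for EMR's Glue Data Catalog integration.
    .config("spark.hadoop.hive.metastore.client.factory.class",
      "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory")
    .enableHiveSupport()
    .getOrCreate()

  spark.sql("SHOW DATABASES").show()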


On Mon, Oct 14, 2019 at 12:24 PM xweb <as...@gmail.com> wrote:

>
> Is it possible to use our own metastore instead of Hive Metastore with Spark SQL?
>
> Can you please point me to some docs or code I can look at to get it done?
>
> We are moving away from everything Hadoop.