Posted to user@spark.apache.org by ankits <an...@gmail.com> on 2016/10/24 05:05:22 UTC

How to avoid the delay associated with Hive Metastore when loading parquet?

Hi,

I'm loading parquet files via Spark, and the first time a file is loaded I
see a 5-10s delay related to the Hive Metastore, with metastore-related
messages in the console. How can I avoid this delay and keep the metadata
around? I want the data to persist even after killing the JVM/SparkSession,
without paying this delay again.

I have configured hive-site.xml to use a MySQL DB as the metastore. I thought
that would solve the problem by giving it a persistent metastore, but it did
not help, so I don't quite understand what's going on. How do I keep the
metadata around and avoid the delay?
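
To make the symptom concrete, here is a minimal PySpark sketch of how the
first-load delay could be measured. It is not the code from my application:
the helper names (timed, build_session, compare_first_and_second_load) and the
app name are made up for illustration, and it assumes hive-site.xml is already
on the classpath so that enableHiveSupport() picks up the MySQL metastore.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def build_session(app_name: str = "parquet-metastore-delay"):
    """Create a SparkSession with Hive support (hypothetical app name).

    Assumes hive-site.xml is on the classpath; nothing is configured here.
    """
    # Imported lazily so the timing helper can be used without PySpark.
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).enableHiveSupport().getOrCreate()


def compare_first_and_second_load(spark, path: str) -> Tuple[float, float]:
    """Time two consecutive loads of the same parquet path in one session."""
    _, first = timed(lambda: spark.read.parquet(path))
    _, second = timed(lambda: spark.read.parquet(path))
    return first, second
```

Calling compare_first_and_second_load(build_session(), path) with a real
parquet path returns the two timings in seconds, so the first load can be
compared with a repeat load in the same session.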


Here is the relevant code and config

*Initializing the SparkSession, storing and reading data via parquet* 


*hive-site.xml*


*Console output*










--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-avoid-the-delay-associated-with-Hive-Metastore-when-loading-parquet-tp27948.html