Posted to user@spark.apache.org by ankits <an...@gmail.com> on 2016/10/24 05:05:22 UTC
How to avoid the delay associated with Hive Metastore when loading parquet?
Hi,
I'm loading Parquet files via Spark, and the first time a file is loaded
there is a 5-10s delay related to the Hive Metastore, with metastore-related
messages in the console. How can I avoid this delay and keep the metadata
around? I want the data to be persisted even after killing the
JVM/SparkSession, and I want to avoid this delay.
I have configured hive-site.xml to use a MySQL DB as the metastore. I thought
that would solve the problem by giving it a persistent metastore, but that
did not help, so I don't quite understand what's going on. How do I keep the
metadata around and avoid the delay?
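For reference, a hive-site.xml that points the metastore at a MySQL database usually sets properties like the ones below. This is a minimal sketch, not the actual configuration referred to above: the host, port, database name, user, and password are placeholder assumptions, and the driver class name depends on the MySQL connector version in use.

```xml
<configuration>
  <!-- JDBC URL of the metastore database (placeholder host and database name) -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <!-- JDBC driver class; newer MySQL connectors may use a different class name -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <!-- Credentials for the metastore database (placeholders) -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepassword</value>
  </property>
</configuration>
```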
Here are the relevant code and config:
*Initializing the SparkSession, storing and reading data via parquet*
*hive-site.xml*
*Console output*