Posted to user@spark.apache.org by Tomasz Krol <pa...@gmail.com> on 2019/03/26 11:51:50 UTC

Spark Thrift Server 2.2.1

Hey Guys,

I am wondering if any of you have sorted out the issue with Parquet
metadata refresh. In my case it's the Spark Thrift Server (ver 2.2.1),
but it could be the case for any application. As you know, when an
external application updates a Parquet table, your application has to
run REFRESH TABLE to pick up the updated metadata. In my case users are
running queries on Parquet tables through the Spark Thrift Server while,
at the same time, ETL jobs are updating those tables. Users then get
errors telling them they have to run REFRESH TABLE. I set
spark.sql.parquet.cacheMetadata=false in the server config, but that
didn't help.
I am wondering if any of you have any idea how to handle this issue?
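
For reference, this is roughly the manual workaround we have today
(minimal sketch; the table name mydb.events is just an example). After
each ETL write, something has to trigger the refresh before the next
Thrift Server query, otherwise readers hit the stale-metadata error:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("refresh-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // SQL form, the same statement users are told to run:
    spark.sql("REFRESH TABLE mydb.events")

    // Equivalent catalog API; invalidates the cached metadata and
    // file listing so the next scan re-reads the table:
    spark.catalog.refreshTable("mydb.events")

Scripting that refresh after every ETL job is what I'd like to avoid.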

Also, a second question: have any of you changed spark.scheduler.mode
to FAIR? With FIFO, other users are obviously blocked by whoever is
currently running queries. After changing to FAIR, I've seen that
performance decreased. I am wondering if any of you have managed to get
good results running the Spark Thrift Server with the FAIR scheduler
mode? My setup is sketched below.
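
For context, this is roughly what I configured (a sketch; the pool name
"etl" and the allocation file path are just examples):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("fair-sketch")
      .config("spark.scheduler.mode", "FAIR")
      // Optional allocation file defining pools, weights and
      // minShare; without it, jobs land in a single default pool.
      .config("spark.scheduler.allocation.file",
              "/etc/spark/fairscheduler.xml")
      .getOrCreate()

    // Generic Spark way to route jobs from this thread to a pool:
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", "etl")

    // In a JDBC session against the Thrift Server, the equivalent is:
    //   SET spark.sql.thriftserver.scheduler.pool=etl;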

Thanks

Tom
-- 
Tomasz Krol
patrickoq@gmail.com