Posted to issues@spark.apache.org by "melin (Jira)" <ji...@apache.org> on 2022/08/17 07:38:00 UTC

[jira] [Created] (SPARK-40118) InMemoryFileIndex caches file listings; how to keep them in sync across multiple long-running SparkSessions

melin created SPARK-40118:
-----------------------------

             Summary: InMemoryFileIndex caches file listings; how to keep them in sync across multiple long-running SparkSessions
                 Key: SPARK-40118
                 URL: https://issues.apache.org/jira/browse/SPARK-40118
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.4.0
            Reporter: melin


For example, consider two SparkSessions, A and B. Session A queries table T1, then session B writes data to T1. A later query in A does not return the data written by B, because A still uses its cached file listing. There are currently two workarounds:

1. Close SparkSession A and restart it.

2. Run the REFRESH TABLE command in session A.
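
For reference, the second workaround can be sketched in PySpark as follows. This is a minimal sketch, assuming `spark` is the active SparkSession for session A; the helper name `refresh_table` is hypothetical and only wraps the existing `spark.catalog.refreshTable` call.

```python
def refresh_table(spark, table_name):
    """Invalidate Spark's cached metadata and file listing for one table.

    `spark` is assumed to be an active pyspark.sql.SparkSession (session A in
    the example above); `table_name` would be "T1".
    """
    # Equivalent SQL form: spark.sql(f"REFRESH TABLE {table_name}")
    spark.catalog.refreshTable(table_name)
```

Calling `refresh_table(spark, "T1")` in session A before the next query makes A list the table's files again, at the cost of a fresh listing each time it is called.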

Neither workaround is practical for business users, who do not know when a refresh is needed. Refreshing frequently as a precaution degrades interactive query performance.

 

Ideally, Spark would support a centralized caching scheme, such as Redis, by providing an extension interface that allows users to plug in a custom cache implementation.
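
To make the request concrete, here is a toy model of such an extension point in plain Python. Everything in it is hypothetical: `FileListingCache`, `DictBackedCache`, and `Session` are illustrative names, not Spark APIs, and a shared in-process dict stands in for a centralized store such as Redis. It only shows why a cache shared between sessions, invalidated by the writer, would avoid the stale listing.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class FileListingCache(ABC):
    """Hypothetical extension point; not an existing Spark interface."""

    @abstractmethod
    def get(self, table_path: str) -> Optional[List[str]]: ...

    @abstractmethod
    def put(self, table_path: str, files: List[str]) -> None: ...

    @abstractmethod
    def invalidate(self, table_path: str) -> None: ...


class DictBackedCache(FileListingCache):
    """Keeps listings in a dict. Sharing one instance between sessions
    stands in for a centralized store such as Redis."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[str]] = {}

    def get(self, table_path: str) -> Optional[List[str]]:
        return self._entries.get(table_path)

    def put(self, table_path: str, files: List[str]) -> None:
        self._entries[table_path] = list(files)

    def invalidate(self, table_path: str) -> None:
        self._entries.pop(table_path, None)


class Session:
    """Toy stand-in for a SparkSession that lists files through a cache."""

    def __init__(self, cache: FileListingCache, list_files) -> None:
        self._cache = cache
        self._list_files = list_files  # function that lists the real storage

    def query(self, table_path: str) -> List[str]:
        files = self._cache.get(table_path)
        if files is None:  # cache miss: list storage and remember the result
            files = self._list_files(table_path)
            self._cache.put(table_path, files)
        return files

    def write(self, storage, table_path: str, new_file: str) -> None:
        storage[table_path].append(new_file)
        self._cache.invalidate(table_path)  # the writer drops the cached entry


storage = {"T1": ["part-0"]}
list_files = lambda path: list(storage[path])

# Separate caches per session: A keeps its old listing after B writes.
a = Session(DictBackedCache(), list_files)
b = Session(DictBackedCache(), list_files)
a.query("T1")
b.write(storage, "T1", "part-1")
stale = a.query("T1")

# One shared cache: B's invalidation is visible to A.
shared = DictBackedCache()
a2 = Session(shared, list_files)
b2 = Session(shared, list_files)
a2.query("T1")
b2.write(storage, "T1", "part-2")
fresh = a2.query("T1")
```

In this sketch `stale` still lacks the file written by B, while `fresh` includes it. A real implementation would also have to handle writers outside Spark, which cannot invalidate the cache.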



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
