Posted to issues@spark.apache.org by "Gengliang Wang (JIRA)" <ji...@apache.org> on 2019/03/14 16:08:00 UTC
[jira] [Created] (SPARK-27162) Add new method getOriginalMap in CaseInsensitiveStringMap
Gengliang Wang created SPARK-27162:
--------------------------------------
Summary: Add new method getOriginalMap in CaseInsensitiveStringMap
Key: SPARK-27162
URL: https://issues.apache.org/jira/browse/SPARK-27162
Project: Spark
Issue Type: Task
Components: SQL
Affects Versions: 3.0.0
Reporter: Gengliang Wang
Currently, DataFrameReader/DataFrameWriter support setting Hadoop configurations via the `.option()` method.
For example:
```
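import org.apache.hadoop.fs.{Path, PathFilter}

// Skips every file whose parent directory is "p=2".
class TestFileFilter extends PathFilter {
  override def accept(path: Path): Boolean = path.getParent.getName != "p=2"
}

withTempPath { dir =>
  val path = dir.getCanonicalPath
  val df = spark.range(2)
  df.write.orc(path + "/p=1")
  df.write.orc(path + "/p=2")
  // Without a filter, both partitions are read: 2 rows each.
  assert(spark.read.orc(path).count() === 4)

  // Hadoop keys are case sensitive; note the capital "F" in "pathFilter".
  val extraOptions = Map(
    "mapred.input.pathFilter.class" -> classOf[TestFileFilter].getName,
    "mapreduce.input.pathFilter.class" -> classOf[TestFileFilter].getName
  )
  // With the filter passed as options, only p=1 is read.
  assert(spark.read.options(extraOptions).orc(path).count() === 2)
}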
```
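Hadoop configuration keys are case sensitive, but the current data source V2 APIs pass options to TableProvider as a `CaseInsensitiveStringMap`, which lower-cases its keys. A key such as `mapred.input.pathFilter.class` therefore loses its original casing before it can be applied to the Hadoop configuration.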
To create Hadoop configurations correctly, I suggest adding a method `getOriginalMap` in `CaseInsensitiveStringMap`.
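To illustrate the idea, here is a minimal, self-contained sketch. `CaseInsensitiveOptions` and `lowerCasedMap` are hypothetical names made up for this example. It is not Spark's actual `CaseInsensitiveStringMap`, only a simplified stand-in that assumes lookups go through lower-cased keys while the original map is kept alongside.

```scala
import java.util.Locale

// Hypothetical stand-in for CaseInsensitiveStringMap; not Spark's actual class.
final class CaseInsensitiveOptions(original: Map[String, String]) {
  // Lookups go through a copy whose keys are lower-cased.
  private val lowerCased: Map[String, String] =
    original.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  // Case-insensitive lookup: the requested key is lower-cased as well.
  def get(key: String): Option[String] =
    lowerCased.get(key.toLowerCase(Locale.ROOT))

  // The proposed accessor: the options exactly as the user supplied them.
  def getOriginalMap: Map[String, String] = original

  // What a caller would see without getOriginalMap: lower-cased keys only.
  def lowerCasedMap: Map[String, String] = lowerCased
}
```

With such an accessor, code that builds a Hadoop configuration could iterate over `getOriginalMap` and set each key with its original casing, while ordinary option lookups stay case insensitive.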