Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/03/12 10:17:04 UTC
[jira] [Assigned] (SPARK-19912) String literals are not escaped while performing Hive metastore level partition pruning
[ https://issues.apache.org/jira/browse/SPARK-19912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-19912:
------------------------------------
Assignee: Apache Spark
> String literals are not escaped while performing Hive metastore level partition pruning
> ---------------------------------------------------------------------------------------
>
> Key: SPARK-19912
> URL: https://issues.apache.org/jira/browse/SPARK-19912
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.1, 2.2.0
> Reporter: Cheng Lian
> Assignee: Apache Spark
> Labels: correctness
>
> {{Shim_v0_13.convertFilters()}} doesn't escape string literals while generating Hive-style partition predicates.
> The following SQL-injection-like test case illustrates this issue:
> {code}
> test("SPARK-19912") {
>   withTable("spark_19912") {
>     Seq(
>       (1, "p1", "q1"),
>       (2, "p1\" and q = \"q1", "q2")
>     ).toDF("a", "p", "q").write.partitionBy("p", "q").saveAsTable("spark_19912")
>
>     checkAnswer(
>       spark.table("spark_19912").filter($"p" === "p1\" and q = \"q1").select($"a"),
>       Row(2)
>     )
>   }
> }
> {code}
> The above test case fails like this:
> {noformat}
> [info] - spark_19912 *** FAILED *** (13 seconds, 74 milliseconds)
> [info] Results do not match for query:
> [info] Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
> [info] Timezone Env:
> [info]
> [info] == Parsed Logical Plan ==
> [info] 'Project [unresolvedalias('a, None)]
> [info] +- Filter (p#27 = p1" and q = "q1)
> [info]    +- SubqueryAlias spark_19912
> [info]       +- Relation[a#26,p#27,q#28] parquet
> [info]
> [info] == Analyzed Logical Plan ==
> [info] a: int
> [info] Project [a#26]
> [info] +- Filter (p#27 = p1" and q = "q1)
> [info]    +- SubqueryAlias spark_19912
> [info]       +- Relation[a#26,p#27,q#28] parquet
> [info]
> [info] == Optimized Logical Plan ==
> [info] Project [a#26]
> [info] +- Filter (isnotnull(p#27) && (p#27 = p1" and q = "q1))
> [info]    +- Relation[a#26,p#27,q#28] parquet
> [info]
> [info] == Physical Plan ==
> [info] *Project [a#26]
> [info] +- *FileScan parquet default.spark_19912[a#26,p#27,q#28] Batched: true, Format: Parquet, Location: PrunedInMemoryFileIndex[], PartitionCount: 0, PartitionFilters: [isnotnull(p#27), (p#27 = p1" and q = "q1)], PushedFilters: [], ReadSchema: struct<a:int>
> [info] == Results ==
> [info] !== Correct Answer - 1 == == Spark Answer - 0 ==
> [info] struct<> struct<>
> [info] ![2]
> {noformat}
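> The failing run shows the unescaped partition value splitting the quoted literal in the pushed-down filter ({{PartitionCount: 0}}). A minimal sketch of the kind of escaping that would avoid this is shown below; {{quoteStringLiteral}} is a hypothetical helper for illustration, not Spark's actual API, and this is not the actual SPARK-19912 patch:
> {code}
> // Sketch: escape a string literal before embedding it in a Hive metastore
> // filter string. Without this, a partition value containing a double quote
> // terminates the quoted literal early and injects extra predicate text.
> object EscapeSketch {
>   // Hypothetical helper, not Spark's actual API.
>   def quoteStringLiteral(value: String): String = {
>     // Escape backslashes first, then double quotes, then wrap in quotes.
>     val escaped = value.replace("\\", "\\\\").replace("\"", "\\\"")
>     "\"" + escaped + "\""
>   }
>
>   def main(args: Array[String]): Unit = {
>     val v = "p1\" and q = \"q1"
>     // Prints: p = "p1\" and q = \"q1" -- a single, well-formed literal.
>     println("p = " + quoteStringLiteral(v))
>   }
> }
> {code}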
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)