Posted to issues@spark.apache.org by "wuyi (JIRA)" <ji...@apache.org> on 2018/01/05 03:17:03 UTC
[jira] [Created] (SPARK-22967) VersionSuite failed on Windows caused by unescapeSQLString()
wuyi created SPARK-22967:
----------------------------
Summary: VersionSuite failed on Windows caused by unescapeSQLString()
Key: SPARK-22967
URL: https://issues.apache.org/jira/browse/SPARK-22967
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.2.1
Environment: Windows 7
Reporter: wuyi
Priority: Minor
On Windows, two unit test cases fail while running VersionsSuite ("A simple set of tests that call the methods of a `HiveClient`, loading different version of hive from maven central.")
Failed A: test(s"$version: read avro file containing decimal")
{code:java}
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
{code}
Failed B: test(s"$version: SPARK-17920: Insert into/overwrite avro table")
{code:java}
Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
{code}
As I dug into this problem, I found it is related to ParserUtils#unescapeSQLString().
These are the two lines at the beginning of Failed A:
{code:java}
val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
val location = new File(url.getFile)
{code}
And in my environment, `location` (its path value) is:
{code:java}
D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
{code}
Then, in SparkSqlParser#visitCreateHiveTable() at L1128:
{code:java}
val location = Option(ctx.locationSpec).map(visitLocationSpec)
{code}
This line first gets the LocationSpecContext's content, which is equal to `location` above.
The content is then passed to visitLocationSpec(), and finally to unescapeSQLString().
Let's have a look at unescapeSQLString():
{code:java}
/** Unescape backslash-escaped string enclosed by quotes. */
def unescapeSQLString(b: String): String = {
  var enclosure: Character = null
  val sb = new StringBuilder(b.length())

  def appendEscapedChar(n: Char) {
    n match {
      case '0' => sb.append('\u0000')
      case '\'' => sb.append('\'')
      case '"' => sb.append('\"')
      case 'b' => sb.append('\b')
      case 'n' => sb.append('\n')
      case 'r' => sb.append('\r')
      case 't' => sb.append('\t')
      case 'Z' => sb.append('\u001A')
      case '\\' => sb.append('\\')
      // The following 2 lines are exactly what MySQL does TODO: why do we do this?
      case '%' => sb.append("\\%")
      case '_' => sb.append("\\_")
      case _ => sb.append(n)
    }
  }

  var i = 0
  val strLength = b.length
  while (i < strLength) {
    val currentChar = b.charAt(i)
    if (enclosure == null) {
      if (currentChar == '\'' || currentChar == '\"') {
        enclosure = currentChar
      }
    } else if (enclosure == currentChar) {
      enclosure = null
    } else if (currentChar == '\\') {
      if ((i + 6 < strLength) && b.charAt(i + 1) == 'u') {
        // \u0000 style character literals.
        val base = i + 2
        val code = (0 until 4).foldLeft(0) { (mid, j) =>
          val digit = Character.digit(b.charAt(j + base), 16)
          (mid << 4) + digit
        }
        sb.append(code.asInstanceOf[Char])
        i += 5
      } else if (i + 4 < strLength) {
        // \000 style character literals.
        val i1 = b.charAt(i + 1)
        val i2 = b.charAt(i + 2)
        val i3 = b.charAt(i + 3)
        if ((i1 >= '0' && i1 <= '1') && (i2 >= '0' && i2 <= '7') && (i3 >= '0' && i3 <= '7')) {
          val tmp = ((i3 - '0') + ((i2 - '0') << 3) + ((i1 - '0') << 6)).asInstanceOf[Char]
          sb.append(tmp)
          i += 3
        } else {
          appendEscapedChar(i1)
          i += 1
        }
      } else if (i + 2 < strLength) {
        // escaped character literals.
        val n = b.charAt(i + 1)
        appendEscapedChar(n)
        i += 1
      }
    } else {
      // non-escaped character literals.
      sb.append(currentChar)
    }
    i += 1
  }
  sb.toString()
}
{code}
Here again, variable `b` equals the content of `location` above:
{code:java}
D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
{code}
From unescapeSQLString()'s strategy, we can see that it transforms the substring "\t" into the escape character '\t' and simply drops the other backslashes.
So, our originally correct location results in:
{code:java}
D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal
{code}
after unescapeSQLString() completes.
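For illustration, the backslash-dropping behavior can be reproduced with a minimal standalone sketch. This is not the actual Spark code: it mimics only the escape branch and ignores the quote/enclosure handling and the unicode/octal cases (in the real method, escapes are only processed inside an enclosing quote pair, which is how the SQL LOCATION string literal is affected):

```java
public class UnescapeDemo {
    // Mimics how unescapeSQLString treats a backslash followed by a
    // character: known escapes like \t become control characters, and
    // for any other character the backslash is simply dropped.
    static String unescapeLike(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char n = s.charAt(i + 1);
                switch (n) {
                    case 't': sb.append('\t'); break; // "\t" -> a real tab
                    case 'n': sb.append('\n'); break; // "\n" -> a real newline
                    default:  sb.append(n);           // backslash dropped
                }
                i += 2;
            } else {
                sb.append(c);
                i += 1;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A Windows-style path: "\w" loses its backslash, "\t" becomes a tab.
        System.out.println(unescapeLike("D:\\workspace\\target\\test-classes"));
    }
}
```

Running this on a Windows-style path shows exactly the mangling above: each path separator either vanishes or turns into a control character.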
Then, returning to SparkSqlParser#visitCreateHiveTable(), move to L1134:
{code:java}
val locUri = location.map(CatalogUtils.stringToURI(_))
{code}
`location` is passed to stringToURI(), which results in:
{code:java}
file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
{code}
since the escape character '\t' is transformed into the URI code '%09'.
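The tab-to-%09 translation can be seen with plain java.net.URI. This is only a sketch of the percent-encoding behavior; CatalogUtils.stringToURI itself may build the URI differently:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriDemo {
    public static void main(String[] args) throws URISyntaxException {
        // A path that already contains a literal tab, as produced by the
        // unescaping step. The multi-argument URI constructor
        // percent-encodes characters that are illegal in a URI, so the
        // tab shows up as %09 in the string form.
        String mangled = "/D:workspace\targetscala-2.11"; // "\t" is a real tab
        URI uri = new URI("file", null, mangled, null);
        System.out.println(uri); // the tab is rendered as %09
    }
}
```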
Although I'm not clear on how this wrong path directly caused that exception, as I know almost nothing about Hive, I can verify that this wrong path is the real factor causing it.
When I append these lines after HiveExternalCatalog#doCreateTable() Line 236-240:
{code:java}
if (tableLocation.get.getPath.startsWith("/D")) {
tableLocation = Some(CatalogUtils.stringToURI(
"file:/D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal"))
}
{code}
then failed unit test A passes, though test B still fails.
And below is the stack trace of the Exception:
{code:java}
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:602)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:469)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:273)
at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:210)
at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:209)
at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:256)
at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:467)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:263)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:119)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:304)
at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:128)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
at org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3196)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3195)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:71)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24$$anonfun$apply$mcV$sp$3.apply$mcV$sp(VersionsSuite.scala:829)
at org.apache.spark.sql.hive.client.VersionsSuite.withTable(VersionsSuite.scala:70)
at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply$mcV$sp(VersionsSuite.scala:828)
at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183)
at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196)
at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
at org.scalatest.Suite$class.run(Suite.scala:1147)
at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1334)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1334)
at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1011)
at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1010)
at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1500)
at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
at org.scalatest.tools.Runner$.run(Runner.scala:850)
at org.scalatest.tools.Runner.run(Runner.scala)
at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:138)
at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:28)
Caused by: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1121)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
at com.sun.proxy.$Proxy31.create_table_with_environment_context(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy32.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
... 78 more
Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
at org.apache.hadoop.fs.Path.<init>(Path.java:184)
at org.apache.hadoop.fs.Path.getParent(Path.java:357)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:427)
at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:690)
at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:194)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1059)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1107)
... 93 more
{code}
As for test B, I didn't do a careful inspection, but I found the same wrong path as in test A. So, I guess both exceptions were caused by the same factor.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)