Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2019/08/12 17:14:00 UTC

[jira] [Commented] (SPARK-28693) Malformed input or input contains unmappable characters

    [ https://issues.apache.org/jira/browse/SPARK-28693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905396#comment-16905396 ] 

Sean Owen commented on SPARK-28693:
-----------------------------------

I wonder if we have to fix the default encoding in the JVM for these tests, to make sure the environment's default doesn't matter? Would that be a solution? We do this in some other tests. But then I'm not sure why the test that explicitly sets the 'tr' locale fails. There could be something deeper in the code that needs to be locale-aware.
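
To sketch what that could look like (purely illustrative, not an existing Spark test helper): pin the default Locale for the duration of a test body so the host's locale can't leak in. Note that {{file.encoding}} itself is read at JVM startup, so it would have to be fixed via fork options in the build (e.g. {{javaOptions += "-Dfile.encoding=UTF-8"}} in sbt) rather than inside the test.

{code:scala}
// Illustrative sketch only: run a test body under a pinned default Locale,
// restoring the original afterwards even if the body throws.
import java.util.Locale

object LocaleTestUtils {
  def withDefaultLocale[T](tag: String)(body: => T): T = {
    val original = Locale.getDefault
    Locale.setDefault(Locale.forLanguageTag(tag))
    try body finally Locale.setDefault(original)
  }
}

// Usage: LocaleTestUtils.withDefaultLocale("tr") { spark.sql("CREATE TABLE ...") }
{code}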

> Malformed input or input contains unmappable characters
> -------------------------------------------------------
>
>                 Key: SPARK-28693
>                 URL: https://issues.apache.org/jira/browse/SPARK-28693
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL, Tests
>    Affects Versions: 3.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> {{[info] - create Hive-serde table and view with unicode columns and comment *** FAILED *** (706 milliseconds)}} from {{HiveDDLSuite}}
> {{[info] - basic DDL using locale tr - caseSensitive true *** FAILED *** (189 milliseconds)}} from {{HiveCatalogedDDLSuite}}
> {noformat}
> [info] - create Hive-serde table and view with unicode columns and comment *** FAILED *** (706 milliseconds)
> [info]   org.apache.spark.sql.AnalysisException: java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /root/opensource/spark/target/tmp/warehouse-bfbd010e-29be-44c4-939d-2011d84f1d38/tab1/?=2;
> [info]   at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:109)
> [info]   at org.apache.spark.sql.hive.HiveExternalCatalog.loadPartition(HiveExternalCatalog.scala:872)
> [info]   at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.loadPartition(ExternalCatalogWithListener.scala:175)
> [info]   at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.processInsert(InsertIntoHiveTable.scala:262)
> [info]   at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:101)
> [info]   at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
> [info]   at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
> [info]   at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:116)
> [info]   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:203)
> [info]   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3399)
> [info]   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$4(SQLExecution.scala:100)
> [info]   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
> [info]   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
> [info]   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3395)
> [info]   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:203)
> [info]   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
> [info]   at org.apache.spark.sql.hive.test.TestHiveSparkSession.sql(TestHive.scala:238)
> [info]   at org.apache.spark.sql.test.SQLTestUtilsBase.$anonfun$sql$1(SQLTestUtils.scala:216)
> [info]   at org.apache.spark.sql.hive.execution.HiveDDLSuite.$anonfun$new$68(HiveDDLSuite.scala:515)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1356)
> [info]   at org.apache.spark.sql.test.SQLTestUtilsBase.withTable(SQLTestUtils.scala:290)
> [info]   at org.apache.spark.sql.test.SQLTestUtilsBase.withTable$(SQLTestUtils.scala:288)
> [info]   at org.apache.spark.sql.hive.execution.HiveDDLSuite.withTable(HiveDDLSuite.scala:365)
> [info]   at org.apache.spark.sql.hive.execution.HiveDDLSuite.$anonfun$new$67(HiveDDLSuite.scala:509)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
> [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
> [info]   at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
> [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
> [info]   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
> [info]   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
> [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
> [info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
> [info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
> [info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:56)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
> [info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
> [info]   at scala.collection.immutable.List.foreach(List.scala:392)
> [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
> [info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
> [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
> [info]   at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
> [info]   at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
> [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
> [info]   at org.scalatest.Suite.run(Suite.scala:1147)
> [info]   at org.scalatest.Suite.run$(Suite.scala:1129)
> [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
> [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
> [info]   at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
> [info]   at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
> [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:56)
> [info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
> [info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
> [info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
> [info]   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:56)
> [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
> [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:507)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:296)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:286)
> [info]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [info]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [info]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [info]   at java.base/java.lang.Thread.run(Thread.java:834)
> [info]   Cause: java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /root/opensource/spark/target/tmp/warehouse-bfbd010e-29be-44c4-939d-2011d84f1d38/tab1/?=2
> [info]   at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:145)
> [info]   at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:69)
> [info]   at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:280)
> [info]   at java.base/java.io.File.toPath(File.java:2290)
> [info]   at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getLastAccessTime(RawLocalFileSystem.java:683)
> [info]   at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.<init>(RawLocalFileSystem.java:694)
> [info]   at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:664)
> [info]   at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:987)
> [info]   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:656)
> [info]   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
> [info]   at org.apache.hadoop.hive.io.HdfsUtils$HadoopFileStatus.<init>(HdfsUtils.java:211)
> [info]   at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:3122)
> [info]   at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3478)
> [info]   at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1650)
> [info]   at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1579)
> [info]   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [info]   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [info]   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [info]   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> [info]   at org.apache.spark.sql.hive.client.Shim_v2_1.loadPartition(HiveShim.scala:1159)
> [info]   at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$loadPartition$1(HiveClientImpl.scala:839)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:310)
> [info]   at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:244)
> [info]   at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:243)
> [info]   at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:293)
> [info]   at org.apache.spark.sql.hive.client.HiveClientImpl.loadPartition(HiveClientImpl.scala:829)
> [info]   at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$loadPartition$1(HiveExternalCatalog.scala:893)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:99)
> [info]   at org.apache.spark.sql.hive.HiveExternalCatalog.loadPartition(HiveExternalCatalog.scala:872)
> [info]   at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.loadPartition(ExternalCatalogWithListener.scala:175)
> [info]   at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.processInsert(InsertIntoHiveTable.scala:262)
> [info]   at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:101)
> [info]   at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
> [info]   at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
> [info]   at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:116)
> [info]   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:203)
> [info]   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3399)
> [info]   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$4(SQLExecution.scala:100)
> [info]   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
> [info]   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
> [info]   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3395)
> [info]   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:203)
> [info]   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
> [info]   at org.apache.spark.sql.hive.test.TestHiveSparkSession.sql(TestHive.scala:238)
> [info]   at org.apache.spark.sql.test.SQLTestUtilsBase.$anonfun$sql$1(SQLTestUtils.scala:216)
> [info]   at org.apache.spark.sql.hive.execution.HiveDDLSuite.$anonfun$new$68(HiveDDLSuite.scala:515)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1356)
> [info]   at org.apache.spark.sql.test.SQLTestUtilsBase.withTable(SQLTestUtils.scala:290)
> [info]   at org.apache.spark.sql.test.SQLTestUtilsBase.withTable$(SQLTestUtils.scala:288)
> [info]   at org.apache.spark.sql.hive.execution.HiveDDLSuite.withTable(HiveDDLSuite.scala:365)
> [info]   at org.apache.spark.sql.hive.execution.HiveDDLSuite.$anonfun$new$67(HiveDDLSuite.scala:509)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
> [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
> [info]   at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
> [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
> [info]   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
> [info]   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
> [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
> [info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
> [info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
> [info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:56)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
> [info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
> [info]   at scala.collection.immutable.List.foreach(List.scala:392)
> [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
> [info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
> [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
> [info]   at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
> [info]   at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
> [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
> [info]   at org.scalatest.Suite.run(Suite.scala:1147)
> [info]   at org.scalatest.Suite.run$(Suite.scala:1129)
> [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
> [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
> [info]   at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
> [info]   at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
> [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:56)
> [info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
> [info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
> [info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
> [info]   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:56)
> [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
> [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:507)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:296)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:286)
> [info]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [info]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [info]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [info]   at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> Actually, this is an environmental issue, although the same setup works on JDK 1.8. I noticed it when logging in to the server:
> {noformat}
> Environment: yumwang ApplicationServices:  VClusters:
> -bash: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
> [root@spark-3267648 ~]#
> {noformat}
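> The warning above suggests the shell (and hence the JVM) starts under a broken, non-UTF-8 locale, so the JVM's default charset cannot encode the unicode partition directory name when {{File.toPath}} is called. A minimal repro sketch, assuming the JVM runs under e.g. {{LANG=C}} (the path and class name here are illustrative):
> {code:scala}
> // Hypothetical repro, assuming a non-UTF-8 locale such as LANG=C.
> // File.toPath delegates to sun.nio.fs.UnixPath.encode, which throws
> // InvalidPathException ("Malformed input or input contains unmappable
> // characters") when the default charset cannot encode the file name.
> import java.io.File
>
> object InvalidPathRepro {
>   def main(args: Array[String]): Unit = {
>     val dir = new File("/tmp/warehouse/tab1/\u4e2d=2") // non-ASCII partition dir
>     println(dir.toPath) // throws java.nio.file.InvalidPathException under LANG=C
>   }
> }
> {code}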
> And I fixed this issue by adding the following to {{/etc/environment}}:
> {code}
> LANG=en_US.utf-8
> LC_ALL=en_US.utf-8
> {code}
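> After logging in again, the JVM should pick up UTF-8 as its default charset. A quick sanity-check sketch (illustrative):
> {code:scala}
> // Verify the JVM now defaults to UTF-8 after the locale fix.
> import java.nio.charset.Charset
>
> object CharsetCheck {
>   def main(args: Array[String]): Unit = {
>     println(Charset.defaultCharset())               // expect UTF-8
>     println(System.getProperty("sun.jnu.encoding")) // charset used for file names
>   }
> }
> {code}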


