Posted to issues@spark.apache.org by "Bruce Robbins (JIRA)" <ji...@apache.org> on 2018/12/28 18:37:00 UTC

[jira] [Created] (SPARK-26496) Test "locality preferences of StateStoreAwareZippedRDD" frequently fails on High Sierra

Bruce Robbins created SPARK-26496:
-------------------------------------

             Summary: Test "locality preferences of StateStoreAwareZippedRDD" frequently fails on High Sierra
                 Key: SPARK-26496
                 URL: https://issues.apache.org/jira/browse/SPARK-26496
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0
         Environment: Mac OS X High Sierra

            Reporter: Bruce Robbins


This is somewhat esoteric and minor, but it makes it difficult to run the SQL unit tests successfully on High Sierra.

The test "locality preferences of StateStoreAwareZippedRDD" in StreamingInnerJoinSuite generates a directory name using {{Random.nextString(10)}}, and High Sierra frequently rejects the resulting name.

For example:
{noformat}
scala> val prefix = Random.nextString(10); val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
prefix: String = 媈ᒢ탊渓뀟?녛ꃲ싢櫦
dir: java.io.File = /tmp/del_媈ᒢ탊渓뀟?녛ꃲ싢櫦-aff57fc6-ca38-4825-b4f3-473140edd4f6
res39: Boolean = true // this one was OK

scala> val prefix = Random.nextString(10); val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
prefix: String = 窽텘⒘駖ⵚ駢⡞Ρ닋੎
dir: java.io.File = /tmp/del_窽텘⒘駖ⵚ駢⡞Ρ닋੎-a3f99855-c429-47a0-a108-47bca6905745
res40: Boolean = false  // nope, didn't like this one

scala> prefix.foreach(x => printf("%04x ", x.toInt))
7abd d158 2498 99d6 2d5a 99e2 285e 03a1 b2cb 0a4e 

scala> prefix(9)
res46: Char = ੎

scala> val prefix = "\u7abd"
prefix: String = 窽

scala> val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
dir: java.io.File = /tmp/del_窽-d1c3af34-d34d-43fe-afed-ccef9a800ff4
res47: Boolean = true // it's OK with \u7abd

scala> val prefix = "\u0a4e"
prefix: String = ੎

scala> val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
dir: java.io.File = /tmp/del_੎-3654a34c-6f74-4591-85af-a0f28b675a6f
res50: Boolean = false // doesn't like \u0a4e
{noformat}
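One thing worth checking in the hex dump above is whether any of the code points in the rejected prefix are unassigned in Unicode. The following Python 3 sketch does that with the standard {{unicodedata}} module. The link between unassigned code points and the failed {{mkdirs()}} is an assumption on my part, not something this report has confirmed, and the result depends on the Unicode tables bundled with the Python build. The name {{unassigned}} is hypothetical.

```python
import unicodedata

# Code points of the rejected prefix, copied from the hex dump above.
FAILING_PREFIX = [0x7ABD, 0xD158, 0x2498, 0x99D6, 0x2D5A,
                  0x99E2, 0x285E, 0x03A1, 0xB2CB, 0x0A4E]


def unassigned(code_points):
    """Return the code points whose general category is 'Cn' (unassigned).

    Assumption: an unassigned code point is what the file system objects to.
    The report itself does not establish this.
    """
    return [cp for cp in code_points
            if unicodedata.category(chr(cp)) == "Cn"]


for cp in unassigned(FAILING_PREFIX):
    print("U+%04X is unassigned" % cp)
```

If only \u0a4e is reported, that would be consistent with the single-character experiments above, where \u7abd is accepted and \u0a4e is not.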
I thought it might have something to do with my Java 8 version, but Python is equally affected:
{noformat}
>>> f = open(u"/tmp/del_\u7abd_file", "wb")
>>> f.write("hello\n")
# it's OK with \u7abd
>>> f2 = open(u"/tmp/del_\u0a4e_file", "wb")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 92] Illegal byte sequence: u'/tmp/del_\u0a4e_file'
# doesn't like \u0a4e
>>> f2 = open(u"/tmp/del_\ufa4e_file", "wb")
# a little change and it's happy again
{noformat}
Mac OS X Sierra is perfectly happy with these characters. This seems to be a limitation introduced by High Sierra.
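
A possible workaround is to build the random part of the directory name from ASCII letters and digits only, so the name never contains a character the file system might reject. The Python 3 sketch below illustrates the idea. It is not the fix adopted in Spark, and the names {{random_ascii_prefix}} and {{make_test_dir}} are hypothetical. In the Scala test, something like {{Random.alphanumeric.take(10).mkString}} would presumably play the same role.

```python
import os
import random
import string
import tempfile
import uuid

# Only characters that every common file system accepts in a name.
ASCII_ALPHANUMERIC = string.ascii_letters + string.digits


def random_ascii_prefix(length=10, rng=random):
    """Return a random string of ASCII letters and digits."""
    return "".join(rng.choice(ASCII_ALPHANUMERIC) for _ in range(length))


def make_test_dir(parent):
    """Create and return a directory named del_<prefix>-<uuid> under parent.

    Mirrors the naming pattern from the report, with an ASCII-only prefix.
    """
    name = "del_" + random_ascii_prefix() + "-" + str(uuid.uuid4())
    path = os.path.join(parent, name)
    os.makedirs(path)
    return path


if __name__ == "__main__":
    # Demonstrate inside a throwaway parent directory that is cleaned up.
    with tempfile.TemporaryDirectory() as parent:
        print(make_test_dir(parent))
```

The trade-off is that the test no longer exercises arbitrary Unicode directory names, which does not appear to be the purpose of this test anyway.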



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
