You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by srowen <gi...@git.apache.org> on 2018/08/19 15:03:38 UTC
[GitHub] spark pull request #22048: [SPARK-25108][SQL] Fix the show method to display...
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/22048#discussion_r211105897
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -294,23 +294,24 @@ class Dataset[T] private[sql](
// We set a minimum column width at '3'
val minimumColWidth = 3
+ val regex = """[^\x00-\u2e39]""".r
--- End diff --
This could use a comment and slightly better name for the variable? I also wonder if a regex is a little bit slow for scanning every character.
However it's not clear this definition is accurate enough. According to things like https://stackoverflow.com/questions/13505075/analyzing-full-width-or-half-width-character-in-java we're really looking for the concept of "fullwidth" East Asian characters. The answer there provides a somewhat more precise definition, though others say you need something like icu4j for a proper definition.
Maybe at least adopt the alternative proposed in the SO answer? It's not necessary to be perfect here, as it's a cosmetic issue.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org