Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2018/12/13 07:45:29 UTC

[GitHub] kjmrknsn opened a new pull request #23307: [SPARK-26335][SQL] Add a property for Dataset#show not to care about wide characters when padding them

URL: https://github.com/apache/spark/pull/23307
 
 
   ## What changes were proposed in this pull request?
   
   ### Issue
   [SPARK-25108](https://issues.apache.org/jira/browse/SPARK-25108) made `Dataset#show` take wide (full-width) characters into account when padding cells. That makes the output of `Dataset#show` easier for humans to read. On the other hand, it prevents programs from reliably parsing the output, because the character length of each cell can differ from that of its header. My company develops and manages a Jupyter/Apache Zeppelin-like visualization tool named [OASIS](https://databricks.com/session/oasis-collaborative-data-analysis-platform-using-apache-spark). In this application, the output of `Dataset#show` from a Scala or Python process is parsed and rendered as an HTML table, as follows:
   
   <img width="1092" alt="screen shot 2018-12-13 at 16 38 58" src="https://user-images.githubusercontent.com/31149688/49923017-9e3c6180-fef5-11e8-970b-077bed46cdee.png">
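To illustrate the length mismatch described above, here is a minimal Python sketch. It is not Spark's implementation and not how OASIS parses output: `display_width`, the two padding helpers, and `parse_by_offsets` are hypothetical names invented for this example, and the assumptions are that cells are right-aligned and that a wide character occupies two terminal columns.

```python
import unicodedata


def display_width(text):
    """Approximate terminal width: wide/full-width characters count as 2 columns."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1 for ch in text)


def pad_by_display_width(text, width):
    """Right-align so the cell occupies `width` terminal columns (width-aware padding)."""
    return " " * max(width - display_width(text), 0) + text


def pad_by_char_count(text, width):
    """Right-align so the cell contains exactly `width` characters (width-unaware padding)."""
    return text.rjust(width)


def render_row(cells, width, pad):
    """Render one table row using the given padding function."""
    return "|" + "|".join(pad(cell, width) for cell in cells) + "|"


def parse_by_offsets(border, row):
    """Hypothetical parser: slice a row at the '+' positions of the border line."""
    cuts = [i for i, ch in enumerate(border) if ch == "+"]
    return [row[start + 1:end].strip() for start, end in zip(cuts, cuts[1:])]


WIDTH = 6
cells = ["東京", "Tokyo"]
border = "+" + "+".join("-" * WIDTH for _ in cells) + "+"

aware_row = render_row(cells, WIDTH, pad_by_display_width)
plain_row = render_row(cells, WIDTH, pad_by_char_count)

# The width-aware row has fewer characters than the border, so offset-based
# slicing lands in the wrong places; the character-count row parses cleanly.
print(len(border), len(aware_row), len(plain_row))
print(parse_by_offsets(border, aware_row))
print(parse_by_offsets(border, plain_row))
```

With width-aware padding the row lines up on a terminal but no longer has the same number of characters as the border line, so a parser that relies on character offsets misreads the cells.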
   
   ### Solution
   Add the `spark.sql.dataset.show.handleFullWidthCharacters` property, which controls whether `Dataset#show` takes wide characters into account when padding cells.
   
   ## How was this patch tested?
   This patch was tested via unit tests.
   
   ## Jira Issue
   https://issues.apache.org/jira/browse/SPARK-26335

