Posted to issues@spark.apache.org by "Jatin Sharma (Jira)" <ji...@apache.org> on 2022/07/08 11:05:00 UTC

[jira] [Created] (SPARK-39722) Make Dataset.showString() public

Jatin Sharma created SPARK-39722:
------------------------------------

             Summary: Make Dataset.showString() public
                 Key: SPARK-39722
                 URL: https://issues.apache.org/jira/browse/SPARK-39722
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.0, 2.4.8
            Reporter: Jatin Sharma


Currently, we have {{.show}} APIs on a Dataset, but they print directly to stdout.

However, there are many cases where we need the {{.show}} output as a String instead. For example:
 * We have a logging framework to which we need to push the representation of a df
 * We have to send the string over a REST call from the driver
 * We want to send the string to stderr instead of stdout

For such cases, the only option today is a hack: temporarily redirect {{Console.out}}, capture the output in a {{ByteArrayOutputStream}} or similar, and then extract the string from it.
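
For reference, a minimal sketch of the workaround described above. It assumes {{.show}} prints through Scala's {{println}} (and therefore {{Console.out}}), as the description implies. {{captureStdout}} and {{CaptureShow}} are hypothetical names used for illustration, not Spark APIs, and {{println}} stands in for {{df.show()}} so the snippet runs without Spark on the classpath.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}
import java.nio.charset.StandardCharsets

object CaptureShow {
  /** Runs `body` with Console.out redirected to an in-memory buffer and
    * returns everything that was printed. Hypothetical helper, not a Spark API. */
  def captureStdout(body: => Unit): String = {
    val buffer = new ByteArrayOutputStream()
    val stream = new PrintStream(buffer, true, StandardCharsets.UTF_8.name())
    try {
      // Console.withOut rebinds Console.out only while `body` runs,
      // then restores the previous stream.
      Console.withOut(stream)(body)
    } finally {
      stream.flush()
    }
    new String(buffer.toByteArray, StandardCharsets.UTF_8)
  }

  def main(args: Array[String]): Unit = {
    // println stands in for df.show(). With an existing Dataset `df` (assumed),
    // the call would be: captureStdout { df.show(20, truncate = false) }
    val table = captureStdout { println("+---+\n| id|\n+---+\n|  1|\n+---+") }
    // One of the use cases listed above: send the text to stderr instead of stdout.
    System.err.print(table)
  }
}
```

Every caller has to carry a helper like this today, which is the boilerplate a public String-returning API would remove.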

Printing only to stdout seems like a limiting choice.

Solution:

We should expose APIs that return the String representation. We already have the {{.showString}} method internally.

We could mirror each of the current {{.show}} APIs with a corresponding {{.showString}} (and rename the internal private function to something else if required).


