Posted to reviews@spark.apache.org by HarshSharma8 <gi...@git.apache.org> on 2017/02/20 06:54:55 UTC

[GitHub] spark pull request #16997: Updated the SQL programming guide to explain abou...

GitHub user HarshSharma8 opened a pull request:

    https://github.com/apache/spark/pull/16997

    Updated the SQL programming guide to explain about the Encoding opera…

    
    ## What changes were proposed in this pull request?
    
    Updated the SQL programming guide to explain the encoding operation with Kryo.
    
    ## How was this patch tested?
    
    Just updated the docs.
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HarshSharma8/spark feature/docs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16997.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16997
    
----
commit 103906fb23b5212858e89e9a090693b6fb2c6307
Author: Harsh Sharma <ha...@knoldus.com>
Date:   2017-02-20T06:51:55Z

    Updated the SQL programming guide to explain about the Encoding operation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #16997: Updated the SQL programming guide to explain abou...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16997#discussion_r102008215

--- Diff: docs/sql-programming-guide.md ---
@@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
registered as a table. Tables can be used in subsequent SQL statements.

+Spark Encoders are used to convert a JVM object to Spark SQL representation. When we want to make a datase, Spark requires an encoder which takes the form Encoder[T] where T is the type we want to be encoded. When we try to create dataset with a custom type of object, then may result into java.lang.UnsupportedOperationException: No Encoder found for Object-Name.
--- End diff --

It's minor, but there are enough problems with the text to call it out. Please match the voice of the other text and avoid 'we'. Typos: "datase", "spark sql" and "kryo" for example. Use back-ticks to consistently format code if you're going to. What is Object-Name?
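
For reference, the encoder mechanism the quoted diff describes can be sketched roughly as follows. This is a minimal illustration, not the text or code proposed in the pull request. It assumes Spark 2.x on the classpath and a local master; the `Point` class and the `KryoEncoderSketch` object are hypothetical names introduced only for this example. `Encoders.kryo` and `Encoders.scalaInt` are part of `org.apache.spark.sql.Encoders`.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical custom type used only for illustration. It is a plain class,
// not a case class, so no encoder is derived for it automatically.
class Point(val x: Int, val y: Int) extends Serializable

object KryoEncoderSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local, single-threaded session is enough for the sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("kryo-encoder-sketch")
      .getOrCreate()

    // Explicit Kryo-based Encoder[Point]; each object is serialized into a
    // single binary column instead of one column per field.
    implicit val pointEncoder: Encoder[Point] = Encoders.kryo[Point]

    // createDataset picks up the implicit Encoder[Point] defined above.
    val points: Dataset[Point] =
      spark.createDataset(Seq(new Point(1, 2), new Point(3, 4)))

    // The schema shows the single binary column produced by the Kryo encoder.
    points.printSchema()

    // Map to a built-in type, passing the Int encoder explicitly.
    val sums = points.map(p => p.x + p.y)(Encoders.scalaInt).collect()
    println(sums.mkString(", "))

    spark.stop()
  }
}
```

Because a Kryo encoder stores the whole object as opaque bytes, the individual fields of `Point` are not visible as columns to Spark SQL; that trade-off is worth stating wherever the guide recommends this approach.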


[GitHub] spark pull request #16997: Updated the SQL programming guide to explain abou...

Posted by HyukjinKwon <gi...@git.apache.org>.

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16997#discussion_r102134319

+Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, spark requires an encoder which takes the form of Encoder[T] where T is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into java.lang.UnsupportedOperationException: No Encoder found for Object-Name.
--- End diff --

It is trivial, but maybe `spark` -> `Spark`? I am not an expert in grammar, but to my knowledge, capitalizing a proper noun is correct.
