Posted to reviews@spark.apache.org by HarshSharma8 <gi...@git.apache.org> on 2017/02/20 06:54:55 UTC

[GitHub] spark pull request #16997: Updated the SQL programming guide to explain abou...

GitHub user HarshSharma8 opened a pull request:

    https://github.com/apache/spark/pull/16997

    Updated the SQL programming guide to explain about the Encoding opera…

    
    ## What changes were proposed in this pull request?
    
    Made some updates to the SQL programming guide to explain the encoding operation with Kryo.
    
    ## How was this patch tested?
    
    Just updated the docs.
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HarshSharma8/spark feature/docs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16997.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16997
    
----
commit 103906fb23b5212858e89e9a090693b6fb2c6307
Author: Harsh Sharma <ha...@knoldus.com>
Date:   2017-02-20T06:51:55Z

    Updated the SQL programming guide to explain about the Encoding operation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #16997: Updated the SQL programming guide to explain abou...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16997#discussion_r102008215
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
     types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
     registered as a table. Tables can be used in subsequent SQL statements.
     
    +Spark Encoders are used to convert a JVM object to Spark SQL representation. When we want to make a datase, Spark requires an encoder which takes the form Encoder[T] where T is the type we want to be encoded. When we try to create dataset with a custom type of object, then may result into <b>java.lang.UnsupportedOperationException: No Encoder found for Object-Name</b>.
    --- End diff --
    
    It's minor, but there are enough problems with the text to call it out. Please match the voice of the other text and avoid 'we'. Typos: "datase", "spark sql" and "kryo" for example. Use back-ticks to consistently format code if you're going to. What is Object-Name? 




[GitHub] spark pull request #16997: Updated the SQL programming guide to explain abou...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16997#discussion_r102134319
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
     types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
     registered as a table. Tables can be used in subsequent SQL statements.
     
    +Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, spark requires an encoder which takes the form of <b>Encoder[T]</b> where <b>T</b> is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into <b>java.lang.UnsupportedOperationException: No Encoder found for Object-Name</b>.
    --- End diff --
    
    It is trivial, but maybe `spark` -> `Spark`? I am not an expert in grammar, but to my knowledge, capitalizing a proper noun is correct.




[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    Could you fix the PR title too while you are online maybe? It might be nice to have a good title for both a commit log and those who like to track down the history.




[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

Posted by HarshSharma8 <gi...@git.apache.org>.
Github user HarshSharma8 commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    Hello Sean,
    I have updated the content with back-ticks. Can you have a look at this?
    Also, I am not sure which object name you are asking about.
    
    
    Thank You
    
    





[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    Can one of the admins verify this patch?




[GitHub] spark pull request #16997: Updated the SQL programming guide to explain abou...

Posted by HarshSharma8 <gi...@git.apache.org>.
Github user HarshSharma8 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16997#discussion_r102011677
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
     types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
     registered as a table. Tables can be used in subsequent SQL statements.
     
    +Spark Encoders are used to convert a JVM object to Spark SQL representation. When we want to make a datase, Spark requires an encoder which takes the form Encoder[T] where T is the type we want to be encoded. When we try to create dataset with a custom type of object, then may result into <b>java.lang.UnsupportedOperationException: No Encoder found for Object-Name</b>.
    --- End diff --
    
    Hello srowen,
    I have updated the content to match the voice of the other text; you can have another look at it.




[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

Posted by HarshSharma8 <gi...@git.apache.org>.
Github user HarshSharma8 commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    I updated the content with a demo object. I would appreciate it if anyone could have a look at this.




[GitHub] spark issue #16997: Updated the Spark SQL Programming guide with Custom obje...

Posted by HarshSharma8 <gi...@git.apache.org>.
Github user HarshSharma8 commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    Did anyone get a chance to verify it, or are any changes required from me?




[GitHub] spark issue #16997: Updated the Spark SQL Programming guide with Custom obje...

Posted by HarshSharma8 <gi...@git.apache.org>.
Github user HarshSharma8 commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    Hello HyukjinKwon,
    I have updated the title; I hope you like it. It reflects what is in the content. The commit has already been made.




[GitHub] spark pull request #16997: Updated the Spark SQL Programming guide with Cust...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/16997




[GitHub] spark issue #16997: Updated the Spark SQL Programming guide with Custom obje...

Posted by HarshSharma8 <gi...@git.apache.org>.
Github user HarshSharma8 commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    Sure, and thanks for your kind attention to this pull request.
    
    
    Thank You
    
    
    
    On Sun, Mar 5, 2017 at 10:13 PM, Sean Owen <no...@github.com> wrote:
    
    > This still has formatting and text problems. I'm sorry I don't think I can
    > go around again for this when it's not an important change, and I'd like to
    > close this.





[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    BTW, could we maybe make the title complete (not `opera…`)?




[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

Posted by HarshSharma8 <gi...@git.apache.org>.
Github user HarshSharma8 commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    Hello Sean,
    I apologize for using bold instead of back-ticks, and I'm updating the content
    for this.
    
    
    Thank You
    
    
    
    On Tue, Feb 21, 2017 at 10:58 AM, Sean Owen <no...@github.com>
    wrote:
    
    > *@srowen* commented on this pull request.
    > ------------------------------
    >
    > In docs/sql-programming-guide.md
    > <https://github.com/apache/spark/pull/16997#discussion_r102134397>:
    >
    > > @@ -297,6 +297,9 @@ reflection and become the names of the columns. Case classes can also be nested
    >  types such as `Seq`s or `Array`s. This RDD can be implicitly converted to a DataFrame and then be
    >  registered as a table. Tables can be used in subsequent SQL statements.
    >
    > +Spark Encoders are used to convert a JVM object to Spark SQL representation. To create dataset, spark requires an encoder which takes the form of <b>Encoder[T]</b> where <b>T</b> is the type which has to be encoded. Creation of a dataset with a custom type of object, may result into <b>java.lang.UnsupportedOperationException: No Encoder found for Object-Name</b>.
    >
    > Yes, @HarshSharma8 <https://github.com/HarshSharma8> this still doesn't
    > address the comments. Use back-ticks for code, not bold, too. What is
    > Object-Name?





[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    You are still bold-facing code elements, and have now back-ticked a string, which isn't code. There are still typos like "create dataset" instead of "create a Dataset". Do you mean to write something to indicate a class name will be in the message? Then write something like "[class name]". There is no object name here. Please review carefully before you ask for another review.




[GitHub] spark issue #16997: Updated the Spark SQL Programming guide with Custom obje...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/16997
  
    This still has formatting and text problems. I'm sorry I don't think I can go around again for this when it's not an important change, and I'd like to close this.

