You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by pfcoperez <gi...@git.apache.org> on 2017/02/06 16:51:40 UTC

[GitHub] spark pull request #16823: [SPARK] Config methods simplification at SparkSes...

GitHub user pfcoperez opened a pull request:

    https://github.com/apache/spark/pull/16823

    [SPARK] Config methods simplification at SparkSession#Builder

    ## What changes were proposed in this pull request?
    
    `SparkSession`' companion object `Builder` inner class presents several implementations for config setter which are exact copies of the same method just varying the value input type.
    
    There are setters for `Double`, `Long` and `Boolean`. And the three of them transform the input value into an string via the default `.toString` method.
    
    These are basic types and, therefore, child classes of `AnyVal`. Why explicitly enumerating them in a set of methods replicating code?
    
    The `config` implementation for `String` values can be invoke by a single method leveraging Scala's generic types with type bounds:
    
    `def config[T <: AnyVal](key: String, value: T): Builder = config(key, value.toString)`
    
    This PR tries to: 
    
    - Reduce the amount of replicated code. That implies that further code updates will require modifying different methods with the dangers derived from possible oversights.
    - Increase the code readability.
    - Add all Scala basic types as possible inputs for the builder configuration pseudo-dsl implemented by `SparkSession#Builder`
    
    ## How was this patch tested?
    
    This PRs changes the implementation of already tested, by the current test suites. Hence, no more testing has been added as no actual functionality is added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pfcoperez/spark sparksession_buildersimplification

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16823.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16823
    
----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16823: [SPARK] Config methods simplification at SparkSession#Bu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/16823
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16823: [SPARK] Config methods simplification at SparkSession#Bu...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/16823
  
    Agree, though we are talking about duplicating 1 line of code in 3 nearby places. It's not meaningfully duplicating anything.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16823: [SPARK] Config methods simplification at SparkSession#Bu...

Posted by pfcoperez <gi...@git.apache.org>.
Github user pfcoperez commented on the issue:

    https://github.com/apache/spark/pull/16823
  
    @andrewor14 OK, no discussion: I'll close it.
    
    Just an observation:
    
    ![image](https://cloud.githubusercontent.com/assets/273379/22659656/bd4933ca-ec9e-11e6-8cba-e32fc7cbd4a2.png)
    
    
    `AnyVal` is not  `Any` hence: "any objects" will hardly fit in the input type. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16823: [SPARK] Config methods simplification at SparkSession#Bu...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/16823
  
    Yeah, non-starter I'm afraid.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #16823: [SPARK] Config methods simplification at SparkSes...

Posted by pfcoperez <gi...@git.apache.org>.
Github user pfcoperez closed the pull request at:

    https://github.com/apache/spark/pull/16823


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16823: [SPARK] Config methods simplification at SparkSession#Bu...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the issue:

    https://github.com/apache/spark/pull/16823
  
    This is a bad idea! First it breaks backward compatibility, and second, we intentionally didn't want to make it so general that the user can pass in any objects. Can you please close this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #16823: [SPARK] Config methods simplification at SparkSession#Bu...

Posted by pfcoperez <gi...@git.apache.org>.
Github user pfcoperez commented on the issue:

    https://github.com/apache/spark/pull/16823
  
    @andrewor14 @srowen In any case, I just wanted to add that copying code is, basically the worst strategy. 
    
    If you wanted to constraint the types for those tree and not just **AnyVal** sub-classes I would recommend doing something as:
    
    ```
    def config(key: String, value: Double): Builder = config(key, value.toString)
    def config(key: String, value: Boolean): Builder = config(key, value.toString)
    def config(key: String, value: Long): Builder = config(key, value.toString)		
    ```
    
    Exactly same interface, no copy-paste code. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org