You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jaceklaskowski <gi...@git.apache.org> on 2015/09/14 20:58:48 UTC

[GitHub] spark pull request: Introduce config constants object

GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/8753

    Introduce config constants object

    A small refactoring to introduce a Scala object to keep property/environment names in a single place for YARN cluster deployment first (as I hate seeing strings all over the code, and most importantly avoid any typos in the future). I couldn't resist after reviewing c0052d8d09eebadadb5ed35ac512caaf73919551.
    
    If it gets accepted I'm going to move all the configuration names to the file. I did the first step and want to be told I'm right or get corrected.
    
    (It's also to verify my understanding of how the pull request process works in the project.)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark yarn-config-params

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8753.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8753
    
----
commit 0bb05999a736dac1b6350d075a4ea95d2db3ef32
Author: Jacek Laskowski <ja...@deepsense.io>
Date:   2015-09-14T18:46:00Z

    Introduce config constants object

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140206085
  
    Type and default value are good points. That adds a lot over merely factoring out the name.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140199709
  
    (I think that one was closed from no activity, which maybe says something. It was more about recording consistent default values?)
    
    The upshot is certainly avoiding typos and there's no denying that helps. A lot of typos would be caught because stuff breaks if the key is wrong, but you never know.
    
    A minor objection is simply that it's an extra layer of indirection, which is itself a small cost paid repeatedly.  My slightly bigger objection is that this "spell checking" is being implemented as compile-time dependencies and that causes some small trouble one way or another. If all your constants live in one big file, then it has to be in a "core" module yet has constants for every module in the project. If you localize constants, in my experience, you eventually hit a point where you add a dependency across to another module just to import its constants, which has a cost too. Or you duplicate the key anyway. And you can't use this to avoid typos in non-code artifacts like docs. Better than nothing but is it a false sense of security?
    
    The poor practical argument against is that it's worth doing consistently if at all, and doing it consistently could be a huge change.
    
    Anyway that's why I personally don't like constants for constants. I'm sure reasonable people disagree. Maybe there is a subset of the code where it's an obvious win? lots of error potential, only in-module dependencies, code-only references, etc


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140205806
  
    It's not just typos; it encapsulates several things about the config option in a single place:
    - name
    - type
    - default value
    
    Right now, every place where a config option is used needs to duplicate those three things for the code to be consistent.
    
    If you add documentation to those 3, as `SQLConf` does, you could write a simple tool to automatically generate documentation for those options, removing yet another place where duplication might occur.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-143797926
  
    @jaceklaskowski could you close this PR please? thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140414831
  
    I think it's fine to highlight this discussion on the user list (can just point to it) and ask if anyone has more opinions. It's OK to close this PR but still continue a bit of discussion on it. If there's some reasonable expectation someone might work on it, make a JIRA outlining the change and then start working on it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140183180
  
    @jaceklaskowski A lot of your PRs do not have JIRAs. That's OK for something very trivial but I'd have to push back on you now to follow https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
    
    I think this doesn't add enough value to change. It puts the prop value another hop away from its usage, and doesn't make a lot of sense to do this for just one property. I agree it helps avoid typos, but also introduces dependencies, and if taken to its logical conclusion, almost doesn't help avoid mistakes -- you have to figure out what key is for which key.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski closed the pull request at:

    https://github.com/apache/spark/pull/8753


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140176928
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140197679
  
    I think SPARK-529 used to be a tracking bug for this, but Sean (err) closed it... I even had a patch at some point to implement it, but people didn't like it because it touched too many things.
    
    I still think something along those lines would be useful. Maybe model it after `SQLConf`, for example.
    
    Ah, and while working on that bug, I did find not just typos, but also conflicting default values in different parts of the code...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140414166
  
    Should we then move the discussion to the mailing list and/or creating a task in JIRA?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-141068481
  
    (Do you mind closing this PR? the next step would be a different discussion or change.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by jaceklaskowski <gi...@git.apache.org>.
Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140193323
  
    Is this better discussed on the user or dev mailing lists? I disagree on the use of another dependency hop using such an object, but can agree with you on doing it for just a single property.
    
    I'm surprised that such large code base doesn't have a central place for all property names. Their names are everywhere and kudos to the maintainers for no typos (I at least haven't spot one so far)!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Introduce config constants object

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/8753#issuecomment-140206475
  
    Yeah, I'm not necessarily arguing for this PR (haven't really looked at the code), but for the idea of centralizing this information somewhere (even if it's in a "per-module" conf object or something).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org