Posted to reviews@spark.apache.org by shijinkui <gi...@git.apache.org> on 2014/10/09 15:19:11 UTC

[GitHub] spark pull request: [SPARK-3781] code Style format and little impr...

GitHub user shijinkui opened a pull request:

    https://github.com/apache/spark/pull/2734

    [SPARK-3781] code Style format and little improvement

    1. use scala recommended usage
    2. method body's left bracket
    3. parameter list format
    4. explicit mutable collection, such as "new mutable.HashMap"
    5. others
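
    Item 4 above can be sketched as follows. This is a hypothetical word-count
    helper for illustration, not code from the patch; it shows how keeping the
    `mutable.` prefix flags mutability at every use site.

```scala
import scala.collection.mutable

object WordCount {
  // Referring to the type as `mutable.HashMap` (rather than importing
  // HashMap directly) makes the mutability explicit wherever it appears.
  def count(words: Seq[String]): mutable.HashMap[String, Int] = {
    val counts = new mutable.HashMap[String, Int]()
    for (w <- words) {
      counts(w) = counts.getOrElse(w, 0) + 1
    }
    counts
  }
}
```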

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shijinkui/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2734.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2734
    
----
commit bd103b78adebf933d3e33d4628c2394a845dbaa2
Author: shijinkui <sh...@163.com>
Date:   2014-09-29T05:34:02Z

    code style format

commit 0c9d1754f9c8bf36236f915868eb6f62863fea25
Author: shijinkui <sh...@163.com>
Date:   2014-10-03T14:21:44Z

    code format

commit 237bacc3d1c911e6475da4ad08dcd4d0031883ec
Author: shijinkui <sh...@163.com>
Date:   2014-10-08T03:05:01Z

    resolve conflict

commit 78f69b9526ea546251dac5eddf4da9c9eb6e20ad
Author: shijinkui <sh...@163.com>
Date:   2014-10-08T03:31:24Z

    code format

commit 725eec51fb7d29d0df99b92df3ef62fcec301d90
Author: 玄畅 <ji...@alibaba-inc.com>
Date:   2014-10-08T14:42:29Z

    resolve test fail

commit e54344b33b4c8cb4c1ff0dfb18a08188de464cfc
Author: 玄畅 <ji...@alibaba-inc.com>
Date:   2014-10-09T08:55:18Z

    code format

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2734#issuecomment-58536623
  
    @shijinkui I think there are some problems with what you are trying to contribute. You do not need to keep opening pull requests; update one. I believe the consensus is that broad, trivial style changes are not accepted because they disrupt merges.
    
    I think many of these changes introduce style violations, or do not seem to me to add value. I will comment individually on some of them. Maybe it's best if you start a discussion first about what if any of these would be accepted. You should also run scalastyle.




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the pull request:

    https://github.com/apache/spark/pull/2734#issuecomment-58546354
  
    I think what Sean meant to say is:
    https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide
    
    Spark has its own code style guide, which may differ from Scala's.




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by shijinkui <gi...@git.apache.org>.
Github user shijinkui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18687084
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -108,13 +112,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * @param environment Environment variables to set on worker nodes.
        */
       def this(
    -      master: String,
    -      appName: String,
    -      sparkHome: String = null,
    -      jars: Seq[String] = Nil,
    -      environment: Map[String, String] = Map(),
    -      preferredNodeLocationData: Map[String, Set[SplitInfo]] = Map()) =
    -  {
    +    master: String,
    --- End diff --
    
    I agree :+1: 




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by shijinkui <gi...@git.apache.org>.
Github user shijinkui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18686261
  
    --- Diff: core/src/main/scala/org/apache/spark/Aggregator.scala ---
    @@ -34,16 +34,15 @@ case class Aggregator[K, V, C] (
         mergeValue: (C, V) => C,
         mergeCombiners: (C, C) => C) {
     
    -  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", true)
    +  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", defaultValue = true)
     
       @deprecated("use combineValuesByKey with TaskContext argument", "0.9.0")
       def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]]): Iterator[(K, C)] =
         combineValuesByKey(iter, null)
     
    -  def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
    -                         context: TaskContext): Iterator[(K, C)] = {
    +  def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]], context: TaskContext): Iterator[(K, C)] = {
    --- End diff --
    
    Can the line length limit be increased to 120 or more?
    Screens are bigger now.
    Does a long line of code affect the compilation result or program performance?




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18656769
  
    --- Diff: core/src/main/scala/org/apache/spark/Aggregator.scala ---
    @@ -34,16 +34,15 @@ case class Aggregator[K, V, C] (
         mergeValue: (C, V) => C,
         mergeCombiners: (C, C) => C) {
     
    -  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", true)
    +  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", defaultValue = true)
     
       @deprecated("use combineValuesByKey with TaskContext argument", "0.9.0")
       def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]]): Iterator[(K, C)] =
         combineValuesByKey(iter, null)
     
    -  def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
    -                         context: TaskContext): Iterator[(K, C)] = {
    +  def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]], context: TaskContext): Iterator[(K, C)] = {
    --- End diff --
    
    Changes like this make the line longer than 100 characters, it appears. That's why it was wrapped.
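
    For reference, the wrapped form keeps every line under the 100-character
    limit. A runnable sketch of a `combineValuesByKey`-like signature
    (simplified for illustration, not Spark's actual method):

```scala
object Combine {
  // One parameter per line, indented past `def`, keeps each line
  // comfortably under the 100-character limit.
  def combineValuesByKey[K, V, C](
      iter: Iterator[(K, V)],
      createCombiner: V => C,
      mergeValue: (C, V) => C): Map[K, C] = {
    val combined = scala.collection.mutable.HashMap.empty[K, C]
    for ((k, v) <- iter) {
      combined(k) = combined.get(k) match {
        case Some(c) => mergeValue(c, v)
        case None    => createCombiner(v)
      }
    }
    combined.toMap
  }
}
```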




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18656785
  
    --- Diff: core/src/main/scala/org/apache/spark/Aggregator.scala ---
    @@ -34,16 +34,15 @@ case class Aggregator[K, V, C] (
         mergeValue: (C, V) => C,
         mergeCombiners: (C, C) => C) {
     
    -  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", true)
    +  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", defaultValue = true)
    --- End diff --
    
    Why add the parameter name here? it's obviously the default value.




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by shijinkui <gi...@git.apache.org>.
Github user shijinkui commented on the pull request:

    https://github.com/apache/spark/pull/2734#issuecomment-58612450
  
    I have added three subtasks under SPARK-3849.




[GitHub] spark pull request: [SPARK-3781] code Style format and little impr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2734#issuecomment-58507800
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18656848
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -108,13 +112,12 @@ class SparkContext(config: SparkConf) extends Logging {
        * @param environment Environment variables to set on worker nodes.
        */
       def this(
    -      master: String,
    -      appName: String,
    -      sparkHome: String = null,
    -      jars: Seq[String] = Nil,
    -      environment: Map[String, String] = Map(),
    -      preferredNodeLocationData: Map[String, Set[SplitInfo]] = Map()) =
    -  {
    +    master: String,
    --- End diff --
    
    Changing this indentation makes the start of the code body less apparent. Moving the `{` seems reasonable but not enough to justify a change IMHO.




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by shijinkui <gi...@git.apache.org>.
Github user shijinkui closed the pull request at:

    https://github.com/apache/spark/pull/2734




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2734#issuecomment-58849017
  
    Hi @shijinkui,
    
    After some discussion, we've decided that we'd like to avoid merging pull requests that make large, sweeping style changes/improvements, since these changes tend to create maintenance headaches for us by making `git blame` less useful and creating merge-conflicts when backporting to maintenance branches.  However, we'd be open to automatic style checks if they can be conditionally applied only to new / modified code (see https://issues.apache.org/jira/browse/SPARK-3849 for more details).
    
    In the meantime, do you mind closing this pull request?  Thanks!




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18656871
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -444,20 +447,20 @@ class SparkContext(config: SparkConf) extends Logging {
       // Methods for creating RDDs
     
       /** Distribute a local Scala collection to form an RDD.
    -   *
    -   * @note Parallelize acts lazily. If `seq` is a mutable collection and is
    -   * altered after the call to parallelize and before the first action on the
    -   * RDD, the resultant RDD will reflect the modified collection. Pass a copy of
    -   * the argument to avoid this.
    -   */
    +    *
    --- End diff --
    
    This is incorrect indentation. It was correct before.
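
    For reference, the original alignment puts each continuation asterisk
    directly under the first asterisk of the opening delimiter. A hypothetical
    method, for illustration only:

```scala
object Example {
  /** Distribute a local Scala collection to form a sequence.
   *
   * @note Each continuation asterisk aligns under the first
   * asterisk of the opening delimiter, Java-style.
   */
  def parallelize(seq: Seq[Int]): Seq[Int] = seq
}
```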




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by shijinkui <gi...@git.apache.org>.
Github user shijinkui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18687088
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -444,20 +447,20 @@ class SparkContext(config: SparkConf) extends Logging {
       // Methods for creating RDDs
     
       /** Distribute a local Scala collection to form an RDD.
    -   *
    -   * @note Parallelize acts lazily. If `seq` is a mutable collection and is
    -   * altered after the call to parallelize and before the first action on the
    -   * RDD, the resultant RDD will reflect the modified collection. Pass a copy of
    -   * the argument to avoid this.
    -   */
    +    *
    --- End diff --
    
    OK.





[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18689129
  
    --- Diff: core/src/main/scala/org/apache/spark/Aggregator.scala ---
    @@ -34,16 +34,15 @@ case class Aggregator[K, V, C] (
         mergeValue: (C, V) => C,
         mergeCombiners: (C, C) => C) {
     
    -  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", true)
    +  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", defaultValue = true)
    --- End diff --
    
    Yes, I understand the logic. My IDE does this automatically without my having to write it out, and I suspect that is true of many if not all other devs. I also don't see this as common practice in other Scala code. Still, I'd be interested in what other people think of this standard, specifically for `Boolean` args with default values.
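
    The convention under discussion, sketched with a hypothetical `getBoolean`
    standing in for `SparkConf.getBoolean` (not Spark's actual implementation):

```scala
object Conf {
  // Hypothetical stand-in for SparkConf.getBoolean: look up a key and
  // fall back to a caller-supplied default when the key is absent.
  def getBoolean(settings: Map[String, String], key: String, defaultValue: Boolean): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(defaultValue)

  // With the name written out, the call site documents what `true` means;
  // without it, the reader must consult the signature.
  val spill: Boolean = getBoolean(Map.empty, "spark.shuffle.spill", defaultValue = true)
}
```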




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2734#issuecomment-58555352
  
    Hi @shijinkui,
    
    Do you mind holding off on making major style refactoring patches until we've finished [SPARK-3849](https://issues.apache.org/jira/browse/SPARK-3849), which adds style-checking rules so that we don't re-introduce style problems?  If you've noticed any style issues that aren't caught by the current scalastyle checks, could you open subtasks on that issue?  Thanks!




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/2734#issuecomment-58572770
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-3781] code format and little improvemen...

Posted by shijinkui <gi...@git.apache.org>.
Github user shijinkui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2734#discussion_r18687068
  
    --- Diff: core/src/main/scala/org/apache/spark/Aggregator.scala ---
    @@ -34,16 +34,15 @@ case class Aggregator[K, V, C] (
         mergeValue: (C, V) => C,
         mergeCombiners: (C, C) => C) {
     
    -  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", true)
    +  private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", defaultValue = true)
    --- End diff --
    
    http://misto.ch/detecting-and-naming-boolean-parameters/
    A Scala plugin tip: "Detecting and Naming Boolean Parameters"


