Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/11 02:47:49 UTC

[GitHub] [spark] jsoref opened a new pull request #30323: Spelling

jsoref opened a new pull request #30323:
URL: https://github.com/apache/spark/pull/30323


   Some thoughts:
   * This PR is too big (I know this) -- typically large projects ask to split changes like this up somehow, but I leave the how to the project -- it might be by top-level directory, by file type, or by "comments", then "locals", then "public APIs", or something else
   * I keep distinct commits by word family because it makes rebasing, dealing with conflicts, and dropping controversial changes easier.
   * My general preference is to squash at the very last minute (or let the project do it)
   * There are definitely some changes to Java final classes -- I expect to be asked to drop these
   
   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   Misspelled words make content harder to read and understand.
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   I believe so. Unfortunately, this PR exceeds GitHub's limits, which will make it hard for everyone to review. I'll try to leave at least some review marks to call things out.
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   
   The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356
   
   The action reports that the changes in this PR would make it happy: https://github.com/jsoref/spark/commit/9f88454aa3eb1ff5af0c867fcd619c51d6999188
   
   Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521049841



##########
File path: common/network-common/src/main/java/org/apache/spark/network/util/AbstractFileRegion.java
##########
@@ -24,7 +24,7 @@
 
   @Override
   @SuppressWarnings("deprecation")
-  public final long transfered() {

Review comment:
       This should be kept for compatibility I believe.
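    
       For reference, a minimal Scala sketch of the keep-for-compatibility pattern being suggested (the class shape and version string are illustrative, not the actual file; only the two method names come from it):
    
    ```scala
    // Keep the misspelled method as a deprecated shim that delegates to the
    // corrected name, so existing callers keep compiling and behaving the same.
    abstract class FileRegionCompat {
      /** Correctly spelled method that new code should call. */
      def transferred(): Long
    
      /** Legacy misspelling, kept final so subclasses can't diverge from it. */
      @deprecated("Use transferred() instead", "illustrative-version")
      final def transfered(): Long = transferred()
    }
    ```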





[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521503874



##########
File path: core/src/main/resources/org/apache/spark/ui/static/dataTables.rowsGroup.js
##########
@@ -166,7 +166,7 @@ RowsGroup.prototype = {
 			this._mergeColumn(newSequenceRow, (iRow-1), columnsIndexesCopy)
 	},
 	
-	_toogleDirection: function(dir)
+	_toggleDirection: function(dir)

Review comment:
       Can you list which files I should be excluding?





[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-741319257


   So, we're left with three commits:
   * 0ff10b7ee2a13df02ba2ab5c4c5c86851a531c08 - `create`
   * 6cbd74162a77fdc4c04c867952ec42c6e493288b - `enabled`
   * d0a58cc5fdb8956f41e2161a5d1f4f6d90c40353 - `filters`
   
   I'm going to move `enabled` and `filters` into distinct PRs because I suspect they're more likely to be API breaks.
   
   This PR can be closed once `create` (the last commit standing here) is merged.
   
   I'm not entirely certain the other two will be merged, and even if they are, they're probably much more complicated than generic `spelling` fixes (as I suspect they'll require shims).



[GitHub] [spark] srowen commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725592297


   Hm, could have user-facing implications. I'd leave it (here).



[GitHub] [spark] holdenk commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
holdenk commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725159710


   How would you feel about doing this by subproject? We could start with graph processing, where it's less actively developed?



[GitHub] [spark] srowen commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r523814877



##########
File path: sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/OperationState.java
##########
@@ -31,7 +31,7 @@
   CANCELED(TOperationState.CANCELED_STATE, true),
   CLOSED(TOperationState.CLOSED_STATE, true),
   ERROR(TOperationState.ERROR_STATE, true),
-  UNKNOWN(TOperationState.UKNOWN_STATE, false),
+  UNKNOWN(TOperationState.UNKNOWN_STATE, false),

Review comment:
       Yeah I just wouldn't touch the Hive code here for now. Anything non-trivial can be dealt with later.





[GitHub] [spark] AmplabJenkins commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725094389


   Can one of the admins verify this patch?



[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-735313703


   Things I know will remain:
   
   * `create` (it spans multiple directories) -- we can address this after the current open PRs resolve (probably using this PR)
   * `enabled` (`legacy_setops_precedence_enbled` appears to be a public API -- addressing it would be done as its own distinct PR, if at all. One approach is to add the correct spelling, make it the preferred form, and add a deprecation annotation to the current spelling; another is to just add a comment acknowledging the API botch -- see the sketch below.)
   
   I think that's everything, but we'll see.
   
   Note to self: I changed one `E.g.` to `For example` (which matches a second one I changed the same way earlier), so I'm going to have a merge conflict; resolution: drop.
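   
   A minimal sketch of the deprecate-and-alias approach, assuming a plain Scala constant (the object name and version string are hypothetical; only the misspelled identifier is from the codebase):
   
   ```scala
   // Hypothetical: expose the corrected name as the preferred form and keep
   // the misspelled identifier as a deprecated alias for compatibility.
   object LegacyKeywords {
     val LEGACY_SETOPS_PRECEDENCE_ENABLED = "legacy_setops_precedence_enabled"
   
     @deprecated("Use LEGACY_SETOPS_PRECEDENCE_ENABLED instead", "hypothetical")
     val LEGACY_SETOPS_PRECEDENCE_ENBLED = LEGACY_SETOPS_PRECEDENCE_ENABLED
   }
   ```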



[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-727642668


   So, one thing I noticed: there are a bunch of places where the line-length limit (100 chars) is already violated. My tentative plan is not to fix those (out of scope). Where I'm adding a new comment, I'll try to avoid introducing new errors of that category.



[GitHub] [spark] srowen commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
srowen commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-728981618


   Sure. Chunks of 100 files or so, whatever feels logical.



[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521767390



##########
File path: sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/OperationState.java
##########
@@ -31,7 +31,7 @@
   CANCELED(TOperationState.CANCELED_STATE, true),
   CLOSED(TOperationState.CLOSED_STATE, true),
   ERROR(TOperationState.ERROR_STATE, true),
-  UNKNOWN(TOperationState.UKNOWN_STATE, false),
+  UNKNOWN(TOperationState.UNKNOWN_STATE, false),

Review comment:
       Apparently this is downstream of Hive: https://github.com/apache/hive/search?q=UKNOWN_STATE, so I'll need to change that first.
   
       I'll drop this shortly. If you have any advice on steps I should take with that project, I'm open to it. Otherwise, I'll do something similar to what I did here (it took a number of days to produce this initial PR).

##########
File path: docs/monitoring.md
##########
@@ -421,7 +421,7 @@ to handle the Spark Context setup and tear down.
 
 In addition to viewing the metrics in the UI, they are also available as JSON.  This gives developers
 an easy way to create new visualizations and monitoring tools for Spark.  The JSON is available for
-both running applications, and in the history server.  The endpoints are mounted at `/api/v1`.  Eg.,
+both running applications, and in the history server.  The endpoints are mounted at `/api/v1`.  For example

Review comment:
       https://github.com/apache/spark/pull/30323#discussion_r521447840

##########
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala
##########
@@ -35,7 +35,7 @@ import org.apache.spark.util.Utils
 
 case class TestData(key: Int, value: String)
 
-case class ThreeCloumntable(key: Int, value: String, key1: String)
+case class ThreeColumntable(key: Int, value: String, key1: String)

Review comment:
       Would this be ok?
   ```suggestion
   case class ThreeColumnTable(key: Int, value: String, key1: String)
   ```





[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521500938



##########
File path: core/src/main/scala/org/apache/spark/util/Utils.scala
##########
@@ -1090,20 +1090,20 @@ private[spark] object Utils extends Logging {
     }
     // checks if the hostport contains IPV6 ip and parses the host, port
     if (hostPort != null && hostPort.split(":").length > 2) {
-      val indx: Int = hostPort.lastIndexOf("]:")

Review comment:
       I imagine that this PR will be split out into at least one more slice (probably after the first split PR is merged).





[GitHub] [spark] zero323 commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521756500



##########
File path: R/pkg/tests/fulltests/test_sparkSQL.R
##########
@@ -2066,7 +2066,7 @@ test_that("higher order functions", {
     createDataFrame(data.frame(id = 1)),
     expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"),
     expr("CAST(array(0.0, 3.0, 48.0) AS array<double>) ys"),
-    expr("array('FAILED', 'SUCCEDED') as vs"),
+    expr("array('FAILED', 'SUCCEEDED') as vs"),

Review comment:
       The test here is based on equivalence between SQL expressions and the R implementation, so it is data independent. 





[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725166959


   I presume you mean https://github.com/apache/spark/tree/master/graphx/ ?



[GitHub] [spark] srowen commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521507218



##########
File path: core/src/main/resources/org/apache/spark/ui/static/dataTables.rowsGroup.js
##########
@@ -166,7 +166,7 @@ RowsGroup.prototype = {
 			this._mergeColumn(newSequenceRow, (iRow-1), columnsIndexesCopy)
 	},
 	
-	_toogleDirection: function(dir)
+	_toggleDirection: function(dir)

Review comment:
       Most of the JavaScript is third-party, except for a few files that are obviously specific to the Spark UI. That should cover most of it. There are some copied Hive classes here and there, but I don't think you're touching those.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725094389


   Can one of the admins verify this patch?



[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521079811



##########
File path: common/network-common/src/main/java/org/apache/spark/network/util/AbstractFileRegion.java
##########
@@ -24,7 +24,7 @@
 
   @Override
   @SuppressWarnings("deprecation")
-  public final long transfered() {

Review comment:
       Yeah (I noticed that as I was reviewing -- https://github.com/apache/spark/pull/30323#discussion_r521050706 -- it takes a long time to write the full review)





[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-741488008


   See comments by @srowen @cloud-fan on 0ff10b7ee2a13df02ba2ab5c4c5c86851a531c08



[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521502299



##########
File path: R/pkg/tests/fulltests/test_sparkSQL.R
##########
@@ -2066,7 +2066,7 @@ test_that("higher order functions", {
     createDataFrame(data.frame(id = 1)),
     expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"),
     expr("CAST(array(0.0, 3.0, 48.0) AS array<double>) ys"),
-    expr("array('FAILED', 'SUCCEDED') as vs"),
+    expr("array('FAILED', 'SUCCEEDED') as vs"),

Review comment:
       The lack of a second instance of the string. I'm pretty sure it's a misspelling, but usually such code changes would be pairwise (the string would appear in at least two places).





[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r523825893



##########
File path: mllib/src/main/scala/org/apache/spark/ml/feature/Binarizer.scala
##########
@@ -112,7 +112,7 @@ final class Binarizer @Since("1.4.0") (@Since("1.4.0") override val uid: String)
         (Seq($(inputCol)), Seq($(outputCol)), Seq($(threshold)))
       }
 
-    val ouputCols = inputColNames.zip(tds).map { case (inputColName, td) =>
+    val mappedOutputCols = inputColNames.zip(tds).map { case (inputColName, td) =>

Review comment:
       There's an `outputCols` member a few lines above; the oddly named variable here accidentally didn't collide with it...
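    
       A hypothetical reduction of the shape of the code (not the actual Binarizer source) to show why a plain rename would be risky:
    
    ```scala
    class Example {
      // Stands in for the existing `outputCols` a few lines above.
      val outputCols: Seq[String] = Seq("a", "b")
    
      def transform(): Seq[String] = {
        // Correcting the typo `ouputCols` to `outputCols` here would shadow
        // the member above within this scope, so a distinct name is safer.
        val mappedOutputCols = outputCols.map(_ + "_binarized")
        mappedOutputCols
      }
    }
    ```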





[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725120497


   * I can rebase regularly or infrequently (depends on project preference).
   * I can update my changes regularly or only rarely.
   -- for large changes like this, the odds of conflicts are very high (which is part of why I prefer to work with small commits -- rebuilding them individually is generally fairly easy)



[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521738296



##########
File path: core/src/main/resources/org/apache/spark/ui/static/dataTables.rowsGroup.js
##########
@@ -166,7 +166,7 @@ RowsGroup.prototype = {
 			this._mergeColumn(newSequenceRow, (iRow-1), columnsIndexesCopy)
 	},
 	
-	_toogleDirection: function(dir)
+	_toggleDirection: function(dir)

Review comment:
       I'm going to drop the changes for `core/src/main/resources/org/apache/spark/ui/static/dataTables.rowsGroup.js` in:
   3c3a8dc388
   0fcc89b7fa
   





[GitHub] [spark] srowen closed pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
srowen closed pull request #30323:
URL: https://github.com/apache/spark/pull/30323


   



[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521746547



##########
File path: python/pyspark/cloudpickle/cloudpickle_fast.py
##########
@@ -556,7 +556,7 @@ def dump(self, obj):
         # `dispatch` attribute.  Earlier versions of the protocol 5 CloudPickler
         # used `CloudPickler.dispatch` as a class-level attribute storing all
         # reducers implemented by cloudpickle, but the attribute name was not a
-        # great choice given the meaning of `Cloudpickler.dispatch` when
+        # great choice given the meaning of `Cloudpickle.dispatch` when

Review comment:
       Hmmm, looks like it should be `CloudPickler.dispatch` and not `Cloudpickler.dispatch`.





[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-739987352


   Fwiw, these are the [outstanding changes in this PR](https://github.com/jsoref/spark/compare/spelling-sql-core..jsoref:spelling). I expect this PR to be closed with just one commit `create`. And I'll create a distinct PR for `legacy_setops_precedence_enbled` (`enabled`) and `PushedFilers` (`filters`) which, for the time being, I've added to this series.



[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-728721289


   OK, so, part one is merged.
   `bin`, `repl`, and `streaming` seem to be the next least recently touched...
   
   What do you recommend doing next?



[GitHub] [spark] srowen commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521447840



##########
File path: docs/monitoring.md
##########
@@ -421,7 +421,7 @@ to handle the Spark Context setup and tear down.
 
 In addition to viewing the metrics in the UI, they are also available as JSON.  This gives developers
 an easy way to create new visualizations and monitoring tools for Spark.  The JSON is available for
-both running applications, and in the history server.  The endpoints are mounted at `/api/v1`.  Eg.,
+both running applications, and in the history server.  The endpoints are mounted at `/api/v1`.  E.g.,

Review comment:
       You're welcome to write out "For example" in a case like this, where "E.g." isn't really correct usage to start a sentence. But, fine to leave it as is here.

##########
File path: mllib/src/main/scala/org/apache/spark/mllib/clustering/DistanceMeasure.scala
##########
@@ -78,7 +78,7 @@ private[spark] abstract class DistanceMeasure extends Serializable {
   /**
    * Compute distance between centers in a distributed way.
    */
-  def computeStatisticsDistributedly(
+  def computeStatisticsDistributively(

Review comment:
       Although "Distributedly" is a weird neologism, I don't think it's wrong, or at least, "Distributively" sounds weirder. I'd revert this.

##########
File path: core/src/main/scala/org/apache/spark/util/Utils.scala
##########
@@ -1090,20 +1090,20 @@ private[spark] object Utils extends Logging {
     }
     // checks if the hostport contains IPV6 ip and parses the host, port
     if (hostPort != null && hostPort.split(":").length > 2) {
-      val indx: Int = hostPort.lastIndexOf("]:")

Review comment:
       Likewise, not necessary, but I don't mind fixing a few odd local var names. That said, it's also fine to skip these to keep this large change from getting even larger.

##########
File path: core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
##########
@@ -968,7 +968,7 @@ private[spark] object JsonProtocolSuite extends Assertions {
   private val stackTrace = {
     Array[StackTraceElement](
       new StackTraceElement("Apollo", "Venus", "Mercury", 42),
-      new StackTraceElement("Afollo", "Vemus", "Mercurry", 420),
+      new StackTraceElement("Afollo", "Vemus", "Mercury", 420),
       new StackTraceElement("Ayollo", "Vesus", "Blackberry", 4200)

Review comment:
       I think it's intentional, for whatever reason. You can leave it or add a comment

##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala
##########
@@ -81,7 +81,7 @@ class UnsupportedOperationsSuite extends SparkFunSuite {
 
   // Commands
   assertNotSupportedInStreamingPlan(
-    "commmands",
+    "commands",
     DummyCommand(),
     outputMode = Append,
     expectedMsgs = "commands" :: Nil)

Review comment:
       I'd just leave it

##########
File path: python/pyspark/mllib/regression.py
##########
@@ -739,7 +739,7 @@ def _validate(self, dstream):
                 "dstream should be a DStream object, got %s" % type(dstream))
         if not self._model:
             raise ValueError(
-                "Model must be intialized using setInitialWeights")
+                "Model must be initialized using setInitialWeights")

Review comment:
       Yes, use this instead of 'initialised'

##########
File path: core/src/test/scala/org/apache/spark/deploy/worker/WorkerSuite.scala
##########
@@ -342,7 +342,7 @@ class WorkerSuite extends SparkFunSuite with Matchers with BeforeAndAfter {
     testWorkDirCleanupAndRemoveMetadataWithConfig(true)
   }
 
-  test("WorkdDirCleanup cleans only app dirs when" +
+  test("WorkDirCleanup cleans only app dirs when" +

Review comment:
       What's suspicious? Looks like a clean typo fix.

##########
File path: common/network-common/src/test/java/org/apache/spark/network/crypto/AuthEngineSuite.java
##########
@@ -150,8 +150,8 @@ public void testEncryptedMessage() throws Exception {
 
       ByteArrayWritableChannel channel = new ByteArrayWritableChannel(data.length);
       TransportCipher.EncryptedMessage emsg = handler.createEncryptedMessage(buf);
-      while (emsg.transfered() < emsg.count()) {
-        emsg.transferTo(channel, emsg.transfered());
+      while (emsg.transferred() < emsg.count()) {
+        emsg.transferTo(channel, emsg.transferred());

Review comment:
       Yeah, looks like this was misspelled in netty and there is a new correctly-spelled version, which we can use safely.

##########
File path: core/src/main/resources/org/apache/spark/ui/static/dataTables.rowsGroup.js
##########
@@ -166,7 +166,7 @@ RowsGroup.prototype = {
 			this._mergeColumn(newSequenceRow, (iRow-1), columnsIndexesCopy)
 	},
 	
-	_toogleDirection: function(dir)
+	_toggleDirection: function(dir)

Review comment:
       Yeah I wouldn't bother fixing these; if we update it'll get overwritten anyway

##########
File path: R/pkg/tests/fulltests/test_sparkSQL.R
##########
@@ -2066,7 +2066,7 @@ test_that("higher order functions", {
     createDataFrame(data.frame(id = 1)),
     expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"),
     expr("CAST(array(0.0, 3.0, 48.0) AS array<double>) ys"),
-    expr("array('FAILED', 'SUCCEDED') as vs"),
+    expr("array('FAILED', 'SUCCEEDED') as vs"),

Review comment:
       What's the issue here?

##########
File path: R/pkg/tests/fulltests/test_jvm_api.R
##########
@@ -20,11 +20,11 @@ context("JVM API")
 sparkSession <- sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE)
 
 test_that("Create and call methods on object", {
-  jarr <- sparkR.newJObject("java.util.ArrayList")
+  jarray <- sparkR.newJObject("java.util.ArrayList")

Review comment:
       I don't think you _have_ to change these as they're more abbreviation than typo, but if there aren't many, it's OK. 

##########
File path: core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
##########
@@ -73,7 +73,7 @@ class NettyRpcEnvSuite extends RpcEnvSuite with MockitoSugar with TimeLimits {
 
     val nettyEnv = env.asInstanceOf[NettyRpcEnv]
     val client = mock[TransportClient]
-    val senderAddress = RpcAddress("locahost", 12345)
+    val senderAddress = RpcAddress("localhost", 12345)

Review comment:
       LOL may not matter for the test but yes

##########
File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
##########
@@ -58,10 +58,10 @@ private[sql] class AvroDeserializer(
 
   private lazy val decimalConversions = new DecimalConversion()
 
-  private val dateRebaseFunc = DataSourceUtils.creteDateRebaseFuncInRead(
+  private val dateRebaseFunc = DataSourceUtils.createDateRebaseFuncInRead(
     datetimeRebaseMode, "Avro")
 
-  private val timestampRebaseFunc = DataSourceUtils.creteTimestampRebaseFuncInRead(
+  private val timestampRebaseFunc = DataSourceUtils.createTimestampRebaseFuncInRead(

Review comment:
       LOL I'm sure it is not. Just like the POSIX function `creat` was a mistake so many years ago.

##########
File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java
##########
@@ -303,7 +303,7 @@ public void close() {
   @Override
   public String toString() {
     return new ToStringBuilder(this, ToStringStyle.SHORT_PREFIX_STYLE)
-      .append("remoteAdress", channel.remoteAddress())
+      .append("remoteAddress", channel.remoteAddress())

Review comment:
       This is probably fine; it's just the string representation and I can't see anyone relying on the toString of TransportClient

##########
File path: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java
##########
@@ -93,7 +93,7 @@
    * Maximum time (in ms) to wait for a child process to connect back to the launcher server
    * when using @link{#start()}.
    */
-  public static final String CHILD_CONNECTION_TIMEOUT = "spark.launcher.childConectionTimeout";
+  public static final String CHILD_CONNECTION_TIMEOUT = "spark.launcher.childConnectionTimeout";

Review comment:
       Yeah... unfortunately I would leave this as is here. We can deprecate this and recognize the correctly spelled version, but that could be separate.
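    
       A minimal sketch of what "deprecate and recognize the correctly spelled version" could look like (the helper and default are hypothetical; only the two key spellings come from the code):
    
    ```scala
    object LauncherConfCompat {
      // Hypothetical fallback: prefer the corrected key, but still honor the
      // legacy misspelled key if that's all the user has set.
      def childConnectionTimeoutMs(conf: Map[String, String]): Long =
        conf.get("spark.launcher.childConnectionTimeout")
          .orElse(conf.get("spark.launcher.childConectionTimeout")) // legacy typo
          .map(_.toLong)
          .getOrElse(10000L) // illustrative default
    }
    ```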

##########
File path: project/MimaExcludes.scala
##########
@@ -1488,7 +1488,7 @@ object MimaExcludes {
       ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.SparkEnv.getThreadLocal"),
       ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.rdd.RDDFunctions.treeReduce"),
       ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.rdd.RDDFunctions.treeAggregate"),
-      ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.defaultStategy"),
+      ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.defaultStrategy"),

Review comment:
       Likewise.

##########
File path: python/pyspark/mllib/clustering.py
##########
@@ -843,7 +843,7 @@ def setInitialCenters(self, centers, weights):
     @since('1.5.0')
     def setRandomCenters(self, dim, weight, seed):
         """
-        Set the initial centres to be random samples from
+        Set the initial centers to be random samples from

Review comment:
       Yes it's British spelling. We generally use US spelling. This is OK but also OK to ignore.

##########
File path: python/pyspark/cloudpickle/cloudpickle_fast.py
##########
@@ -556,7 +556,7 @@ def dump(self, obj):
         # `dispatch` attribute.  Earlier versions of the protocol 5 CloudPickler
         # used `CloudPickler.dispatch` as a class-level attribute storing all
         # reducers implemented by cloudpickle, but the attribute name was not a
-        # great choice given the meaning of `Cloudpickler.dispatch` when
+        # great choice given the meaning of `Cloudpickle.dispatch` when

Review comment:
       This is correct already, it seems; `CloudPickler` is a thing.

##########
File path: mllib/src/main/scala/org/apache/spark/ml/feature/Selector.scala
##########
@@ -77,7 +77,7 @@ private[feature] trait SelectorParams extends Params
    * @group param
    */
   @Since("3.1.0")
-  final val fpr = new DoubleParam(this, "fpr", "The higest p-value for features to be kept.",
+  final val fpr = new DoubleParam(this, "fpr", "The highest p-value for features to be kept.",

Review comment:
       Not in any meaningful API sense. This is fine, just docs

##########
File path: project/MimaExcludes.scala
##########
@@ -730,7 +730,7 @@ object MimaExcludes {
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment"),
 
     // [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
-    ProblemFilters.exclude[FinalMethodProblem]("org.apache.spark.network.util.AbstractFileRegion.transfered"),
+    ProblemFilters.exclude[FinalMethodProblem]("org.apache.spark.network.util.AbstractFileRegion.transferred"),

Review comment:
       This you probably need to leave as is

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
##########
@@ -566,7 +566,7 @@ object RewriteCorrelatedScalarSubquery extends Rule[LogicalPlan] with AliasHelpe
                 subqueryRoot = Project(projList ++ havingInputs, subqueryRoot)
               case s @ SubqueryAlias(alias, _) =>
                 subqueryRoot = SubqueryAlias(alias, subqueryRoot)
-              case op => sys.error(s"Unexpected operator $op in corelated subquery")
+              case op => sys.error(s"Unexpected operator $op in correlated subquery")

Review comment:
       It's fine.

##########
File path: python/pyspark/ml/regression.py
##########
@@ -1442,7 +1442,7 @@ def setParams(self, *, featuresCol="features", labelCol="label", predictionCol="
                   maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
                   maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
                   checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
-                  impuriy="variance", featureSubsetStrategy="all", validationTol=0.01,
+                  impurity="variance", featureSubsetStrategy="all", validationTol=0.01,

Review comment:
       Yikes, looks like a bug. I'm not sure this would have had effect before, as it mismatches the param name. @HyukjinKwon are there implications to changing this in python?

##########
File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java
##########
@@ -486,7 +486,7 @@ public void readDoubles(
     }
   }
 
-  public void readBinarys(
+  public void readBinaries(

Review comment:
       Yeah I wouldn't change this here (unfortunately) even though it's probably effectively private to Spark





[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r523813885



##########
File path: sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/OperationState.java
##########
@@ -31,7 +31,7 @@
   CANCELED(TOperationState.CANCELED_STATE, true),
   CLOSED(TOperationState.CLOSED_STATE, true),
   ERROR(TOperationState.ERROR_STATE, true),
-  UNKNOWN(TOperationState.UKNOWN_STATE, false),
+  UNKNOWN(TOperationState.UNKNOWN_STATE, false),

Review comment:
       apache/hive#1674 https://github.com/apache/hive/pull/1674/commits/40b81969b47d310363f14fb751e7fb7db7bc10d3





[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521049587



##########
File path: R/pkg/tests/fulltests/test_sparkSQL.R
##########
@@ -2066,7 +2066,7 @@ test_that("higher order functions", {
     createDataFrame(data.frame(id = 1)),
     expr("CAST(array(1.0, 2.0, -3.0, -4.0) AS array<double>) xs"),
     expr("CAST(array(0.0, 3.0, 48.0) AS array<double>) ys"),
-    expr("array('FAILED', 'SUCCEDED') as vs"),
+    expr("array('FAILED', 'SUCCEEDED') as vs"),

Review comment:
       This worries me

##########
File path: R/CRAN_RELEASE.md
##########
@@ -25,7 +25,7 @@ To release SparkR as a package to CRAN, we would use the `devtools` package. Ple
 
 First, check that the `Version:` field in the `pkg/DESCRIPTION` file is updated. Also, check for stale files not under source control.
 
-Note that while `run-tests.sh` runs `check-cran.sh` (which runs `R CMD check`), it is doing so with `--no-manual --no-vignettes`, which skips a few vignettes or PDF checks - therefore it will be preferred to run `R CMD check` on the source package built manually before uploading a release. Also note that for CRAN checks for pdf vignettes to success, `qpdf` tool must be there (to install it, eg. `yum -q -y install qpdf`).
+Note that while `run-tests.sh` runs `check-cran.sh` (which runs `R CMD check`), it is doing so with `--no-manual --no-vignettes`, which skips a few vignettes or PDF checks - therefore it will be preferred to run `R CMD check` on the source package built manually before uploading a release. Also note that for CRAN checks for pdf vignettes to success, `qpdf` tool must be there (to install it, e.g. `yum -q -y install qpdf`).

Review comment:
       I'm happy to drop `e.g.` and `i.e.`

##########
File path: core/src/test/scala/org/apache/spark/deploy/worker/WorkerSuite.scala
##########
@@ -342,7 +342,7 @@ class WorkerSuite extends SparkFunSuite with Matchers with BeforeAndAfter {
     testWorkDirCleanupAndRemoveMetadataWithConfig(true)
   }
 
-  test("WorkdDirCleanup cleans only app dirs when" +
+  test("WorkDirCleanup cleans only app dirs when" +

Review comment:
       This is suspicious

##########
File path: core/src/main/scala/org/apache/spark/scheduler/BarrierJobAllocationFailed.scala
##########
@@ -45,10 +45,10 @@ private[spark] object BarrierJobAllocationFailed {
   val ERROR_MESSAGE_RUN_BARRIER_WITH_UNSUPPORTED_RDD_CHAIN_PATTERN =
     "[SPARK-24820][SPARK-24821]: Barrier execution mode does not allow the following pattern of " +
       "RDD chain within a barrier stage:\n1. Ancestor RDDs that have different number of " +
-      "partitions from the resulting RDD (eg. union()/coalesce()/first()/take()/" +
+      "partitions from the resulting RDD (e.g. union()/coalesce()/first()/take()/" +

Review comment:
       This is some form of output change

##########
File path: dev/create-release/translate-contributors.py
##########
@@ -207,7 +207,7 @@ def generate_candidates(author, issues):
             print(p)
         # In interactive mode, additionally provide "custom" option and await user response
         if INTERACTIVE_MODE:
-            print("    [%d] %s - Raw Github username" % (raw_index, author))
+            print("    [%d] %s - Raw GitHub username" % (raw_index, author))

Review comment:
       This is user facing

##########
File path: docs/sql-ref-syntax-dml-insert-into.md
##########
@@ -100,29 +100,29 @@ SELECT * FROM students;
 ```sql
 -- Assuming the persons table has already been created and populated.
 SELECT * FROM persons;
-+-------------+-------------------------+---------+
-|         name|                  address|      ssn|
-+-------------+-------------------------+---------+
-|Dora Williams|134 Forest Ave, Melo Park|123456789|
-+-------------+-------------------------+---------+
-|  Eddie Davis|  245 Market St, Milpitas|345678901|
-+-------------+-------------------------+---------+
++-------------+--------------------------+---------+
+|         name|                   address|      ssn|
++-------------+--------------------------+---------+
+|Dora Williams|134 Forest Ave, Menlo Park|123456789|

Review comment:
       This address actually exists, and the cities are all in the Silicon Valley area, so there's no good reason to misspell the city.

##########
File path: project/MimaExcludes.scala
##########
@@ -730,7 +730,7 @@ object MimaExcludes {
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment"),
 
     // [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
-    ProblemFilters.exclude[FinalMethodProblem]("org.apache.spark.network.util.AbstractFileRegion.transfered"),
+    ProblemFilters.exclude[FinalMethodProblem]("org.apache.spark.network.util.AbstractFileRegion.transferred"),

Review comment:
       ?

##########
File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java
##########
@@ -486,7 +486,7 @@ public void readDoubles(
     }
   }
 
-  public void readBinarys(
+  public void readBinaries(

Review comment:
       this might be a frozen public api...

##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala
##########
@@ -422,7 +422,7 @@ class StringExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     checkEvaluation(SoundEx(Literal("Uhrbach")), "U612")
     checkEvaluation(SoundEx(Literal("Moskowitz")), "M232")
     checkEvaluation(SoundEx(Literal("Moskovitz")), "M213")
-    checkEvaluation(SoundEx(Literal("relyheewsgeessg")), "R422")
+    checkEvaluation(SoundEx(Literal("reltheewsgeessg")), "R422")

Review comment:
       I didn't mean to change this (will drop)

##########
File path: common/network-common/src/test/java/org/apache/spark/network/crypto/AuthEngineSuite.java
##########
@@ -150,8 +150,8 @@ public void testEncryptedMessage() throws Exception {
 
       ByteArrayWritableChannel channel = new ByteArrayWritableChannel(data.length);
       TransportCipher.EncryptedMessage emsg = handler.createEncryptedMessage(buf);
-      while (emsg.transfered() < emsg.count()) {
-        emsg.transferTo(channel, emsg.transfered());
+      while (emsg.transferred() < emsg.count()) {
+        emsg.transferTo(channel, emsg.transferred());

Review comment:
       I hope I can keep this

##########
File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java
##########
@@ -303,7 +303,7 @@ public void close() {
   @Override
   public String toString() {
     return new ToStringBuilder(this, ToStringStyle.SHORT_PREFIX_STYLE)
-      .append("remoteAdress", channel.remoteAddress())
+      .append("remoteAddress", channel.remoteAddress())

Review comment:
       This is probably a dangerous change

##########
File path: core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
##########
@@ -73,7 +73,7 @@ class NettyRpcEnvSuite extends RpcEnvSuite with MockitoSugar with TimeLimits {
 
     val nettyEnv = env.asInstanceOf[NettyRpcEnv]
     val client = mock[TransportClient]
-    val senderAddress = RpcAddress("locahost", 12345)
+    val senderAddress = RpcAddress("localhost", 12345)

Review comment:
       This is scary

##########
File path: common/network-common/src/test/java/org/apache/spark/network/util/TransportFrameDecoderSuite.java
##########
@@ -98,7 +98,7 @@ public void testConsolidationPerf() throws Exception {
             writtenBytes += pieceBytes;
           }
           logger.info("Writing 300MiB frame buf with consolidation of threshold " + threshold
-              + " took " + totalTime + " milis");
+              + " took " + totalTime + " millis");

Review comment:
       This is a logging output change

##########
File path: core/src/main/resources/org/apache/spark/ui/static/dataTables.rowsGroup.js
##########
@@ -166,7 +166,7 @@ RowsGroup.prototype = {
 			this._mergeColumn(newSequenceRow, (iRow-1), columnsIndexesCopy)
 	},
 	
-	_toogleDirection: function(dir)
+	_toggleDirection: function(dir)

Review comment:
       I try to skip resources that come from third-party repos -- if this or anything else falls into that category, please let me know.

##########
File path: common/network-common/src/main/java/org/apache/spark/network/util/AbstractFileRegion.java
##########
@@ -24,7 +24,7 @@
 
   @Override
   @SuppressWarnings("deprecation")
-  public final long transfered() {
+  public final long transferred() {

Review comment:
       Oops, I should have dropped this

##########
File path: core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala
##########
@@ -968,7 +968,7 @@ private[spark] object JsonProtocolSuite extends Assertions {
   private val stackTrace = {
     Array[StackTraceElement](
       new StackTraceElement("Apollo", "Venus", "Mercury", 42),
-      new StackTraceElement("Afollo", "Vemus", "Mercurry", 420),
+      new StackTraceElement("Afollo", "Vemus", "Mercury", 420),
       new StackTraceElement("Ayollo", "Vesus", "Blackberry", 4200)

Review comment:
       I'm not sure what the goal of this blob is (`BlackBerry` shouldn't be spelled the way it is here, either). If things are intentionally misspelled, it'd be great if there were a comment justifying the nonstandard use.

##########
File path: dev/github_jira_sync.py
##########
@@ -157,7 +157,7 @@ def reset_pr_labels(pr_num, jira_components):
     considered = considered + [pr_num]
 
     url = pr['html_url']
-    title = "[Github] Pull Request #%s (%s)" % (pr['number'], pr['user']['login'])
+    title = "[GitHub] Pull Request #%s (%s)" % (pr['number'], pr['user']['login'])

Review comment:
       This is likely user-facing (as is the block above).

##########
File path: dev/create-release/release-build.sh
##########
@@ -452,7 +452,7 @@ if [[ "$1" == "publish-release" ]]; then
 
   if ! is_dry_run; then
     nexus_upload=$NEXUS_ROOT/deployByRepositoryId/$staged_repo_id
-    echo "Uplading files to $nexus_upload"
+    echo "Uploading files to $nexus_upload"

Review comment:
       This is public-facing, by some weak definition.

##########
File path: dev/appveyor-guide.md
##########
@@ -33,22 +33,22 @@ Currently, SparkR on Windows is being tested with [AppVeyor](https://ci.appveyor
     
   <img width="379" alt="2016-09-04 11 07 58" src="https://cloud.githubusercontent.com/assets/6477701/18228810/2f674e5e-7299-11e6-929d-5c2dff269ddc.png">
 
-- Click "Github".
+- Click "GitHub".
 
   <img width="360" alt="2016-09-04 11 08 10" src="https://cloud.githubusercontent.com/assets/6477701/18228811/344263a0-7299-11e6-90b7-9b1c7b6b8b01.png">
 
 
-#### After signing up, go to profile to link Github and AppVeyor.
+#### After signing up, go to profile to link GitHub and AppVeyor.
 
 - Click your account and then click "Profile".
 
   <img width="204" alt="2016-09-04 11 09 43" src="https://cloud.githubusercontent.com/assets/6477701/18228803/12a4b810-7299-11e6-9140-5cfc277297b1.png">
 
-- Enable the link with GitHub via clicking "Link Github account".
+- Enable the link with GitHub via clicking "Link GitHub account".

Review comment:
       I hope no one is publicly misspelling the GitHub brand.

##########
File path: core/src/main/scala/org/apache/spark/rpc/netty/NettyRpcEnv.scala
##########
@@ -254,14 +254,14 @@ private[netty] class NettyRpcEnv(
 
       val timeoutCancelable = timeoutScheduler.schedule(new Runnable {
         override def run(): Unit = {
-          val remoteReceAddr = if (remoteAddr == null) {
+          val remoteRecAddr = if (remoteAddr == null) {

Review comment:
       This is odd

##########
File path: docs/sql-ref-syntax-dml-insert-into.md
##########
@@ -69,11 +69,11 @@ INSERT INTO students VALUES
     ('Amy Smith', '123 Park Ave, San Jose', 111111);
 
 SELECT * FROM students;
-+---------+---------------------+----------+
-|     name|              address|student_id|
-+---------+---------------------+----------+
-|Amy Smith|123 Park Ave,San Jose|    111111|
-+---------+---------------------+----------+
++---------+----------------------+----------+
+|     name|               address|student_id|
++---------+----------------------+----------+
+|Amy Smith|123 Park Ave, San Jose|    111111|

Review comment:
       This is odd. The `INSERT` above doesn't match the output w/o this change.

##########
File path: dev/create-release/translate-contributors.py
##########
@@ -104,12 +104,12 @@
 
 def generate_candidates(author, issues):
     candidates = []
-    # First check for full name of Github user
+    # First check for full name of GitHub user
     github_name = get_github_name(author, github_client)
     if github_name:
-        candidates.append((github_name, "Full name of Github user %s" % author))
+        candidates.append((github_name, "Full name of GitHub user %s" % author))
     else:
-        candidates.append((NOT_FOUND, "No full name found for Github user %s" % author))
+        candidates.append((NOT_FOUND, "No full name found for GitHub user %s" % author))

Review comment:
       This is probably user-facing.

##########
File path: examples/src/main/python/sql/arrow.py
##########
@@ -285,7 +285,7 @@ def asof_join(l, r):
     ser_to_frame_pandas_udf_example(spark)
     print("Running pandas_udf example: Series to Series")
     ser_to_ser_pandas_udf_example(spark)
-    print("Running pandas_udf example: Iterator of Series to Iterator of Seires")
+    print("Running pandas_udf example: Iterator of Series to Iterator of Series")

Review comment:
       This one is printed output from an example -- user-facing?

##########
File path: docs/sql-ref-syntax-qry-select-orderby.md
##########
@@ -28,7 +28,7 @@ clause, this clause guarantees a total order in the output.
 ### Syntax
 
 ```sql
-ORDER BY { expression [ sort_direction | nulls_sort_oder ] [ , ... ] }
+ORDER BY { expression [ sort_direction | nulls_sort_order ] [ , ... ] }

Review comment:
       This is scary

##########
File path: launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java
##########
@@ -93,7 +93,7 @@
    * Maximum time (in ms) to wait for a child process to connect back to the launcher server
    * when using @link{#start()}.
    */
-  public static final String CHILD_CONNECTION_TIMEOUT = "spark.launcher.childConectionTimeout";
+  public static final String CHILD_CONNECTION_TIMEOUT = "spark.launcher.childConnectionTimeout";

Review comment:
       This is a public constant, and its value is a user-facing config key...
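
   A minimal sketch of why (hypothetical user code -- the jar path and timeout value are placeholders):

   ```scala
   import org.apache.spark.launcher.SparkLauncher

   // Existing user code may spell the key out as a string literal instead of
   // referencing the constant, so after a rename it would keep setting the
   // old (misspelled) key -- which nothing reads any more.
   val launcher = new SparkLauncher()
     .setAppResource("/path/to/app.jar")
     .setConf("spark.launcher.childConectionTimeout", "10000")
   ```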

##########
File path: licenses-binary/LICENSE-jsp-api.txt
##########
@@ -722,7 +722,7 @@ Oracle facilitates your further distribution of this package by adding
 the Classpath Exception to the necessary parts of its GPLv2 code, which
 permits you to use that code in combination with other independent
 modules not licensed under the GPLv2.  However, note that this would
-not permit you to commingle code under an incompatible license with
+not permit you to comingle code under an incompatible license with

Review comment:
       Oops -- I try to avoid touching license files (and `commingle` is a valid spelling anyway); I'll drop this.

##########
File path: python/pyspark/context.py
##########
@@ -258,7 +258,7 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
                         sys.path.insert(1, filepath)
                 except Exception:
                     warnings.warn(
-                        "Failed to add file [%s] speficied in 'spark.submit.pyFiles' to "
+                        "Failed to add file [%s] specified in 'spark.submit.pyFiles' to "

Review comment:
       User-facing?

##########
File path: project/MimaExcludes.scala
##########
@@ -1488,7 +1488,7 @@ object MimaExcludes {
       ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.SparkEnv.getThreadLocal"),
       ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.rdd.RDDFunctions.treeReduce"),
       ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.rdd.RDDFunctions.treeAggregate"),
-      ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.defaultStategy"),
+      ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.mllib.tree.configuration.Strategy.defaultStrategy"),

Review comment:
       Same question as the earlier `MimaExcludes` change above.

##########
File path: mllib/src/main/scala/org/apache/spark/ml/feature/Selector.scala
##########
@@ -77,7 +77,7 @@ private[feature] trait SelectorParams extends Params
    * @group param
    */
   @Since("3.1.0")
-  final val fpr = new DoubleParam(this, "fpr", "The higest p-value for features to be kept.",
+  final val fpr = new DoubleParam(this, "fpr", "The highest p-value for features to be kept.",

Review comment:
       Is this public in some special way?

##########
File path: python/pyspark/ml/regression.py
##########
@@ -1442,7 +1442,7 @@ def setParams(self, *, featuresCol="features", labelCol="label", predictionCol="
                   maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
                   maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
                   checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
-                  impuriy="variance", featureSubsetStrategy="all", validationTol=0.01,
+                  impurity="variance", featureSubsetStrategy="all", validationTol=0.01,

Review comment:
       Is this keyword argument part of the public API?

##########
File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
##########
@@ -58,10 +58,10 @@ private[sql] class AvroDeserializer(
 
   private lazy val decimalConversions = new DecimalConversion()
 
-  private val dateRebaseFunc = DataSourceUtils.creteDateRebaseFuncInRead(
+  private val dateRebaseFunc = DataSourceUtils.createDateRebaseFuncInRead(
     datetimeRebaseMode, "Avro")
 
-  private val timestampRebaseFunc = DataSourceUtils.creteTimestampRebaseFuncInRead(
+  private val timestampRebaseFunc = DataSourceUtils.createTimestampRebaseFuncInRead(

Review comment:
       I'm really uncertain about this -- maybe `crete` is a thing??

##########
File path: python/pyspark/mllib/regression.py
##########
@@ -739,7 +739,7 @@ def _validate(self, dstream):
                 "dstream should be a DStream object, got %s" % type(dstream))
         if not self._model:
             raise ValueError(
-                "Model must be intialized using setInitialWeights")
+                "Model must be initialized using setInitialWeights")

Review comment:
       Just a note that this is en-US.

##########
File path: python/pyspark/cloudpickle/cloudpickle_fast.py
##########
@@ -556,7 +556,7 @@ def dump(self, obj):
         # `dispatch` attribute.  Earlier versions of the protocol 5 CloudPickler
         # used `CloudPickler.dispatch` as a class-level attribute storing all
         # reducers implemented by cloudpickle, but the attribute name was not a
-        # great choice given the meaning of `Cloudpickler.dispatch` when
+        # great choice given the meaning of `Cloudpickle.dispatch` when

Review comment:
       Not sure about this one -- `cloudpickle` is vendored from upstream, and the new `Cloudpickle.dispatch` still doesn't match the `CloudPickler` spelling used just above.

##########
File path: python/pyspark/mllib/clustering.py
##########
@@ -843,7 +843,7 @@ def setInitialCenters(self, centers, weights):
     @since('1.5.0')
     def setRandomCenters(self, dim, weight, seed):
         """
-        Set the initial centres to be random samples from
+        Set the initial centers to be random samples from

Review comment:
       If this is an en-GB thing, I'll drop it (I try not to enforce a particular style; this repo appears very mixed; I think this change was made by an automated pass).

##########
File path: python/pyspark/ml/regression.pyi
##########
@@ -477,7 +477,7 @@ class GBTRegressor(
         maxIter: int = ...,
         stepSize: float = ...,
         seed: Optional[int] = ...,
-        impuriy: str = ...,
+        impurity: str = ...,

Review comment:
       Same question -- is this keyword part of the public API?

##########
File path: python/pyspark/sql/utils.py
##########
@@ -145,10 +145,10 @@ def toJArray(gateway, jtype, arr):
     :param jtype: java type of element in array
     :param arr: python type list
     """
-    jarr = gateway.new_array(jtype, len(arr))
+    jarray = gateway.new_array(jtype, len(arr))

Review comment:
       `jarray` is used for the same sort of thing in other files

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -420,7 +420,7 @@ private[spark] object Config extends Logging {
   val KUBERNETES_FILE_UPLOAD_PATH =
     ConfigBuilder("spark.kubernetes.file.upload.path")
       .doc("Hadoop compatible file system path where files from the local file system " +
-        "will be uploded to in cluster mode.")
+        "will be uploaded to in cluster mode.")

Review comment:
       User-facing.

##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/KubernetesVolumeUtilsSuite.scala
##########
@@ -49,14 +49,14 @@ class KubernetesVolumeUtilsSuite extends SparkFunSuite {
     val sparkConf = new SparkConf(false)
     sparkConf.set("test.persistentVolumeClaim.volumeName.mount.path", "/path")
     sparkConf.set("test.persistentVolumeClaim.volumeName.mount.readOnly", "true")
-    sparkConf.set("test.persistentVolumeClaim.volumeName.options.claimName", "claimeName")

Review comment:
       No idea why this was misspelled -- it's just an arbitrary test value.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala
##########
@@ -180,7 +180,7 @@ private[spark] class ExecutorPodsAllocator(
     // It's possible that we have outstanding pods that are outdated when dynamic allocation
     // decides to downscale the application. So check if we can release any pending pods early
     // instead of waiting for them to time out. Drop them first from the unacknowledged list,
-    // then from the pending. However, in order to prevent too frequent frunctuation, newly

Review comment:
       No idea what word was intended here -- `frunctuation` for `fluctuation`, presumably?

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
##########
@@ -566,7 +566,7 @@ object RewriteCorrelatedScalarSubquery extends Rule[LogicalPlan] with AliasHelpe
                 subqueryRoot = Project(projList ++ havingInputs, subqueryRoot)
               case s @ SubqueryAlias(alias, _) =>
                 subqueryRoot = SubqueryAlias(alias, subqueryRoot)
-              case op => sys.error(s"Unexpected operator $op in corelated subquery")
+              case op => sys.error(s"Unexpected operator $op in correlated subquery")

Review comment:
       User-facing -- this is an error message.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
##########
@@ -165,7 +165,7 @@ object DataSourceUtils {
       "Gregorian calendar.", null)
   }
 
-  def creteDateRebaseFuncInRead(
+  def createDateRebaseFuncInRead(

Review comment:
       See the earlier note about whether `crete` is a thing.

##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala
##########
@@ -81,7 +81,7 @@ class UnsupportedOperationsSuite extends SparkFunSuite {
 
   // Commands
   assertNotSupportedInStreamingPlan(
-    "commmands",
+    "commands",
     DummyCommand(),
     outputMode = Append,
     expectedMsgs = "commands" :: Nil)

Review comment:
       My change here is probably wrong. Would `commands_` be ok?
   Often you insert a randomly uppercased letter to indicate you're expecting an error...
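
   For illustration, a hedged variation of the quoted test (same helper as in the diff; the `Commands_` name is a made-up placeholder, not taken from the suite):

   ```scala
   // The fixture name is visibly wrong on purpose: a stray capital letter or
   // trailing underscore reads as "expected to fail", not as an accidental
   // typo for a spelling pass to fix.
   assertNotSupportedInStreamingPlan(
     "Commands_",
     DummyCommand(),
     outputMode = Append,
     expectedMsgs = "commands" :: Nil)
   ```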

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -1116,7 +1116,7 @@ object SQLConf {
 
   val CODEGEN_FACTORY_MODE = buildConf("spark.sql.codegen.factoryMode")
     .doc("This config determines the fallback behavior of several codegen generators " +
-      "during tests. `FALLBACK` means trying codegen first and then fallbacking to " +
+      "during tests. `FALLBACK` means trying codegen first and then falling back to " +

Review comment:
       User-facing...
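
   A sketch of how the `.doc()` text reaches users, assuming the config is exposed (internal configs are filtered out of `SET -v`, in which case this prints nothing; `spark` is the usual shell session):

   ```scala
   import org.apache.spark.sql.functions.col

   // `SET -v` returns (key, value, meaning) rows; "meaning" is the .doc()
   // string, so a typo in it shows up verbatim in user sessions.
   spark.sql("SET -v")
     .filter(col("key") === "spark.sql.codegen.factoryMode")
     .select("meaning")
     .show(truncate = false)
   ```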

##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala
##########
@@ -77,7 +77,7 @@ class ErrorParserSuite extends AnalysisTest {
   }
 
   test("SPARK-21136: misleading error message due to problematic antlr grammar") {
-    intercept("select * from a left joinn b on a.id = b.id", "missing 'JOIN' at 'joinn'")
+    intercept("select * from a left join_ b on a.id = b.id", "missing 'JOIN' at 'join_'")

Review comment:
       I'm assuming this is ok (it's in the family of the `commmands` thing above)

##########
File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
##########
@@ -1316,7 +1316,7 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     )
   }
 
-  test("oder by asc by default when not specify ascending and descending") {

Review comment:
       Not sure about this one -- is renaming a test title ok?

##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala
##########
@@ -124,8 +124,8 @@ class DataTypeParserSuite extends SparkFunSuite {
   unsupported("struct<x int, y string>")
 
   test("Do not print empty parentheses for no params") {
-    assert(intercept("unkwon").getMessage.contains("unkwon is not supported"))
-    assert(intercept("unkwon(1,2,3)").getMessage.contains("unkwon(1,2,3) is not supported"))

Review comment:
       Does this need to be misspelled?

##########
File path: sql/core/src/test/java/test/org/apache/spark/sql/Java8DatasetAggregatorSuite.java
##########
@@ -34,43 +34,43 @@
   @Test
   public void testTypedAggregationAverage() {
     KeyValueGroupedDataset<String, Tuple2<String, Integer>> grouped = generateGroupedDataset();
-    Dataset<Tuple2<String, Double>> agged = grouped.agg(
+    Dataset<Tuple2<String, Double>> aggregated = grouped.agg(

Review comment:
       Java doesn't have to be terse...

##########
File path: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
##########
@@ -719,7 +719,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
     "groupby_multi_insert_common_distinct",
     "groupby_multi_single_reducer2",
     "groupby_multi_single_reducer3",
-    "groupby_mutli_insert_common_distinct",

Review comment:
       Is this a reference to another repo?

##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala
##########
@@ -783,7 +783,7 @@ class DDLParserSuite extends AnalysisTest with SharedSparkSession {
       "escape.delim" -> "y",
       "serialization.format" -> "x",
       "line.delim" -> "\n",
-      "colelction.delim" -> "a", // yes, it's a typo from Hive :)
+      "collection.delim" -> "a", // yes, it's a typo from Hive :)

Review comment:
       Oops, will drop.

##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveOptions.scala
##########
@@ -108,7 +108,7 @@ object HiveOptions {
     "fieldDelim" -> "field.delim",
     "escapeDelim" -> "escape.delim",
     // The following typo is inherited from Hive...
-    "collectionDelim" -> "colelction.delim",
+    "collectionDelim" -> "collection.delim",

Review comment:
       Will drop.

##########
File path: sql/core/src/test/resources/sql-tests/inputs/postgreSQL/insert.sql
##########
@@ -589,26 +589,26 @@ drop table inserttest;
 -- that estate->es_result_relation_info is appropriately set/reset for each
 -- routed tuple)
 -- [SPARK-29718] Support PARTITION BY [RANGE|LIST|HASH] and PARTITION OF in CREATE TABLE
--- create table donothingbrtrig_test (a int, b text) partition by list (a);
--- create table donothingbrtrig_test1 (b text, a int);
--- create table donothingbrtrig_test2 (c text, b text, a int);
--- alter table donothingbrtrig_test2 drop column c;
--- create or replace function donothingbrtrig_func() returns trigger as $$begin raise notice 'b: %', new.b; return NULL; end$$ language plpgsql;
--- create trigger donothingbrtrig1 before insert on donothingbrtrig_test1 for each row execute procedure donothingbrtrig_func();
--- create trigger donothingbrtrig2 before insert on donothingbrtrig_test2 for each row execute procedure donothingbrtrig_func();
--- alter table donothingbrtrig_test attach partition donothingbrtrig_test1 for values in (1);
--- alter table donothingbrtrig_test attach partition donothingbrtrig_test2 for values in (2);
--- insert into donothingbrtrig_test values (1, 'foo'), (2, 'bar');
+-- create table do nothingbrtrig_test (a int, b text) partition by list (a);
+-- create table do nothingbrtrig_test1 (b text, a int);
+-- create table do nothingbrtrig_test2 (c text, b text, a int);
+-- alter table do nothingbrtrig_test2 drop column c;
+-- create or replace function do nothingbrtrig_func() returns trigger as $$begin raise notice 'b: %', new.b; return NULL; end$$ language plpgsql;
+-- create trigger do nothingbrtrig1 before insert on do nothingbrtrig_test1 for each row execute procedure do nothingbrtrig_func();
+-- create trigger do nothingbrtrig2 before insert on do nothingbrtrig_test2 for each row execute procedure do nothingbrtrig_func();
+-- alter table do nothingbrtrig_test attach partition do nothingbrtrig_test1 for values in (1);
+-- alter table do nothingbrtrig_test attach partition do nothingbrtrig_test2 for values in (2);
+-- insert into do nothingbrtrig_test values (1, 'foo'), (2, 'bar');
 -- [SPARK-29386] Copy data between a file and a table
--- copy donothingbrtrig_test from stdout;
+-- copy do nothingbrtrig_test from stdout;

Review comment:
       Oops, didn't mean to do this (will drop).

##########
File path: sql/hive/src/test/resources/ql/src/test/queries/clientpositive/overridden_confs.q
##########
@@ -1,4 +1,4 @@
 set hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.VerifyOverriddenConfigsHook;
-set hive.config.doesnt.exit=abc;
+set hive.config.does_not.exit=abc;

Review comment:
       Probably need to drop this one?

##########
File path: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
##########
@@ -961,8 +961,8 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
     "subq2",
     "subquery_exists",
     "subquery_exists_having",
-    "subquery_notexists",
-    "subquery_notexists_having",

Review comment:
       ibid?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] bersprockets commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
bersprockets commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725568719


   I also noticed ```pushedFilers```, which I would imagine should be ```pushedFilters```:
   
   ```bash
   bash-3.2$ find . -name "*.scala" | xargs grep 'PushedFilers'
   find . -name "*.scala" | xargs grep 'PushedFilers'
   ./external/avro/src/main/scala/org/apache/spark/sql/v2/avro/AvroScan.scala:    super.getMetaData() ++ Map("PushedFilers" -> seqToString(pushedFilters))
   ./external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala:           |PushedFilers: \\[IsNotNull\\(value\\), GreaterThan\\(value,2\\)\\]
   ./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/csv/CSVScan.scala:    super.getMetaData() ++ Map("PushedFilers" -> seqToString(pushedFilters))
   ./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcScan.scala:    super.getMetaData() ++ Map("PushedFilers" -> seqToString(pushedFilters))
   ./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/parquet/ParquetScan.scala:    super.getMetaData() ++ Map("PushedFilers" -> seqToString(pushedFilters))
   ./sql/core/src/test/scala/org/apache/spark/sql/ExplainSuite.scala:            "|PushedFilers: \\[.*\\(id\\), .*\\(value\\), .*\\(id,1\\), .*\\(value,2\\)\\]",
   ./sql/core/src/test/scala/org/apache/spark/sql/ExplainSuite.scala:            "|PushedFilers: \\[.*\\(id\\), .*\\(value\\), .*\\(id,1\\), .*\\(value,2\\)\\]",
   ./sql/core/src/test/scala/org/apache/spark/sql/ExplainSuite.scala:            "|PushedFilers: \\[IsNotNull\\(value\\), GreaterThan\\(value,2\\)\\]",
   bash-3.2$ 
   ```
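
   For reference, a minimal self-contained sketch of the pattern those lines follow (class and method names below are stand-ins, not the real Spark types): the scan publishes its pushed-down filters in a metadata map, and the map key is what gets displayed -- hence the user-visible typo:

   ```scala
   // Stand-in for the scan hierarchy, just to show where the key surfaces.
   trait ScanLike {
     def getMetaData(): Map[String, String] = Map.empty
   }

   class ExampleScan(pushedFilters: Seq[String]) extends ScanLike {
     private def seqToString(s: Seq[String]): String = s.mkString("[", ", ", "]")

     // The string literal below is the display key -- fixing its spelling
     // changes EXPLAIN output, which is why it's worth flagging.
     override def getMetaData(): Map[String, String] =
       super.getMetaData() ++ Map("PushedFilters" -> seqToString(pushedFilters))
   }
   ```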


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725094691


   Can one of the admins verify this patch?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jsoref commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r523816167



##########
File path: sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/OperationState.java
##########
@@ -31,7 +31,7 @@
   CANCELED(TOperationState.CANCELED_STATE, true),
   CLOSED(TOperationState.CLOSED_STATE, true),
   ERROR(TOperationState.ERROR_STATE, true),
-  UNKNOWN(TOperationState.UKNOWN_STATE, false),
+  UNKNOWN(TOperationState.UNKNOWN_STATE, false),

Review comment:
       Yeah, in the process of dropping this (currently reviewing the build failures)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-725160571


   Fine by me


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #30323:
URL: https://github.com/apache/spark/pull/30323#discussion_r521053593



##########
File path: core/src/main/scala/org/apache/spark/util/Utils.scala
##########
@@ -1090,20 +1090,20 @@ private[spark] object Utils extends Logging {
     }
     // checks if the hostport contains IPV6 ip and parses the host, port
     if (hostPort != null && hostPort.split(":").length > 2) {
-      val indx: Int = hostPort.lastIndexOf("]:")

Review comment:
       This is just a variable name. I think it's fine to use a short name.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jsoref commented on pull request #30323: Spelling

Posted by GitBox <gi...@apache.org>.
jsoref commented on pull request #30323:
URL: https://github.com/apache/spark/pull/30323#issuecomment-741414066


   I've split out some of the things that I dropped earlier. I'm going to leave the rest alone for now (I've left the ones we might revisit as "unresolved" to make them easier to spot).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org