Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/08 21:43:02 UTC

[GitHub] [spark] ueshin opened a new pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

ueshin opened a new pull request #34227:
URL: https://github.com/apache/spark/pull/34227


   ### What changes were proposed in this pull request?
   
   Uses PEP526 style variable type hints.
   
   ### Why are the changes needed?
   
   Now that we have started using newer Python syntax in the code base,
   we should use PEP526 style variable type hints.
   
   - https://www.python.org/dev/peps/pep-0526/
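
For readers unfamiliar with the two styles, here is a minimal sketch of the difference. The function and variable names are hypothetical and not taken from the Spark code base.

```python
from typing import Dict, List


def count_words_old(words):
    # type: (List[str]) -> Dict[str, int]
    # Older style: the variable's type lives in a trailing comment.
    counts = {}  # type: Dict[str, int]
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def count_words_new(words: List[str]) -> Dict[str, int]:
    # PEP 526 style: the annotation is part of the syntax (Python 3.6+).
    counts: Dict[str, int] = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts
```

Both functions behave identically at runtime; only the way the hint is spelled changes.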
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Existing tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ueshin commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726538886



##########
File path: python/pyspark/pandas/config.py
##########
@@ -246,9 +246,9 @@ def validate(self, v: Any) -> None:
         default="plotly",
         types=str,
     ),
-]  # type: List[Option]
+]
 
-_options_dict = dict(zip((option.key for option in _options), _options))  # type: Dict[str, Option]

Review comment:
       Good catch. Let's keep them.






[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940570920


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48583/
   




[GitHub] [spark] HyukjinKwon commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941797642


   Merged to master.




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725369691



##########
File path: python/pyspark/pandas/frame.py
##########
@@ -10439,8 +10443,9 @@ def gen_names(
             v: Union[Any, Sequence[Any], Dict[Name, Any], Callable[[Name], Any]],
             curnames: List[Name],
         ) -> List[Label]:
+            newnames: List[Name]
             if is_scalar(v):
-                newnames = [cast(Any, v)]  # type: List[Name]
+                newnames = [cast(Any, v)]
             elif is_list_like(v) and not is_dict_like(v):
                 newnames = list(cast(Sequence[Any], v))

Review comment:
       Couldn't these `cast` to `Name` instead? `Name` is still `Any`, but the intention might be clearer, and the casts should still be valid if the `Name` type is ever refined.
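
A small sketch of what this suggestion might look like. It assumes `Name` is an alias for `Any`, as the comment states; `to_names` is a hypothetical, simplified helper and not the actual `gen_names` implementation.

```python
from typing import Any, List, Sequence, cast

Name = Any  # assumption: mirrors the alias described in the comment


def to_names(v: Any) -> List[Name]:
    # Declare the type once, before the branches that assign the variable.
    newnames: List[Name]
    if isinstance(v, (list, tuple)):
        newnames = list(cast(Sequence[Name], v))
    else:
        # cast() is a no-op at runtime; it only informs the type checker.
        newnames = [cast(Name, v)]
    return newnames
```

If `Name` were later narrowed to something more specific, the casts would already express the intended type.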






[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941205446


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144158/
   




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940519990


   **[Test build #144105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144105/testReport)** for PR 34227 at commit [`126c89c`](https://github.com/apache/spark/commit/126c89cad9a2549b60a5da29276795d1249de1e6).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941205446


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144158/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940416929


   **[Test build #144101 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144101/testReport)** for PR 34227 at commit [`be288e2`](https://github.com/apache/spark/commit/be288e23e9d2fe995aa147d360233a68195816f4).




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941198455


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48636/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940570920


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48583/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940506892


   **[Test build #144105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144105/testReport)** for PR 34227 at commit [`126c89c`](https://github.com/apache/spark/commit/126c89cad9a2549b60a5da29276795d1249de1e6).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941205446








[GitHub] [spark] ueshin commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940418937


   @xinrong-databricks 
   
   > Does this PR aim to change pandas API on Spark only?
   
   This also includes two files in the sql module:
   - python/pyspark/sql/pandas/conversion.py
   - python/pyspark/sql/pandas/types.py




[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940506513


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48579/
   




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941164068


   **[Test build #144158 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144158/testReport)** for PR 34227 at commit [`96c2a12`](https://github.com/apache/spark/commit/96c2a120377c4f9c6722cd21b4ab4aa0d123b896).




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940432421


   **[Test build #144101 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144101/testReport)** for PR 34227 at commit [`be288e2`](https://github.com/apache/spark/commit/be288e23e9d2fe995aa147d360233a68195816f4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725404588



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -397,22 +400,24 @@ def read_csv(
     if nrows is not None:
         sdf = sdf.limit(nrows)
 
+    index_spark_column_names: List[str]
+    index_names: List[Label]
     if index_col is not None:
         if isinstance(index_col, (str, int)):
             index_col = [index_col]
         for col in index_col:
             if col not in column_labels:
                 raise KeyError(col)
         index_spark_column_names = [column_labels[col] for col in index_col]
-        index_names = [(col,) for col in index_col]  # type: List[Label]
+        index_names = [(col,) for col in index_col]
         column_labels = OrderedDict(
             (label, col) for label, col in column_labels.items() if label not in index_col
         )
     else:
         index_spark_column_names = []
         index_names = []
 
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       Is this really needed?
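
For context, the bare declarations at the top of this hunk use the PEP 526 form without a value. It lets a variable that is assigned in several branches be typed once. A minimal sketch, where `Label` is a hypothetical simplification and `split_index` is not a real pyspark function:

```python
from typing import List, Optional, Tuple, Union

Label = Tuple[str, ...]  # hypothetical simplification of the real alias


def split_index(
    index_col: Optional[Union[str, List[str]]]
) -> Tuple[List[str], List[Label]]:
    # Annotation without assignment: nothing is bound yet at runtime,
    # but the type checker now knows the type in every branch below.
    index_spark_column_names: List[str]
    index_names: List[Label]
    if index_col is not None:
        if isinstance(index_col, str):
            index_col = [index_col]
        index_spark_column_names = list(index_col)
        index_names = [(col,) for col in index_col]
    else:
        index_spark_column_names = []
        index_names = []
    return index_spark_column_names, index_names
```

Without the up-front declaration, a checker would have to infer the element type of the empty lists in the `else` branch from the other branch.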






[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940439795


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144101/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939156224


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144038/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941251177


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48636/
   




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940468039


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48579/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940529303


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144105/
   




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940416929


   **[Test build #144101 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144101/testReport)** for PR 34227 at commit [`be288e2`](https://github.com/apache/spark/commit/be288e23e9d2fe995aa147d360233a68195816f4).




[GitHub] [spark] ueshin commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939137466


   cc @zero323 @xinrong-databricks @HyukjinKwon @itholic




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725404967



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -1378,11 +1383,11 @@ def read_sql_table(
     reader.options(**options)
     sdf = reader.format("jdbc").load()
     index_spark_columns, index_names = _get_index_map(sdf, index_col)
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       As above. It seems like the type checker should be able to infer this, since it is just a constructor call.
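
To illustrate the point for a simple, non-generic class: checkers such as mypy infer a variable's type from a plain constructor call, so the explicit annotation adds nothing in that case. `Frame` below is a hypothetical stand-in, not the real `DataFrame`; whether the same holds for the class in this PR is not confirmed here.

```python
class Frame:
    """Hypothetical stand-in for a class built via a plain constructor call."""

    def __init__(self, data: list) -> None:
        self.data = data


# Explicit annotation, as in the diff.
explicit: Frame = Frame([1, 2, 3])

# No annotation: a type checker infers Frame from the constructor call.
inferred = Frame([1, 2, 3])
```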






[GitHub] [spark] SparkQA removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939137097


   **[Test build #144038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144038/testReport)** for PR 34227 at commit [`1068e7d`](https://github.com/apache/spark/commit/1068e7d2e50a3dc8fe9d727a0f5fe7dffad5fa15).




[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940439795


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144101/
   




[GitHub] [spark] ueshin commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726538712



##########
File path: python/pyspark/pandas/internal.py
##########
@@ -664,14 +664,14 @@ def __init__(
                 NATURAL_ORDER_COLUMN_NAME, F.monotonically_increasing_id()
             )
 
-        self._sdf = spark_frame  # type: SparkDataFrame
+        self._sdf: SparkDataFrame = spark_frame

Review comment:
       It's just making sure of the type.
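
A rough sketch of the same pattern on an instance attribute. The class names are hypothetical stand-ins for the real ones.

```python
class SparkDataFrame:
    """Hypothetical stand-in for the Spark DataFrame class."""


class InternalFrame:
    def __init__(self, spark_frame: SparkDataFrame) -> None:
        # Annotating the attribute pins its declared type for the whole class,
        # independent of how the right-hand side happens to be typed.
        self._sdf: SparkDataFrame = spark_frame
```

Per PEP 526, an annotation on an attribute target such as `self._sdf` is not stored in the class's `__annotations__`; it is there for the type checker and for readers.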






[GitHub] [spark] zero323 commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939196035


   It is awesome @ueshin! Thanks for working on this.




[GitHub] [spark] SparkQA removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941164068


   **[Test build #144158 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144158/testReport)** for PR 34227 at commit [`96c2a12`](https://github.com/apache/spark/commit/96c2a120377c4f9c6722cd21b4ab4aa0d123b896).




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725390299



##########
File path: python/pyspark/pandas/indexes/multi.py
##########
@@ -1150,7 +1148,7 @@ def intersection(self, other: Union[DataFrame, Series, Index, List]) -> "MultiIn
 
         index_fields = self._index_fields_for_union_like(other, func_name="intersection")
 
-        default_name = [SPARK_INDEX_NAME_FORMAT(i) for i in range(self.nlevels)]  # type: List
+        default_name: List = [SPARK_INDEX_NAME_FORMAT(i) for i in range(self.nlevels)]

Review comment:
       `List[str]`?
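
A minimal sketch of what the suggested `List[str]` refinement could look like. `index_name_format` and `default_names` are hypothetical stand-ins for `SPARK_INDEX_NAME_FORMAT` and the surrounding method, and the format string is an assumption made only for illustration.

```python
from typing import List


def index_name_format(i: int) -> str:
    # Hypothetical stand-in for SPARK_INDEX_NAME_FORMAT; the real format may differ.
    return "__index_level_{}__".format(i)


def default_names(nlevels: int) -> List[str]:
    # PEP 526 style: the element type is spelled out instead of a bare `List`.
    default_name: List[str] = [index_name_format(i) for i in range(nlevels)]
    return default_name
```

With the element type given, a checker can flag code that later puts a non-string into `default_name`, which a bare `List` would let through.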






[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941205446








[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726574531



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -1378,11 +1383,11 @@ def read_sql_table(
     reader.options(**options)
     sdf = reader.format("jdbc").load()
     index_spark_columns, index_names = _get_index_map(sdf, index_col)
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       I haven't, but it seems like it is caused by `@no_type_check` on initializer:
   
   https://github.com/apache/spark/blob/20051eb69904de6afc27fe5adb18bcc760c78701/python/pyspark/pandas/frame.py#L440
   
   
   ```python
   # test.py
   from typing import no_type_check
   
   class Foo:
       @no_type_check
       def __init__(self):
           pass
   
   reveal_type(Foo())
   ```
   
   ```
   $  mypy test.py
   test.py:9: note: Revealed type is "Any"
   ```






[GitHub] [spark] ueshin edited a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin edited a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940418937


   @xinrong-databricks 
   
   > Does this PR aim to change pandas API on Spark only?
   
   This also includes two files in sql module:
   - python/pyspark/sql/pandas/conversion.py
   - python/pyspark/sql/pandas/types.py




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940499407


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48579/
   




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940506892


   **[Test build #144105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144105/testReport)** for PR 34227 at commit [`126c89c`](https://github.com/apache/spark/commit/126c89cad9a2549b60a5da29276795d1249de1e6).




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725343767



##########
File path: python/pyspark/pandas/config.py
##########
@@ -246,9 +246,9 @@ def validate(self, v: Any) -> None:
         default="plotly",
         types=str,
     ),
-]  # type: List[Option]
+]
 
-_options_dict = dict(zip((option.key for option in _options), _options))  # type: Dict[str, Option]

Review comment:
       Do we remove these two intentionally? 






[GitHub] [spark] HyukjinKwon closed pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #34227:
URL: https://github.com/apache/spark/pull/34227


   




[GitHub] [spark] ueshin commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726588294



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -1378,11 +1383,11 @@ def read_sql_table(
     reader.options(**options)
     sdf = reader.format("jdbc").load()
     index_spark_columns, index_names = _get_index_map(sdf, index_col)
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       It still shows errors like `Need type annotation for "psdf"  [var-annotated]`, but that seems better than `@no_type_check`.
   Let me use `type: ignore[no-untyped-def]` and keep the existing variable type annotations to avoid the errors.
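
A rough sketch of the combination described here: an untyped initializer marked with `type: ignore[no-untyped-def]`, plus an explicit annotation at the call site. `Frame` and `read_table` are hypothetical names standing in for `DataFrame` and a reader function such as `read_sql_table`; the real initializer takes different arguments.

```python
from typing import Any


class Frame:
    # Hypothetical stand-in for pyspark.pandas.DataFrame.
    def __init__(self, data=None, index=None):  # type: ignore[no-untyped-def]
        # The ignore comment silences only the "untyped def" error reported
        # for this line; the initializer is not decorated with @no_type_check.
        self.data = data
        self.index = index


def read_table(rows: Any) -> Frame:
    # The explicit variable annotation is kept at the call site, as in the diff.
    psdf: Frame = Frame(data=rows)
    return psdf
```

Whether the call-site annotation is still required depends on the mypy configuration in use, so treat it as a conservative choice rather than a rule.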






[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939137097


   **[Test build #144038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144038/testReport)** for PR 34227 at commit [`1068e7d`](https://github.com/apache/spark/commit/1068e7d2e50a3dc8fe9d727a0f5fe7dffad5fa15).




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939177279


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48515/
   




[GitHub] [spark] ueshin commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726546695



##########
File path: python/pyspark/pandas/frame.py
##########
@@ -10974,7 +10979,7 @@ def quantile(psser: "Series") -> Column:
             # |[2, 3, 4]|[6, 7, 8]|
             # +---------+---------+
 
-            cols_dict = OrderedDict()  # type: OrderedDict
+            cols_dict: OrderedDict = OrderedDict()

Review comment:
       `OrderedDict` does not seem to be a generic type in Python < 3.9.
   
   ```py
   >>> d: OrderedDict[str, str]
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: 'type' object is not subscriptable
   ```
   
   I guess `Dict[str, List[Column]]` should be fine?
   Otherwise, as this is related to runtime, I'd leave it as is for now, just in case.
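
A small sketch of the `Dict[...]` alternative mentioned above, assuming plain strings in place of `Column` so the example stays self-contained. `group_by_label` is a hypothetical helper, not code from the PR.

```python
from collections import OrderedDict
from typing import Dict, Iterable, List, Tuple


def group_by_label(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    # typing.Dict accepts subscripts, and an OrderedDict instance is a valid
    # value for it because OrderedDict subclasses dict.
    cols_dict: Dict[str, List[str]] = OrderedDict()
    for label, column in pairs:
        cols_dict.setdefault(label, []).append(column)
    return cols_dict
```

The annotation only tells the checker about key and value types; the object is still an `OrderedDict` at runtime.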






[GitHub] [spark] ueshin commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726538712



##########
File path: python/pyspark/pandas/internal.py
##########
@@ -664,14 +664,14 @@ def __init__(
                 NATURAL_ORDER_COLUMN_NAME, F.monotonically_increasing_id()
             )
 
-        self._sdf = spark_frame  # type: SparkDataFrame
+        self._sdf: SparkDataFrame = spark_frame

Review comment:
       It's just to make sure of the type.

##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -1378,11 +1383,11 @@ def read_sql_table(
     reader.options(**options)
     sdf = reader.format("jdbc").load()
     index_spark_columns, index_names = _get_index_map(sdf, index_col)
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       Actually without it, `psdf` is inferred as `Any` for some reason.
   Have you experienced such a thing before?






[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939147416


   **[Test build #144038 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144038/testReport)** for PR 34227 at commit [`1068e7d`](https://github.com/apache/spark/commit/1068e7d2e50a3dc8fe9d727a0f5fe7dffad5fa15).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `        new_class = type(NameTypeHolder.short_name, (NameTypeHolder,), `




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725395298



##########
File path: python/pyspark/pandas/internal.py
##########
@@ -664,14 +664,14 @@ def __init__(
                 NATURAL_ORDER_COLUMN_NAME, F.monotonically_increasing_id()
             )
 
-        self._sdf = spark_frame  # type: SparkDataFrame
+        self._sdf: SparkDataFrame = spark_frame

Review comment:
       Out of curiosity ‒ why is this necessary?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939182956


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48515/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939182956


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48515/
   




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941185270


   **[Test build #144158 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144158/testReport)** for PR 34227 at commit [`96c2a12`](https://github.com/apache/spark/commit/96c2a120377c4f9c6722cd21b4ab4aa0d123b896).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940529303


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144105/
   




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726576342



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -1378,11 +1383,11 @@ def read_sql_table(
     reader.options(**options)
     sdf = reader.format("jdbc").load()
     index_spark_columns, index_names = _get_index_map(sdf, index_col)
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       I guess annotation is here to stay :)






[GitHub] [spark] ueshin commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726581295



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -1378,11 +1383,11 @@ def read_sql_table(
     reader.options(**options)
     sdf = reader.format("jdbc").load()
     index_spark_columns, index_names = _get_index_map(sdf, index_col)
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       
   Or use `type: ignore[no-untyped-def]` instead of `no_type_check`
   
   ```py
   # test.py
   
   class Foo:
       def __init__(self):  # type: ignore[no-untyped-def]
           pass
   
   reveal_type(Foo())
   ```
   
   ```
   % mypy test.py
   test.py:7: note: Revealed type is "test.Foo"
   ```
   
   Let me use `type: ignore[no-untyped-def]` here for now.






[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940566689


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48583/
   




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725377610



##########
File path: python/pyspark/pandas/frame.py
##########
@@ -10974,7 +10979,7 @@ def quantile(psser: "Series") -> Column:
             # |[2, 3, 4]|[6, 7, 8]|
             # +---------+---------+
 
-            cols_dict = OrderedDict()  # type: OrderedDict
+            cols_dict: OrderedDict = OrderedDict()

Review comment:
       Can we refine this? At first glance it looks like `OrderedDict[str, List[Column]]`, but I could be wrong.






[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725404265



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -397,22 +400,24 @@ def read_csv(
     if nrows is not None:
         sdf = sdf.limit(nrows)
 
+    index_spark_column_names: List[str]
+    index_names: List[Label]
     if index_col is not None:
         if isinstance(index_col, (str, int)):
             index_col = [index_col]
         for col in index_col:
             if col not in column_labels:
                 raise KeyError(col)
         index_spark_column_names = [column_labels[col] for col in index_col]
-        index_names = [(col,) for col in index_col]  # type: List[Label]
+        index_names = [(col,) for col in index_col]
         column_labels = OrderedDict(
             (label, col) for label, col in column_labels.items() if label not in index_col
         )
     else:
         index_spark_column_names = []
         index_names = []
 
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       Is this really needed?
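
For context, the hunk above also uses the PEP 526 "declare first, assign in each branch" form for `index_spark_column_names` and `index_names`. A self-contained sketch of that pattern follows; `split_index` is a hypothetical function, and `Label` is simplified here to a tuple of strings, which is an assumption about the project's alias.

```python
from typing import List, Optional, Tuple, Union

Label = Tuple[str, ...]  # assumption: simplified stand-in for the project's Label alias


def split_index(
    index_col: Optional[Union[str, List[str]]], columns: List[str]
) -> Tuple[List[str], List[Label]]:
    # Declare the types once, before the branches, so that neither assignment
    # needs its own annotation or type comment.
    index_spark_column_names: List[str]
    index_names: List[Label]
    if index_col is not None:
        if isinstance(index_col, str):
            index_col = [index_col]
        for col in index_col:
            if col not in columns:
                raise KeyError(col)
        index_spark_column_names = list(index_col)
        index_names = [(col,) for col in index_col]
    else:
        index_spark_column_names = []
        index_names = []
    return index_spark_column_names, index_names
```

A bare annotation such as `index_names: List[Label]` does not bind a value, so both branches must still assign the name before it is read.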






[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725347549



##########
File path: python/pyspark/pandas/categorical.py
##########
@@ -239,8 +239,9 @@ def add_categories(
                 FutureWarning,
             )
 
+        categories: List
         if is_list_like(new_categories):
-            categories = list(new_categories)  # type: List

Review comment:
       If this cannot be narrowed down to a specific type, we might consider using `List[Any]` explicitly (ideally we should be able to enable `disallow_any_generics` in the future).
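
A short sketch of the explicit `List[Any]` form. `as_category_list` is a hypothetical helper, and the `isinstance` check is a simplified substitute for the `is_list_like` call in the diff; the `else` branch is an assumption about how a scalar would be handled.

```python
from typing import Any, List


def as_category_list(new_categories: Any) -> List[Any]:
    # `List[Any]` states the unconstrained element type explicitly, which a
    # bare `List` only implies.
    categories: List[Any]
    if isinstance(new_categories, (list, tuple, set, range)):
        categories = list(new_categories)
    else:
        # Assumed handling of a single scalar value.
        categories = [new_categories]
    return categories
```

Spelling out `Any` keeps the code compatible with a stricter setup in which mypy's `disallow_any_generics` option rejects unparameterized generics.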




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941230846


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48636/
   




[GitHub] [spark] HyukjinKwon commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941797642


   Merged to master.




[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941164068








[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940524341


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48583/
   




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726570201



##########
File path: python/pyspark/pandas/frame.py
##########
@@ -10974,7 +10979,7 @@ def quantile(psser: "Series") -> Column:
             # |[2, 3, 4]|[6, 7, 8]|
             # +---------+---------+
 
-            cols_dict = OrderedDict()  # type: OrderedDict
+            cols_dict: OrderedDict = OrderedDict()

Review comment:
       My intuition is that we get more from specifying the key and value types than from knowing that it is specifically an `OrderedDict`. In all supported versions, `dict` is already ordered, and the only method that is specific to `OrderedDict` (`move_to_end`) is not used here. But I don't have very strong feelings about it.
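
To illustrate the point above, here is a small sketch, assuming Python 3.7 or later, where insertion order is a guaranteed property of `dict`. The key and value types (`str` and `List[int]`) are placeholders chosen for the example; the actual types of `cols_dict` in `frame.py` are not shown in this thread.

```python
from collections import OrderedDict
from typing import Dict, List

# A plain dict annotated with key and value types (placeholder types).
cols_dict: Dict[str, List[int]] = {}
cols_dict["b"] = [6, 7, 8]
cols_dict["a"] = [2, 3, 4]

# Iteration follows insertion order, not key order.
insertion_order = list(cols_dict)

# OrderedDict is only required when its extra methods are needed,
# for example move_to_end. The annotation is quoted so that it is not
# evaluated at runtime on older interpreters.
ordered: "OrderedDict[str, List[int]]" = OrderedDict(cols_dict)
ordered.move_to_end("b")
reordered = list(ordered)
```

The annotated plain `dict` tells a type checker what goes in and out of the mapping, which a bare `OrderedDict` annotation does not.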






[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-940506513


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48579/
   




[GitHub] [spark] ueshin commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
ueshin commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r726577755



##########
File path: python/pyspark/pandas/namespace.py
##########
@@ -1378,11 +1383,11 @@ def read_sql_table(
     reader.options(**options)
     sdf = reader.format("jdbc").load()
     index_spark_columns, index_names = _get_index_map(sdf, index_col)
-    psdf = DataFrame(
+    psdf: DataFrame = DataFrame(

Review comment:
       Ah, that's the reason!
   We should add type hints to the initializer later.






[GitHub] [spark] SparkQA commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939153551


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48515/
   




[GitHub] [spark] zero323 commented on a change in pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
zero323 commented on a change in pull request #34227:
URL: https://github.com/apache/spark/pull/34227#discussion_r725406921



##########
File path: python/pyspark/pandas/typedef/typehints.py
##########
@@ -45,7 +45,7 @@
     from pandas import Int8Dtype, Int16Dtype, Int32Dtype, Int64Dtype
 
     extension_dtypes_available = True
-    extension_dtypes = (Int8Dtype, Int16Dtype, Int32Dtype, Int64Dtype)  # type: Tuple
+    extension_dtypes: Tuple = (Int8Dtype, Int16Dtype, Int32Dtype, Int64Dtype)

Review comment:
       How about `Tuple[Type[Any], ...]` or `Tuple[type, ...]`?
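
A rough sketch of the variadic form suggested above. The classes `Int8Like` and `Int16Like` and the function `is_extension_dtype` are hypothetical stand-ins so the example runs without pandas; they are not part of `typehints.py`. In `typing`, a tuple annotation with a single parameter and no ellipsis describes a tuple of exactly one element, so a tuple holding several classes needs the trailing `...`.

```python
from typing import Any, Tuple, Type


class Int8Like:
    """Hypothetical stand-in for an extension dtype class."""


class Int16Like:
    """Hypothetical stand-in for an extension dtype class."""


# A homogeneous tuple of classes with arbitrary length.
extension_dtypes: Tuple[Type[Any], ...] = (Int8Like, Int16Like)


def is_extension_dtype(obj: Any) -> bool:
    # isinstance accepts a tuple of classes, which is how such a
    # tuple is typically consumed.
    return isinstance(obj, extension_dtypes)
```

Compared with a bare `Tuple`, this form lets a checker verify that every element is a class, while still allowing the tuple to grow when further dtypes are added.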






[GitHub] [spark] AmplabJenkins removed a comment on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-941251177


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48636/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939156224


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144038/
   




[GitHub] [spark] xinrong-databricks commented on pull request #34227: [SPARK-36961][PYTHON] Use PEP526 style variable type hints

Posted by GitBox <gi...@apache.org>.
xinrong-databricks commented on pull request #34227:
URL: https://github.com/apache/spark/pull/34227#issuecomment-939159371


   Does this PR aim to change pandas API on Spark only?

