You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "zhengruifeng (via GitHub)" <gi...@apache.org> on 2023/09/20 13:30:14 UTC
[GitHub] [spark] zhengruifeng commented on a diff in pull request #43014: [SPARK-45235][CONNECT][PYTHON] Support map and array parameters by `sql()`

zhengruifeng commented on code in PR #43014:
URL: https://github.com/apache/spark/pull/43014#discussion_r1331636427


##########
python/pyspark/sql/connect/plan.py:
##########
@@ -1049,21 +1049,23 @@ def __init__(self, query: str, args: Optional[Union[Dict[str, Any], List]] = Non
         self._query = query
         self._args = args
 
+    def __to_expr(self, session: "SparkConnectClient", v: Any) -> proto.Expression:

Review Comment:
   I think we'd better rename it `_to_expr`.
   
   [the python standard](https://peps.python.org/pep-0008/#method-names-and-instance-variables) say:
   
   
   > Note: there is some controversy about the use of __names
   
   
   > In addition, the following special forms using leading or trailing underscores are recognized (these can generally be combined with any case convention):
   
   > `_single_leading_underscore`: weak “internal use” indicator. E.g. from M import * does not import objects whose names start with an underscore.
   > `single_trailing_underscore_`: used by convention to avoid conflicts with Python keyword, e.g.
   tkinter.Toplevel(master, class_='ClassName')
   > `__double_leading_underscore`: when naming a class attribute, invokes name mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).
   > `__double_leading_and_trailing_underscore__`: “magic” objects or attributes that live in user-controlled namespaces. E.g. __init__, __import__ or __file__. Never invent such names; only use them as documented.
   



##########
python/pyspark/sql/tests/connect/test_connect_basic.py:
##########
@@ -1237,13 +1237,23 @@ def test_sql(self):
         self.assertEqual(1, len(pdf.index))
 
     def test_sql_with_named_args(self):
-        df = self.connect.sql("SELECT * FROM range(10) WHERE id > :minId", args={"minId": 7})
-        df2 = self.spark.sql("SELECT * FROM range(10) WHERE id > :minId", args={"minId": 7})
+        from pyspark.sql.functions import create_map, lit
+        from pyspark.sql.connect.functions import lit as clit
+        from pyspark.sql.connect.functions import create_map as ccreate_map

Review Comment:
   https://github.com/apache/spark/blob/8c27de68756d4b0e5940211340a0b323d808aead/python/pyspark/sql/tests/connect/test_connect_basic.py#L78-L79
   
   I think you can use already imported `SF` and `CF`, to be consistent with other tests



##########
python/pyspark/sql/connect/plan.py:
##########
@@ -1049,21 +1049,23 @@ def __init__(self, query: str, args: Optional[Union[Dict[str, Any], List]] = Non
         self._query = query
         self._args = args
 
+    def __to_expr(self, session: "SparkConnectClient", v: Any) -> proto.Expression:

Review Comment:
   I think we'd better rename it `_to_expr`.
   
   [the python standard](https://peps.python.org/pep-0008/#method-names-and-instance-variables) say:
   
   
   > Note: there is some controversy about the use of __names
   
   
   > In addition, the following special forms using leading or trailing underscores are recognized (these can generally be combined with any case convention):
   
   > `_single_leading_underscore`: weak “internal use” indicator. E.g. from M import * does not import objects whose names start with an underscore.
   > `single_trailing_underscore_`: used by convention to avoid conflicts with Python keyword, e.g.
   tkinter.Toplevel(master, class_='ClassName')
   > `__double_leading_underscore`: when naming a class attribute, invokes name mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).
   > `__double_leading_and_trailing_underscore__`: “magic” objects or attributes that live in user-controlled namespaces. E.g. __init__, __import__ or __file__. Never invent such names; only use them as documented.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org