Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/09 11:43:40 UTC
[GitHub] [spark] zero323 commented on a change in pull request #35410: [WIP][SPARK-38121][PYTHON][SQL] Use SparkSession instead of SQLContext inside PySpark
zero323 commented on a change in pull request #35410:
URL: https://github.com/apache/spark/pull/35410#discussion_r802573722
##########
File path: python/pyspark/sql/dataframe.py
##########
@@ -104,10 +105,25 @@ class DataFrame(PandasMapOpsMixin, PandasConversionMixin):
.. versionadded:: 1.3.0
"""
- def __init__(self, jdf: JavaObject, sql_ctx: "SQLContext"):
- self._jdf = jdf
- self.sql_ctx = sql_ctx
- self._sc: SparkContext = cast(SparkContext, sql_ctx and sql_ctx._sc)
+ def __init__(
+ self,
+ jdf: JavaObject,
+ sql_ctx: Optional["SQLContext"] = None,
+ session: Optional["SparkSession"] = None,
+ ):
+ assert sql_ctx is not None or session is not None
+ self._session = session
+ self._sql_ctx = sql_ctx
Review comment:
Just thinking out loud. Why not stick to a single argument (`sql_ctx` or `session` ‒ it might require some research, but I'm fairly sure it is usually passed positionally) and
```python
self.session = sql_ctx if isinstance(sql_ctx, SparkSession) else sql_ctx.sparkSession
```
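To make the single-argument idea concrete, here is a minimal, self-contained sketch of the dispatch the snippet above describes. `SparkSession` and `SQLContext` are stubbed with placeholder classes so the normalization logic can run without a Spark installation; the real classes live in `pyspark.sql`, and the actual `DataFrame.__init__` signature is what the PR itself decides.

```python
# Stub stand-ins for the real pyspark.sql classes (illustration only).
class SparkSession:
    pass


class SQLContext:
    def __init__(self, session: SparkSession):
        # The real SQLContext exposes its owning session as `sparkSession`.
        self.sparkSession = session


class DataFrame:
    def __init__(self, jdf, sql_ctx):
        self._jdf = jdf
        # Accept either object in the one positional slot and normalize
        # to a SparkSession internally.
        self.session = (
            sql_ctx if isinstance(sql_ctx, SparkSession) else sql_ctx.sparkSession
        )


session = SparkSession()
print(DataFrame(None, session).session is session)               # True
print(DataFrame(None, SQLContext(session)).session is session)   # True
```

Either way the caller's positional argument keeps working, and internally the class only ever deals with a `SparkSession`.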
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org