Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/04 19:29:37 UTC
[GitHub] [spark] AlJohri edited a comment on issue #21834:
[SPARK-22814][SQL] Support Date/Timestamp in a JDBC partition column
URL: https://github.com/apache/spark/pull/21834#issuecomment-489357987
@gatorsmile @maropu this currently does not work with PySpark because of this code:
https://github.com/apache/spark/blob/d9bcacf94b93fe76542b5c1fd852559075ef6faa/python/pyspark/sql/readwriter.py#L563-L564
It tries to convert `lowerBound` and `upperBound` to an `int`, which fails when they are `datetime` values.
The resulting traceback is:
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-40-2636f0dd1e0a> in <module>
     16     upperBound=now,
     17     numPartitions=sc.defaultParallelism,
---> 18     properties={'driver': 'org.postgresql.Driver'})
     19     .join(article_metadata, on=['url'], how='left')
     20     .orderBy('timestamp', ascending=False))

/usr/lib/spark/python/pyspark/sql/readwriter.py in jdbc(self, url, table, column, lowerBound, upperBound, numPartitions, predicates, properties)
    550             assert numPartitions is not None, \
    551                 "numPartitions can not be None when ``column`` is specified"
--> 552             return self._df(self._jreader.jdbc(url, table, column, int(lowerBound), int(upperBound),
    553                                                int(numPartitions), jprop))
    554         if predicates is not None:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'datetime.datetime'
```
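The failure can be reproduced without Spark or a database. The snippet below is only a minimal sketch: `coerce_bound` is a hypothetical stand-in for the `int(lowerBound)` / `int(upperBound)` coercion visible in the traceback, not the actual PySpark code.

```python
from datetime import datetime


def coerce_bound(value):
    # Hypothetical stand-in for the int(...) coercion shown in the traceback.
    return int(value)


# Integer bounds (or numeric strings) pass through the coercion.
int_bound = coerce_bound(100)
str_bound = coerce_bound("100")

# A datetime bound cannot be converted by int(), so a TypeError is raised.
try:
    coerce_bound(datetime(2019, 5, 4))
    error_message = None
except TypeError as exc:
    # The exact wording differs slightly between Python versions.
    error_message = str(exc)

print(error_message)
```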
I think just removing the `int()` calls around `lowerBound` and `upperBound` may fix the issue, but I'm not 100% sure.
EDIT: It looks like the PySpark [docs](https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.DataFrameReader.jdbc) specify that `column` must be an integer column.
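A possible workaround in the meantime, sketched under assumptions: pass the bounds as strings through the generic `format("jdbc")` reader options instead of the `jdbc()` helper, since those options are plain strings and are not run through `int()` on the Python side. This assumes a Spark build that includes the change from this PR. `jdbc_partition_options` is a hypothetical helper, and the URL and table name in the usage comment are placeholders.

```python
from datetime import datetime


def jdbc_partition_options(column, lower, upper, num_partitions):
    """Hypothetical helper: build string-valued JDBC partitioning options."""

    def as_text(bound):
        # Format datetimes as 'YYYY-MM-DD HH:MM:SS'; fall back to str() otherwise.
        if isinstance(bound, datetime):
            return bound.strftime("%Y-%m-%d %H:%M:%S")
        return str(bound)

    return {
        "partitionColumn": column,
        "lowerBound": as_text(lower),
        "upperBound": as_text(upper),
        "numPartitions": str(int(num_partitions)),
    }


opts = jdbc_partition_options(
    "timestamp", datetime(2019, 1, 1), datetime(2019, 5, 4, 19, 29, 37), 8
)

# Usage with an existing SparkSession named `spark` (placeholders, not run here):
# df = (spark.read.format("jdbc")
#       .option("url", "jdbc:postgresql://host/db")
#       .option("dbtable", "events")
#       .option("driver", "org.postgresql.Driver")
#       .options(**opts)
#       .load())
```

Keeping the bounds as strings leaves the type-specific parsing to the JVM side, which is where the partition column's type is known.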