Posted to commits@spark.apache.org by li...@apache.org on 2017/01/17 18:37:36 UTC
spark git commit: [SPARK-19239][PYSPARK] Check whether parameters equal None when the column is specified in the jdbc API
Repository: spark
Updated Branches:
refs/heads/master a23debd7b -> 843ec8ec4
[SPARK-19239][PYSPARK] Check whether parameters equal None when the column is specified in the jdbc API
## What changes were proposed in this pull request?
The `jdbc` API does not check `lowerBound` and `upperBound` when ``column`` is
specified, and simply throws the following exception:
>```int() argument must be a string or a number, not 'NoneType'```
If we check the parameters up front, we can give a friendlier error message.
## How was this patch tested?
Tested in the pyspark shell by calling `jdbc` with a column but without the lowerBound and upperBound parameters.
Author: DjvuLee <li...@bytedance.com>
Closes #16599 from djvulee/pysparkFix.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/843ec8ec
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/843ec8ec
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/843ec8ec
Branch: refs/heads/master
Commit: 843ec8ec42a16d6b52ad161b98bedb4f9952964b
Parents: a23debd
Author: DjvuLee <li...@bytedance.com>
Authored: Tue Jan 17 10:37:29 2017 -0800
Committer: gatorsmile <ga...@gmail.com>
Committed: Tue Jan 17 10:37:29 2017 -0800
----------------------------------------------------------------------
python/pyspark/sql/readwriter.py | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/843ec8ec/python/pyspark/sql/readwriter.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
index b0c51b1..d31f3fb 100644
--- a/python/pyspark/sql/readwriter.py
+++ b/python/pyspark/sql/readwriter.py
@@ -399,7 +399,8 @@ class DataFrameReader(OptionUtils):
accessible via JDBC URL ``url`` and connection ``properties``.
Partitions of the table will be retrieved in parallel if either ``column`` or
- ``predicates`` is specified.
+ ``predicates`` is specified. ``lowerBound``, ``upperBound`` and ``numPartitions``
+ are needed when ``column`` is specified.
If both ``column`` and ``predicates`` are specified, ``column`` will be used.
@@ -429,8 +430,10 @@ class DataFrameReader(OptionUtils):
for k in properties:
jprop.setProperty(k, properties[k])
if column is not None:
- if numPartitions is None:
- numPartitions = self._spark._sc.defaultParallelism
+ assert lowerBound is not None, "lowerBound can not be None when ``column`` is specified"
+ assert upperBound is not None, "upperBound can not be None when ``column`` is specified"
+ assert numPartitions is not None, \
+ "numPartitions can not be None when ``column`` is specified"
return self._df(self._jreader.jdbc(url, table, column, int(lowerBound), int(upperBound),
int(numPartitions), jprop))
if predicates is not None: