Posted to issues@spark.apache.org by "Jason Yarbrough (Jira)" <ji...@apache.org> on 2021/03/23 23:10:00 UTC

[jira] [Updated] (SPARK-34844) JDBCRelation columnPartition function includes the first stride in the lower partition

     [ https://issues.apache.org/jira/browse/SPARK-34844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Yarbrough updated SPARK-34844:
------------------------------------
    Description: 
Currently, columnPartition in JDBCRelation adds the first stride to the lower (first) partition. As a result, lowerBound is not used as the upper limit of that partition; the limit is lowerBound + stride instead.

For example, say we have data 0-10, 10 partitions, and the lowerBound is set to 1. The lower/first partition should contain anything < 1. However, in the current implementation, it would include anything < 2.

A possible easy fix would be changing the following code on line 132:

currentValue += stride

To:

if (i != 0) currentValue += stride

Or move currentValue += stride inside the if statement on line 131, although that creates a pretty bad-looking side effect.
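
To make the difference concrete, here is a minimal sketch of the boundary loop with and without the proposed guard. It is a simplified model, not the actual JDBCRelation source: the function name partitionPredicates, the skipFirstStride flag, and the stride formula are assumptions made for illustration, and NULL handling and any adjustment of the partition count are left out. The example values also assume upperBound = 10, which the description above does not state.

```scala
// Simplified model of the partition-boundary loop; NOT the real Spark code.
// skipFirstStride = false mimics the current behaviour,
// skipFirstStride = true applies the proposed `if (i != 0)` guard.
def partitionPredicates(
    column: String,
    lowerBound: Long,
    upperBound: Long,
    numPartitions: Int,
    skipFirstStride: Boolean): Seq[String] = {
  // Assumed stride formula for this sketch.
  val stride = upperBound / numPartitions - lowerBound / numPartitions
  var currentValue = lowerBound
  (0 until numPartitions).map { i =>
    // Every partition except the first gets a lower limit.
    val lower = if (i != 0) Some(s"$column >= $currentValue") else None
    // Current behaviour always advances; the guard keeps partition 0 at lowerBound.
    if (!skipFirstStride || i != 0) currentValue += stride
    // Every partition except the last gets an upper limit.
    val upper = if (i != numPartitions - 1) Some(s"$column < $currentValue") else None
    Seq(lower, upper).flatten.mkString(" AND ")
  }
}
```

Calling partitionPredicates("id", 1L, 10L, 10, skipFirstStride = false) gives "id < 2" for the first partition, while passing skipFirstStride = true gives "id < 1" followed by "id >= 1 AND id < 2" for the second.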

  was:
Currently, columnPartition in JDBCRelation contains logic that adds the first stride into the lower partition. Because of this, the lower bound isn't used as the ceiling for the lower partition.

For example, say we have data 0-10, 10 partitions, and the lowerBound is set to 1. The lower/first partition should contain anything < 1. However, in the current implementation, it would include anything < 2.

A possible easy fix would be changing the following code on line 132:

currentValue += stride

To:

if (i != 0) currentValue += stride

Or include currentValue += stride within the if statement on line 131... although this creates a pretty nasty looking side-effect.


> JDBCRelation columnPartition function includes the first stride in the lower partition
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-34844
>                 URL: https://issues.apache.org/jira/browse/SPARK-34844
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Jason Yarbrough
>            Priority: Minor
>
> Currently, columnPartition in JDBCRelation adds the first stride to the lower (first) partition. As a result, lowerBound is not used as the upper limit of that partition; the limit is lowerBound + stride instead.
> For example, say we have data 0-10, 10 partitions, and the lowerBound is set to 1. The lower/first partition should contain anything < 1. However, in the current implementation, it would include anything < 2.
> A possible easy fix would be changing the following code on line 132:
> currentValue += stride
> To:
> if (i != 0) currentValue += stride
> Or move currentValue += stride inside the if statement on line 131, although that creates a pretty bad-looking side effect.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)