Posted to reviews@spark.apache.org by travishegner <gi...@git.apache.org> on 2015/09/16 22:30:48 UTC

[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

GitHub user travishegner opened a pull request:

    https://github.com/apache/spark/pull/8780

    [SPARK-10648] Proposed bug fix when oracle returns -127 as a scale to a numeric type

    In my environment, the precision and scale are undefined in the Oracle database, but Spark is detecting them as 0 and -127 respectively.
    
    If I understand those two values correctly, they should never logically be less than zero, so the proposed changes default the precision and scale instead of trying to use the erroneous values.
    
    If there is a valid use case for a negative precision or scale, then I can rework this to test for the exact 0 AND -127 case and handle it appropriately.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/travishegner/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8780.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8780
    
----
commit b0c3be317d0b681e183543de97093120a51a6222
Author: Travis Hegner <th...@trilliumit.com>
Date:   2015-09-16T20:12:23Z

    Proposed bug fix when oracle returns -127 as a scale to a numeric type

----




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-141462484
  
    That is exactly what I was afraid of. Would it make more sense for the patch to *only* check precision for a zero value? Does it ever make sense to have a precision of zero (or less than zero, for that matter)? Could we safely enforce defaults if precision is zero (or less), regardless of scale? That would still solve my problem, hopefully without compromising functionality for everyone else.




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-144541760
  
    I met this problem before, and actually it's not Spark that detects them as 0 and -127, but JDBC. My solution is just adding an `OracleDialect` to handle this special case:
    ```scala
    import java.sql.Types

    import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
    import org.apache.spark.sql.types.{DataType, LongType, MetadataBuilder}

    case object OracleDialect extends JdbcDialect {
      override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")
      override def getCatalystType(
          sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
        if (sqlType == Types.NUMERIC && typeName == "NUMBER" && size == 0) Some(LongType) else None
      }
    }
    
    ......
    
    JdbcDialects.registerDialect(OracleDialect)
    ```
    So if we want to support Oracle officially, we can add `OracleDialect` into Spark like we did for MySQL, Postgres, etc. Or @travishegner you can try `OracleDialect` to fix your problem without changing Spark code.
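
    For reference, a minimal sketch of how that workaround might be wired up before reading (the connection string, credentials, and table name below are placeholders):
    
    ```scala
    import java.util.Properties
    
    import org.apache.spark.sql.jdbc.JdbcDialects
    
    // Register the dialect once, before any JDBC reads against Oracle.
    JdbcDialects.registerDialect(OracleDialect)
    
    val props = new Properties()
    props.setProperty("user", "scott")     // placeholder credentials
    props.setProperty("password", "tiger")
    val df = sqlContext.read.jdbc("jdbc:oracle:thin:@//host:1521/service", "MY_TABLE", props)
    ```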




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8780#discussion_r41172742
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala ---
    @@ -140,7 +140,12 @@ object DecimalType extends AbstractDataType {
       }
     
       private[sql] def bounded(precision: Int, scale: Int): DecimalType = {
    -    DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
    --- End diff --
    
    I will take your word for the risk involved; I am very new to this project.
    
    From a layman's perspective, it seems that doing some basic checks when instantiating the type would make the type more robust. If I understand correctly, a `precision <= 0` is not allowed, so this patch returns a *default* decimal. Similarly, a `scale > precision` is not allowed, so it returns a decimal with the scale truncated to the precision. My thought is that this will catch unexpected inputs and still behave in an expected way. Users instantiating these decimals in the intended ways will still get the same type back.




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-140896861
  
    So, to make sure I understand: the goal of this patch is that if an invalid value is returned (e.g. a precision or scale <= 0), then the defaults are used, yes?




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-144502287
  
    cc @davies to take a quick look




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-142282068
  
    I'm making sure the new version builds, but here are the rules:
    
    ```scala
      private[sql] def bounded(precision: Int, scale: Int): DecimalType = {
        if (precision <= 0)
          DecimalType.SYSTEM_DEFAULT
        else if (scale > precision)
          DecimalType(min(precision, MAX_PRECISION), min(precision, MAX_SCALE))
        else
          DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
      }
    ```
    
    For both Decimal and Numeric types in `JDBCRDD.scala`, it will now call `DecimalType.bounded()` regardless of the values of `precision` and `scale`. This allows all of the validity checks to happen inside that function. The rules above are simply: if precision is invalid, return `SYSTEM_DEFAULT`; if `scale > precision`, force `scale = precision`; everything else retains the previous behavior.
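
    A few illustrative spot-checks of how those rules behave (based on the snippet above):
    
    ```scala
    DecimalType.bounded(0, -127)  // => DecimalType.SYSTEM_DEFAULT (invalid precision)
    DecimalType.bounded(5, 10)    // => DecimalType(5, 5): scale truncated to precision
    DecimalType.bounded(10, 2)    // => DecimalType(10, 2): previous behavior, unchanged
    ```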
    
    Once I verify that it builds and runs, I will update the pull request. Does this look accurate?




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8780#discussion_r40835035
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala ---
    @@ -140,7 +140,12 @@ object DecimalType extends AbstractDataType {
       }
     
       private[sql] def bounded(precision: Int, scale: Int): DecimalType = {
    -    DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
    --- End diff --
    
    It's risky to change this one, I'd only change the JDBCRDD




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-142004449
  
    Working on a new patch... Would it ever be possible to have a case where precision is 0 (essentially undefined), but scale is still intentionally set? Or is it that setting the precision is required in order to set a scale?




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8780#discussion_r40834945
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala ---
    @@ -66,9 +66,7 @@ private[sql] object JDBCRDD extends Logging {
           case java.sql.Types.CLOB          => StringType
           case java.sql.Types.DATALINK      => null
           case java.sql.Types.DATE          => DateType
    -      case java.sql.Types.DECIMAL
    -        if precision != 0 || scale != 0 => DecimalType.bounded(precision, scale)
    --- End diff --
    
    How about changing this to `scale >= 0`?




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-140913785
  
    I'm not super sure on that; one question would be whether this is reasonable behavior for all databases or only Oracle.




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-154087900
  
    Please see PR #9495 for the Oracle dialect solution proposed above.




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-141609708
  
    That should work.
    
    On Sep 18, 2015, at 7:15 AM, Travis Hegner <no...@github.com> wrote:
    
    That is exactly what I was afraid of. Would the patch make more sense to
    *only* check precision for a zero value? Does it ever make sense to have a
    precision of zero (or less than zero for that matter)? Could we safely
    enforce defaults if precision is zero (or less) regardless of scale? That
    would solve my problem still, hopefully without compromising functionality
    for everyone else.
    
    —
    Reply to this email directly or view it on GitHub
    <https://github.com/apache/spark/pull/8780#issuecomment-141462484>.





[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-142103750
  
    precision = 10 and scale = -20 should be fine.
    
    ```
    scala> Seq((121, 134)).toDF("a","b")
    res0: org.apache.spark.sql.DataFrame = [a: int, b: int]
    
    scala>  import org.apache.spark.sql.types._
    import org.apache.spark.sql.types._
    
    scala> res0.select($"a".cast(DecimalType(2, -1))).show()
    +------------------------+
    |cast(a as decimal(2,-1))|
    +------------------------+
    |                  1.2E+2|
    +------------------------+
    
    
    scala> 
    
    scala> res0.select($"a".cast(DecimalType(1, -2))).show()
    +------------------------+
    |cast(a as decimal(1,-2))|
    +------------------------+
    |                    1E+2|
    +------------------------+
    ```




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by bdolbeare <gi...@git.apache.org>.
Github user bdolbeare commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-145598968
  
    The problem with Oracle is that you can define numbers without providing precision or scale:
    
    	column_name NUMBER  (this is the only case that doesn't work very well for Oracle support)
    		has a precision of 0 and scale of -127 in JDBC ResultSetMetaData
    		"If a precision is not specified, the column stores values as given."  http://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#i16209		
    		
    	column_name NUMBER(10) 
    		has a precision of 10 and scale of 0 in JDBC ResultSetMetaData
    		
    	column_name NUMBER(10,2) 
    		has a precision of 10 and scale of 2 in JDBC ResultSetMetaData	
    
    I think the best solution is to handle this in an OracleDialect, since this is a quirk of Oracle.  I've done that for my own code but it would be nice to have two changes in Spark:
    
    1)  Access to more of the metadata fields (e.g. scale) in the dialect.getCatalystType call (currently the precision is provided but the scale is not)
    2)  Change line 406 in JDBCRDD to support creating a Decimal without a predefined precision/scale.  It seems that this would work in cases where there is a consistent
    precision/scale for a field and also this Oracle nuance where the precision/scale differ per row.
    
    
    For now, this is what I've done in my own OracleDialect:
    
        import java.sql.Types

        import org.apache.spark.sql.jdbc.JdbcDialect
        import org.apache.spark.sql.types.{DataType, DecimalType, MetadataBuilder, StringType}

        object OracleDialect extends JdbcDialect {
          override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")

          override def getCatalystType(
              sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
            // Handle NUMBER fields that have no precision/scale in a special way, because JDBC
            // ResultSetMetaData converts this to 0 precision and -127 scale.
            if (sqlType == Types.NUMERIC && size == 0) {
              // This is sub-optimal as we have to pick a precision/scale in advance, whereas the
              // data in Oracle is allowed to have a different precision/scale for each value.
              // This conversion works in our domain for now, though we need a more durable
              // solution. Look into changing JDBCRDD (line 406):
              //   FROM: mutableRow.update(i, Decimal(decimalVal, p, s))
              //   TO:   mutableRow.update(i, Decimal(decimalVal))
              Some(DecimalType(DecimalType.MAX_PRECISION, 10))
            } else if (sqlType == -101) {
              // Handle TIMESTAMP WITH TIME ZONE (for now we just convert this to a string with
              // the default format).
              Some(StringType)
            } else None
          }
        }
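
    For anyone wanting to reproduce the metadata behavior described above, a minimal JDBC sketch (the connection string, credentials, table, and column names are placeholders):
    
    ```scala
    import java.sql.DriverManager
    
    // Inspect what the Oracle driver reports for an unqualified NUMBER column.
    val conn = DriverManager.getConnection(
      "jdbc:oracle:thin:@//host:1521/service", "user", "pass")
    val rs = conn.createStatement().executeQuery("SELECT num_col FROM some_table")
    val md = rs.getMetaData
    println(md.getPrecision(1)) // 0 for a bare NUMBER, per the above
    println(md.getScale(1))     // -127
    ```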





[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-145609065
  
    @cloud-fan @bdolbeare @davies I'm certainly open to doing this in an Oracle-specific way if that is what is required. I was simply hoping to solve my problem while simultaneously making the whole project more robust. I completely understand if you don't believe it's the right direction. Thanks for looking into it with me!




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-142045620
  
    But a negative scale is inherently less than a defined precision... or do you mean precision should never be less than the absolute value of scale? Is that something that should be tested for, and overridden with defaults if true?
    
    This would also fix my problem without DB-specific hacks.




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-144054401
  
    So any thoughts on merging this?




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-141351034
  
    (I actually don't know if Spark implements this correctly -- we should test it)




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-140881328
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-153918977
  
    @travishegner Will you have time to continue your work? I think our resolution is to create an Oracle dialect and automatically register it (see https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala).




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-154003137
  
    **[Test build #1986 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1986/consoleFull)** for PR 8780 at commit [`d11141c`](https://github.com/apache/spark/commit/d11141ca6558c9022131d849190843350088e3a1).




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-142095013
  
    Oh actually - let me correct it.
    
    If scale is positive, then precision needs to be >= scale.
    
    If scale is negative, then precision can be anything (> 0).
    
    I'm not sure if precision == 0 makes any sense, since that effectively means a null value.
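
    In other words, a hypothetical standalone check (not actual Spark code):
    
    ```scala
    // Hypothetical helper encoding the rules above.
    def isValidDecimal(precision: Int, scale: Int): Boolean =
      precision > 0 && (scale < 0 || precision >= scale)
    
    isValidDecimal(10, 2)   // true
    isValidDecimal(2, 10)   // false: positive scale exceeds precision
    isValidDecimal(10, -20) // true: negative scale only shifts the value
    isValidDecimal(0, -127) // false: precision must be > 0
    ```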
    





[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-142037639
  
    They would all be null then. It doesn't make sense to have precision < scale.
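
    For example, an illustrative (untested) spark-shell check:
    
    ```scala
    // 121 cannot be represented as decimal(1,2): there is no room for integer digits,
    // so the cast overflows and the result should come back as null.
    scala> Seq((121, 134)).toDF("a", "b").select($"a".cast(DecimalType(1, 2))).show()
    ```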





[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-145693030
  
    @travishegner looks like it is best to just do it in the Oracle dialect.





[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-140936357
  
    I'm not sure if oracle can be associated with anything *reasonable*, but sometimes you have to play the hand you are dealt. :)
    
    I can only answer your question with a question... Would there ever be a use case in the Decimal() class where the precision and/or the scale would be set to a negative value?
    
    If not, then I'd have to imagine that this patch makes the intention of the code more accurate across the board. If yes, then I may have to explore an Oracle-only type of patch, which may or may not ever be committed back, depending on its usefulness to the community.
    
    I'd have to assume that there isn't a use case for negative values, given the way precision and scale are used and defined, but you'll have to forgive any ignorance on my part as I'm still fairly new to Scala. I hadn't even browsed the source for Spark until about one week ago. I'm still in the alpha stages of even testing Spark in general, so while this has seemingly solved the problem in my testing, I could easily be overlooking something.




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-141350914
  
    Actually, scale can be negative. It just means the number of 0s to the left of the decimal point.
    
    For example, for the number 123 with precision = 2 and scale = -1, 123 would become 120.
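
    The same behavior can be seen with plain `java.math.BigDecimal` (an illustrative snippet):
    
    ```scala
    import java.math.{BigDecimal => JBigDecimal, RoundingMode}
    
    // scale = -1 rounds to the tens place, leaving one zero to the left of the decimal point
    val rounded = new JBigDecimal("123").setScale(-1, RoundingMode.HALF_UP)
    println(rounded)               // 1.2E+2
    println(rounded.toPlainString) // 120
    ```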





[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8780#discussion_r41171411
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala ---
    @@ -66,9 +66,7 @@ private[sql] object JDBCRDD extends Logging {
           case java.sql.Types.CLOB          => StringType
           case java.sql.Types.DATALINK      => null
           case java.sql.Types.DATE          => DateType
    -      case java.sql.Types.DECIMAL
    -        if precision != 0 || scale != 0 => DecimalType.bounded(precision, scale)
    --- End diff --
    
    That wouldn't work, as demonstrated earlier in the thread. A negative scale is legal; it reduces precision to the 10s place, 100s place, etc...




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-142100699
  
    OK, after looking at this a little further, it seems that `DecimalType.bounded()` should be called in `JDBCRDD.scala` regardless of the precision and scale values, letting the `bounded()` method validate them. If it finds invalid values, it can return defaults or throw an exception, depending on which condition is met. I'm going to move in that direction.
    
    I'm still a little confused by the precision and scale rules that you defined. Wouldn't it be invalid to have a precision = 10 but a scale = -20? Wouldn't that always result in a null value, or am I still interpreting precision and scale incorrectly?




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-140913028
  
    Yes, that is the intention. Is this the proper way to address that issue?
    
    On Wed, Sep 16, 2015, 5:14 PM Holden Karau <no...@github.com> wrote:
    
    > So I understand, the goal of this patch is that if an invalid value is
    > returned (e.g. a precision or scale <= 0), then the defaults are used yes?
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/8780#issuecomment-140896861>.
    >





[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8780#issuecomment-154004331
  
    **[Test build #1986 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1986/consoleFull)** for PR 8780 at commit [`d11141c`](https://github.com/apache/spark/commit/d11141ca6558c9022131d849190843350088e3a1).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-10648] Proposed bug fix when oracle ret...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner closed the pull request at:

    https://github.com/apache/spark/pull/8780


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org