Posted to reviews@spark.apache.org by travishegner <gi...@git.apache.org> on 2015/11/05 16:10:20 UTC

[GitHub] spark pull request: Oracle dialect to handle nonspecific numeric t...

GitHub user travishegner opened a pull request:

    https://github.com/apache/spark/pull/9495

    Oracle dialect to handle nonspecific numeric types

    This is the alternative, agreed-upon solution to PR #8780.
    
    Creating an OracleDialect to handle the nonspecific numeric types that can be defined in Oracle.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/travishegner/spark OracleDialect

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9495.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9495
    
----
commit cc764d47d25eb35ae6c38c1caf1a87d65bdd10fc
Author: Travis Hegner <th...@trilliumit.com>
Date:   2015-11-05T14:53:43Z

    initial attempt

commit 3157ed5a71047ed85e8d80d4ef92c34273c38f1e
Author: Travis Hegner <th...@trilliumit.com>
Date:   2015-11-05T15:04:09Z

    adding canHandle override, and registration of OracleDialect

commit 839bcb582e138144b2d0757c9471de3a7cacc2ac
Author: Travis Hegner <th...@trilliumit.com>
Date:   2015-11-05T15:06:19Z

    more attribution

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9495#discussion_r44035938
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
    @@ -315,3 +316,27 @@ case object DerbyDialect extends JdbcDialect {
     
     }
     
    +/**
    + * :: DeveloperApi ::
    + * Default Oracle dialect, mapping a nonspecific
    + * numeric type to a general decimal type.
    + * Solution by @cloud-fan and @bdolbeare (github.com)
    + */
    +@DeveloperApi
    +case object OracleDialect extends JdbcDialect {
    +  override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")
    +  override def getCatalystType(
    +      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
    +    // Handle NUMBER fields that have no precision/scale in a special way
    +    // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
    +    if (sqlType == Types.NUMERIC && size == 0) {
    +      // This is sub-optimal as we have to pick a precision/scale in advance whereas the data
    +      //  in Oracle is allowed to have different precision/scale for each value.
    +      //  This conversion works in our domain for now though we need a more durable solution.
    +      //  Look into changing JDBCRDD (line 406):
    +      //    FROM:  mutableRow.update(i, Decimal(decimalVal, p, s))
    +      //    TO:  mutableRow.update(i, Decimal(decimalVal))
    +      Some(DecimalType(DecimalType.MAX_PRECISION, 10))
    +    } else None
    --- End diff --
    
    `} else {
      None
    }`
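
The decision made by the quoted `getCatalystType` override can be sketched without Spark on the classpath. This is only an illustration of the condition in the diff, written with the explicit `else` braces suggested above. `FallbackDecimal`, `OracleNumberSketch`, and `fallbackFor` are hypothetical names, and the value 38 for `DecimalType.MAX_PRECISION` is an assumption that should be checked against the Spark version in use.

```scala
import java.sql.Types

// Hypothetical stand-in for Spark's DecimalType(precision, scale).
final case class FallbackDecimal(precision: Int, scale: Int)

object OracleNumberSketch {
  // Assumed value of DecimalType.MAX_PRECISION; verify against your Spark version.
  val MaxPrecision = 38
  // Scale chosen in the quoted diff.
  val FallbackScale = 10

  // Mirrors the quoted condition: a NUMERIC column reported with size 0 gets the
  // fallback decimal type; anything else returns None so the default mapping applies.
  def fallbackFor(sqlType: Int, size: Int): Option[FallbackDecimal] = {
    if (sqlType == Types.NUMERIC && size == 0) {
      Some(FallbackDecimal(MaxPrecision, FallbackScale))
    } else {
      None
    }
  }
}
```

Returning `None` for every other case is what lets the generic JDBC type mapping keep handling columns that do declare a precision.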




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9495#discussion_r44035571
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
    @@ -315,3 +316,27 @@ case object DerbyDialect extends JdbcDialect {
     
     }
     
    +/**
    + * :: DeveloperApi ::
    + * Default Oracle dialect, mapping a nonspecific
    + * numeric type to a general decimal type.
    + * Solution by @cloud-fan and @bdolbeare (github.com)
    --- End diff --
    
    @travishegner To give more context, you can link to their comments.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154100299
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154102734
  
    **[Test build #45118 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45118/consoleFull)** for PR 9495 at commit [`839bcb5`](https://github.com/apache/spark/commit/839bcb582e138144b2d0757c9471de3a7cacc2ac).




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154106491
  
    Merged build started.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154154560
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by dsdinter <gi...@git.apache.org>.
Github user dsdinter commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-173553569
  
    It seems this is an issue in OJDBC that started to happen after Oracle 11g:
    http://stackoverflow.com/questions/2133679/why-would-number-columns-scale-and-or-precision-differ-in-jdbc-from-oracle-10-t





[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9495




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154103401
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45118/
    Test FAILed.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154171003
  
    @travishegner I backported your change to branch 1.4 and branch 1.5 (see https://github.com/apache/spark/pull/9498) with a minor change to the comments. Once you update your PR, I will merge it to master. Thanks!




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154106392
  
     Merged build triggered.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154154564
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45122/
    Test PASSed.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by travishegner <gi...@git.apache.org>.
Github user travishegner commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154423022
  
    Thanks @yhuai for taking care of this!




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154110975
  
    **[Test build #45122 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45122/consoleFull)** for PR 9495 at commit [`a1370a7`](https://github.com/apache/spark/commit/a1370a7a93e1bc4e2f908be980165223d0f67d58).




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154099275
  
    ok to test




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154103394
  
    **[Test build #45118 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45118/consoleFull)** for PR 9495 at commit [`839bcb5`](https://github.com/apache/spark/commit/839bcb582e138144b2d0757c9471de3a7cacc2ac).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154154389
  
    **[Test build #45122 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45122/consoleFull)** for PR 9495 at commit [`a1370a7`](https://github.com/apache/spark/commit/a1370a7a93e1bc4e2f908be980165223d0f67d58).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9495#discussion_r44035882
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
    @@ -315,3 +316,27 @@ case object DerbyDialect extends JdbcDialect {
     
     }
     
    +/**
    + * :: DeveloperApi ::
    + * Default Oracle dialect, mapping a nonspecific
    + * numeric type to a general decimal type.
    + * Solution by @cloud-fan and @bdolbeare (github.com)
    + */
    +@DeveloperApi
    +case object OracleDialect extends JdbcDialect {
    +  override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")
    +  override def getCatalystType(
    +      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
    +    // Handle NUMBER fields that have no precision/scale in a special way
    +    // because JDBC ResultSetMetaData converts this to 0 precision and -127 scale
    +    if (sqlType == Types.NUMERIC && size == 0) {
    +      // This is sub-optimal as we have to pick a precision/scale in advance whereas the data
    +      //  in Oracle is allowed to have different precision/scale for each value.
    +      //  This conversion works in our domain for now though we need a more durable solution.
    +      //  Look into changing JDBCRDD (line 406):
    --- End diff --
    
    I think putting the line number here is not very robust. Since we already described the problem, we can simply explain our workaround here.
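
The trade-off named in the quoted code comment, picking one scale in advance for values that may each carry a different scale, can be illustrated with plain `java.math.BigDecimal`. This is a minimal sketch, not Spark's conversion code: `FixedScaleSketch` and `toFallbackScale` are hypothetical names, and the half-up rounding mode is an assumption chosen for the example rather than a statement about how Spark rounds.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object FixedScaleSketch {
  // Scale chosen in the quoted diff.
  val FallbackScale = 10

  // Coerce a value to the fixed scale. Values with fewer fractional digits are
  // zero-padded; values with more are rounded (half-up here, as an assumption).
  def toFallbackScale(value: JBigDecimal): JBigDecimal =
    value.setScale(FallbackScale, RoundingMode.HALF_UP)
}
```

A value such as 1.5 is padded out to ten fractional digits, while a value with twelve fractional digits loses its last two, which is the loss of fidelity the comment calls sub-optimal.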




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154103397
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154100332
  
    Merged build started.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154182427
  
    I am merging it to master and branch 1.6. I will update the comments.




[GitHub] spark pull request: [SPARK-10648] Oracle dialect to handle nonspec...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9495#issuecomment-154088191
  
    Can one of the admins verify this patch?

