You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by rtreffer <gi...@git.apache.org> on 2015/04/13 23:50:12 UTC

[GitHub] spark pull request: Export driver quirks

GitHub user rtreffer opened a pull request:

    https://github.com/apache/spark/pull/5498

    Export driver quirks

    Make it possible to (temporary) overwrite the driver quirks. This
    can be used to overcome problems with specific schemas or to
    add new jdbc driver support on the fly.
    
    A very simple implementation to dump the loading can be done like this (spark-shell)
    ```
    class DumpQuirk extends org.apache.spark.sql.jdbc.DriverQuirks {
      def canHandle(url : String): Boolean = true
      def getCatalystType(sqlType: Int, typeName: String, size: Int, md: org.apache.spark.sql.types.MetadataBuilder): org.apache.spark.sql.types.DataType = {
        println("" + (sqlType, typeName, size, md))
        null
      }
      def getJDBCType(dt: org.apache.spark.sql.types.DataType): (String, Option[Int]) = (null, None)
    }
    org.apache.spark.sql.jdbc.DriverQuirks.registerQuirks(new DumpQuirk())
    ```
    
    Not that this pull request is against 1.3 - I could not create a distribution with the current master.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rtreffer/spark export-driver-quirks

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5498.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5498
    
----
commit dca9372fc32e3222afc25e11394b1e620c115efa
Author: Rene Treffer <tr...@measite.de>
Date:   2015-04-13T21:38:59Z

    Export driver quirks
    
    Make it possible to (temporary) overwrite the driver quirks. This
    can be used to overcome problems with specific schemas or to
    add new jdbc driver support on the fly.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93317149
  
      [Test build #30332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30332/consoleFull) for   PR 5498 at commit [`9ca66d9`](https://github.com/apache/spark/commit/9ca66d9bc62db3519276cfe5c88d20ccaab69ada).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28470099
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,68 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
    --- End diff --
    
    Add scala doc to describe the contract for each of these methods.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93825214
  
    `Dialect` seems reasonable to me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Export driver quirks

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-92511404
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93572314
  
      [Test build #30375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30375/consoleFull) for   PR 5498 at commit [`7f23484`](https://github.com/apache/spark/commit/7f234842f2eb12dabfa5cd6b9145f678a26a0b0a).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by rtreffer <gi...@git.apache.org>.
Github user rtreffer commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93272238
  
    @liancheng thank you, will updaste the patch.
    Just one question: Should I sqash/amend the fixes or should I add a second commit?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Export driver quirks

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-92511690
  
    This needs a JIRA -- see https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28403475
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,70 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) {
    +    quirks = quirks.filterNot(_ == quirk)
    +  }
    +
    +  registerQuirks(new MySQLQuirks())
    +  registerQuirks(new PostgresQuirks())
    +
       /**
        * Fetch the DriverQuirks class corresponding to a given database url.
        */
       def get(url: String): DriverQuirks = {
    -    if (url.substring(0, 10).equals("jdbc:mysql")) {
    -      new MySQLQuirks()
    -    } else if (url.substring(0, 15).equals("jdbc:postgresql")) {
    -      new PostgresQuirks()
    -    } else {
    -      new NoQuirks()
    +    val matchingQuirks = quirks.filter(_.canHandle(url))
    +    matchingQuirks.length match {
    +      case 0 => new NoQuirks()
    +      case 1 => matchingQuirks.head
    +      case _ => new AggregatedQuirks(matchingQuirks)
         }
       }
     }
     
    -private[sql] class NoQuirks extends DriverQuirks {
    +class AggregatedQuirks(quirks: List[DriverQuirks]) extends DriverQuirks {
    +  def canHandle(url : String): Boolean =
    +    quirks.foldLeft(true)((l,r) => l && r.canHandle(url))
    --- End diff --
    
    Since `quirks` is always non-empty (we should add an assertion at the beginning of the constructor), we can do this:
    
    ```scala
    quirks.map(_.canHandle(url)).reduce(_ && _)
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28403486
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,70 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) {
    +    quirks = quirks.filterNot(_ == quirk)
    +  }
    +
    +  registerQuirks(new MySQLQuirks())
    +  registerQuirks(new PostgresQuirks())
    +
       /**
        * Fetch the DriverQuirks class corresponding to a given database url.
        */
       def get(url: String): DriverQuirks = {
    -    if (url.substring(0, 10).equals("jdbc:mysql")) {
    -      new MySQLQuirks()
    -    } else if (url.substring(0, 15).equals("jdbc:postgresql")) {
    -      new PostgresQuirks()
    -    } else {
    -      new NoQuirks()
    +    val matchingQuirks = quirks.filter(_.canHandle(url))
    +    matchingQuirks.length match {
    +      case 0 => new NoQuirks()
    +      case 1 => matchingQuirks.head
    +      case _ => new AggregatedQuirks(matchingQuirks)
         }
       }
     }
     
    -private[sql] class NoQuirks extends DriverQuirks {
    +class AggregatedQuirks(quirks: List[DriverQuirks]) extends DriverQuirks {
    +  def canHandle(url : String): Boolean =
    +    quirks.foldLeft(true)((l,r) => l && r.canHandle(url))
    --- End diff --
    
    Add a newline after this line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93266403
  
    ok to test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93593653
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30375/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93588708
  
    If we are going to make this a public API we should consider a clearer name.  Perhaps. `JDBCTypeMapping`?  You will also need to reopen the PR against master as we don't want to add new APIs in a maintenance branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28402888
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,70 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) {
    --- End diff --
    
    Same as above.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: Export driver quirks

Posted by rtreffer <gi...@git.apache.org>.
Github user rtreffer commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-92521352
  
    Added a ticket: https://issues.apache.org/jira/browse/SPARK-6888
    Will add that to the commit after some sleep .zZzZzZ


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93593641
  
      [Test build #30375 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30375/consoleFull) for   PR 5498 at commit [`7f23484`](https://github.com/apache/spark/commit/7f234842f2eb12dabfa5cd6b9145f678a26a0b0a).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93268869
  
      [Test build #30332 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30332/consoleFull) for   PR 5498 at commit [`9ca66d9`](https://github.com/apache/spark/commit/9ca66d9bc62db3519276cfe5c88d20ccaab69ada).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28470253
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,68 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) : Unit = {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) : Unit = {
    +    quirks = quirks.filterNot(_ == quirk)
    +  }
    +
    +  registerQuirks(new MySQLQuirks())
    +  registerQuirks(new PostgresQuirks())
    +
       /**
        * Fetch the DriverQuirks class corresponding to a given database url.
        */
       def get(url: String): DriverQuirks = {
    --- End diff --
    
    Does this need to be public?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93909418
  
      [Test build #30463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30463/consoleFull) for   PR 5498 at commit [`22d65ca`](https://github.com/apache/spark/commit/22d65cac9bb22a9cdda5019042acca0c66e46270).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by rtreffer <gi...@git.apache.org>.
Github user rtreffer closed the pull request at:

    https://github.com/apache/spark/pull/5498


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93825281
  
    We will also want to mark all of these `@DeveloperApi`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28403924
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,70 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) {
    +    quirks = quirks.filterNot(_ == quirk)
    +  }
    +
    +  registerQuirks(new MySQLQuirks())
    +  registerQuirks(new PostgresQuirks())
    +
       /**
        * Fetch the DriverQuirks class corresponding to a given database url.
        */
       def get(url: String): DriverQuirks = {
    -    if (url.substring(0, 10).equals("jdbc:mysql")) {
    -      new MySQLQuirks()
    -    } else if (url.substring(0, 15).equals("jdbc:postgresql")) {
    -      new PostgresQuirks()
    -    } else {
    -      new NoQuirks()
    +    val matchingQuirks = quirks.filter(_.canHandle(url))
    +    matchingQuirks.length match {
    +      case 0 => new NoQuirks()
    +      case 1 => matchingQuirks.head
    +      case _ => new AggregatedQuirks(matchingQuirks)
         }
       }
     }
     
    -private[sql] class NoQuirks extends DriverQuirks {
    +class AggregatedQuirks(quirks: List[DriverQuirks]) extends DriverQuirks {
    +  def canHandle(url : String): Boolean =
    +    quirks.foldLeft(true)((l,r) => l && r.canHandle(url))
    +  def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder) : DataType =
    +    quirks.foldLeft(null.asInstanceOf[DataType])((l,r) =>
    +      if (l != null) {
    +        l
    +      } else {
    +        r.getCatalystType(sqlType, typeName, size, md)
    +      }
    +    )
    +  def getJDBCType(dt: DataType): (String, Option[Int]) =
    +    quirks.foldLeft(null.asInstanceOf[(String, Option[Int])])((l,r) =>
    +      if (l != null) {
    +        l
    +      } else {
    +        r.getJDBCType(dt)
    +      }
    +    )
    --- End diff --
    
    Both `l` and `r` are always non-null pairs. Only `l._1` and `r._1` are possible to be null.
    
    ```scala
    quirks.map(_.getJDBCType(dt)).collectFirst {
      case p @ (typeName, _) if typeName != null => p
    }.getOrElse((null, None))
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28470212
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,68 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) : Unit = {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) : Unit = {
    +    quirks = quirks.filterNot(_ == quirk)
    +  }
    +
    +  registerQuirks(new MySQLQuirks())
    +  registerQuirks(new PostgresQuirks())
    +
       /**
        * Fetch the DriverQuirks class corresponding to a given database url.
        */
       def get(url: String): DriverQuirks = {
    -    if (url.substring(0, 10).equals("jdbc:mysql")) {
    -      new MySQLQuirks()
    -    } else if (url.substring(0, 15).equals("jdbc:postgresql")) {
    -      new PostgresQuirks()
    -    } else {
    -      new NoQuirks()
    +    val matchingQuirks = quirks.filter(_.canHandle(url))
    +    matchingQuirks.length match {
    +      case 0 => new NoQuirks()
    +      case 1 => matchingQuirks.head
    +      case _ => new AggregatedQuirks(matchingQuirks)
         }
       }
     }
     
    -private[sql] class NoQuirks extends DriverQuirks {
    +class AggregatedQuirks(quirks: List[DriverQuirks]) extends DriverQuirks {
    +
    +  require(!quirks.isEmpty)
    +
    +  def canHandle(url : String): Boolean =
    +    quirks.map(_.canHandle(url)).reduce(_ && _)
    +
    +  def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType =
    +    quirks.map(_.getCatalystType(sqlType, typeName, size, md)).collectFirst {
    +      case dataType if dataType != null => dataType
    +    }.orNull
    +
    +  def getJDBCType(dt: DataType): (String, Option[Int]) =
    +    quirks.map(_.getJDBCType(dt)).collectFirst {
    +      case t @ (typeName,sqlType) if typeName != null || sqlType.isDefined => t
    +    }.getOrElse((null, None))
    +
    +}
    +
    +class NoQuirks extends DriverQuirks {
    +  def canHandle(url : String): Boolean = true
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType =
         null
       def getJDBCType(dt: DataType): (String, Option[Int]) = (null, None)
     }
     
    -private[sql] class PostgresQuirks extends DriverQuirks {
    +class PostgresQuirks extends DriverQuirks {
    --- End diff --
    
    Should these be `case object`s?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93317170
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30332/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93909594
  
      [Test build #30463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30463/consoleFull) for   PR 5498 at commit [`22d65ca`](https://github.com/apache/spark/commit/22d65cac9bb22a9cdda5019042acca0c66e46270).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28403918
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,70 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) {
    +    quirks = quirks.filterNot(_ == quirk)
    +  }
    +
    +  registerQuirks(new MySQLQuirks())
    +  registerQuirks(new PostgresQuirks())
    +
       /**
        * Fetch the DriverQuirks class corresponding to a given database url.
        */
       def get(url: String): DriverQuirks = {
    -    if (url.substring(0, 10).equals("jdbc:mysql")) {
    -      new MySQLQuirks()
    -    } else if (url.substring(0, 15).equals("jdbc:postgresql")) {
    -      new PostgresQuirks()
    -    } else {
    -      new NoQuirks()
    +    val matchingQuirks = quirks.filter(_.canHandle(url))
    +    matchingQuirks.length match {
    +      case 0 => new NoQuirks()
    +      case 1 => matchingQuirks.head
    +      case _ => new AggregatedQuirks(matchingQuirks)
         }
       }
     }
     
    -private[sql] class NoQuirks extends DriverQuirks {
    +class AggregatedQuirks(quirks: List[DriverQuirks]) extends DriverQuirks {
    +  def canHandle(url : String): Boolean =
    +    quirks.foldLeft(true)((l,r) => l && r.canHandle(url))
    +  def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder) : DataType =
    +    quirks.foldLeft(null.asInstanceOf[DataType])((l,r) =>
    +      if (l != null) {
    +        l
    +      } else {
    +        r.getCatalystType(sqlType, typeName, size, md)
    +      }
    +    )
    --- End diff --
    
    How about this:
    
    ```scala
    quirks.map(_.getCatalystType(sqlType, typeName, size, md)).collectFirst {
      case dataType if dataType != null => dataType
    }.orNull
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93282009
  
    You may just add new commits to this PR. Also, would you please add tests for this feature?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28403496
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,70 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) {
    +    quirks = quirk :: quirks
    +  }
    +
    +  def unregisterQuirks(quirk : DriverQuirks) {
    +    quirks = quirks.filterNot(_ == quirk)
    +  }
    +
    +  registerQuirks(new MySQLQuirks())
    +  registerQuirks(new PostgresQuirks())
    +
       /**
        * Fetch the DriverQuirks class corresponding to a given database url.
        */
       def get(url: String): DriverQuirks = {
    -    if (url.substring(0, 10).equals("jdbc:mysql")) {
    -      new MySQLQuirks()
    -    } else if (url.substring(0, 15).equals("jdbc:postgresql")) {
    -      new PostgresQuirks()
    -    } else {
    -      new NoQuirks()
    +    val matchingQuirks = quirks.filter(_.canHandle(url))
    +    matchingQuirks.length match {
    +      case 0 => new NoQuirks()
    +      case 1 => matchingQuirks.head
    +      case _ => new AggregatedQuirks(matchingQuirks)
         }
       }
     }
     
    -private[sql] class NoQuirks extends DriverQuirks {
    +class AggregatedQuirks(quirks: List[DriverQuirks]) extends DriverQuirks {
    +  def canHandle(url : String): Boolean =
    +    quirks.foldLeft(true)((l,r) => l && r.canHandle(url))
    +  def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder) : DataType =
    --- End diff --
    
    Remove the space before the last `:`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93909599
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30463/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28470179
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,68 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) : Unit = {
    --- End diff --
    
    scala doc here as well.  We should probably cover how president works if multiple implementations claim they `canHandle` a given URL.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by rtreffer <gi...@git.apache.org>.
Github user rtreffer commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93963311
  
    Replaced by https://github.com/apache/spark/pull/5555


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5498#discussion_r28402880
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/DriverQuirks.scala ---
    @@ -39,33 +39,70 @@ import java.sql.Types
      * if `getJDBCType` returns `(null, None)`, the default type handling is used
      * for the given Catalyst type.
      */
    -private[sql] abstract class DriverQuirks {
    +abstract class DriverQuirks {
    +  def canHandle(url : String): Boolean
       def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): DataType
       def getJDBCType(dt: DataType): (String, Option[Int])
     }
     
    -private[sql] object DriverQuirks {
    +object DriverQuirks {
    +
    +  private var quirks = List[DriverQuirks]()
    +
    +  def registerQuirks(quirk: DriverQuirks) {
    --- End diff --
    
    Please add return type explicitly for all public methods.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by rtreffer <gi...@git.apache.org>.
Github user rtreffer commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93677337
  
    @marmbrus thank you, I'll fix those issues and open a new one when done.
    
    Regarding naming/api:
    
    It is quite common that there is one class per sql/jdbc dialect. Often called that way (e.g. MySQLDialect on hibernate). I've found quite some projects that use the same naming (via github search).
    Anyway, in this case I'd like to match those namings and add a default implementation per method (returning the neutral element).
    
    On the other hand it's currently just doing type mapping. So JDBCTypeMapping would be a very valid name, too. It would restrict the use case more (can be good or bad).
    
    I guess you know better what would suite spark :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-6888][SQL] Export driver quirks

Posted by rtreffer <gi...@git.apache.org>.
Github user rtreffer commented on the pull request:

    https://github.com/apache/spark/pull/5498#issuecomment-93571593
  
    I still have to write tests for AggregatedQuirks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org