Posted to reviews@spark.apache.org by naveenminchu <gi...@git.apache.org> on 2015/12/20 16:40:59 UTC

[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the name in quo...

GitHub user naveenminchu opened a pull request:

    https://github.com/apache/spark/pull/10403

    [SPARK-12437][SQL] Encapsulate the name in quotes

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/naveenminchu/spark fix-SPARK-Issue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10403.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10403
    
----
commit 6328632ed793ba52c56fad7ee3665daa67f2925a
Author: naveenminchu <na...@gmail.com>
Date:   2015-12-20T15:38:47Z

    [SPARK-12437][SQL] Encapsulate the name in quotes

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r48224194
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
    @@ -219,13 +219,21 @@ object JdbcUtils extends Logging {
         val sb = new StringBuilder()
         val dialect = JdbcDialects.get(url)
         df.schema.fields foreach { field => {
    -      val name = field.name
    +      val name = dialect.quoteIdentifier(field.name)
           val typ: String = getJdbcType(field.dataType, dialect).databaseTypeDefinition
           val nullable = if (field.nullable) "" else "NOT NULL"
           sb.append(s", $name $typ $nullable")
         }}
         if (sb.length < 2) "" else sb.substring(2)
       }
    +  
    +  /**
    +   * Compute the table name string for this RDD.
    +   */
    +  def tableString(table: String, url: String): String = {
    --- End diff --
    
    @JoshRosen Are you suggesting I build logic similar to spark-redshift's, which handles all of the corner cases you highlighted in databricks/spark-redshift#102 via its TableName wrapper?
    
    Please let me know your thoughts.





[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r48222538
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
    @@ -219,13 +219,21 @@ object JdbcUtils extends Logging {
         val sb = new StringBuilder()
         val dialect = JdbcDialects.get(url)
         df.schema.fields foreach { field => {
    -      val name = field.name
    +      val name = dialect.quoteIdentifier(field.name)
           val typ: String = getJdbcType(field.dataType, dialect).databaseTypeDefinition
           val nullable = if (field.nullable) "" else "NOT NULL"
           sb.append(s", $name $typ $nullable")
         }}
         if (sb.length < 2) "" else sb.substring(2)
       }
    +  
    +  /**
    +   * Compute the table name string for this RDD.
    +   */
    +  def tableString(table: String, url: String): String = {
    --- End diff --
    
    Since this is only used once, can you just remove this function and inline it?




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166145343
  
    @naveenminchu is there any sensible way to test this? If so, could you add a test?




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r50641229
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -40,6 +40,34 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  /**
    +   * Process table name in case of containing special characters like dot seperating database name
    +   * followed by table name (eg "some database"."some-table-name") or 
    +   * in case it contains characters that require quotes (e.g. space).
    +   */
    +  override def schemaQualifiedTableName(tableName: String): String = {
    +    //Removing quotes so that we can add them correctly.
    +    val tableNameWithoutQuotes = tableName.replace("\"", "").replace("\'", "")
    +    
    +    //If block for addressing the case of . (eg "some database"."some-table-name")
    +    if (tableNameWithoutQuotes.contains(".")) {
    +      val tableNameList = tableNameWithoutQuotes.split('.')
    +      tableNameList.foldLeft("") { (leftStr, rightStr) =>
    --- End diff --
    
    Ping. Any updates here?




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49275051
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -41,6 +41,26 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  override def parseTableName(tableName: String): String = {
    +    val tableName1 = tableName.replace("\"", "").replace("\'", "")
    +    if (tableName1.contains(".")) {
    +      val tableNameList = tableName1.split('.')
    +      tableNameList.foldLeft("") { (leftStr, rightStr) =>
    +        if (!"".equals(rightStr.trim())) {
    +          if ("".equals(leftStr.trim())) {
    +            leftStr + s"`$rightStr`"
    +          } else {
    +            leftStr + "." + s"`$rightStr`"
    +          }
    +        } else {
    +          leftStr
    +        }
    +      }
    +    } else {
    +      s"`$tableName1`"
    --- End diff --
    
    Yes, it is meant to return quoted identifiers.




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49266410
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -41,6 +41,26 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  override def parseTableName(tableName: String): String = {
    +    val tableName1 = tableName.replace("\"", "").replace("\'", "")
    --- End diff --
    
    Also, `tableName1` is a bad name; if you were going to go with this approach, I'd call it something like `tableNameWithoutQuotes` (but I'm not sure that this unconditional quote-stripping is correct).




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49266392
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -41,6 +41,26 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  override def parseTableName(tableName: String): String = {
    +    val tableName1 = tableName.replace("\"", "").replace("\'", "")
    +    if (tableName1.contains(".")) {
    +      val tableNameList = tableName1.split('.')
    +      tableNameList.foldLeft("") { (leftStr, rightStr) =>
    +        if (!"".equals(rightStr.trim())) {
    +          if ("".equals(leftStr.trim())) {
    +            leftStr + s"`$rightStr`"
    +          } else {
    +            leftStr + "." + s"`$rightStr`"
    +          }
    +        } else {
    +          leftStr
    +        }
    +      }
    +    } else {
    +      s"`$tableName1`"
    --- End diff --
    
    Is this supposed to return quoted identifiers? Or should that be the responsibility of another function? Just want to make sure that we're not double-quoting.




[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166203008
  
    @rxin Agree 100%




[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166705003
  
    @rxin & @JoshRosen Please take a quick look at the changes.




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166444070
  
    Can you add a test for this in the MySQL docker integration tests suite (see the `docker-integration-tests` subproject)?




[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166529691
  
    @rxin Yes, as you suggested, I have used JdbcDialect.
    
    I will also look into the test part.




[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table ...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r51382164
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -40,6 +40,34 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  /**
    +   * Process table name in case of containing special characters like dot seperating database name
    +   * followed by table name (eg "some database"."some-table-name") or 
    +   * in case it contains characters that require quotes (e.g. space).
    +   */
    +  override def schemaQualifiedTableName(tableName: String): String = {
    +    //Removing quotes so that we can add them correctly.
    +    val tableNameWithoutQuotes = tableName.replace("\"", "").replace("\'", "")
    +    
    +    //If block for addressing the case of . (eg "some database"."some-table-name")
    +    if (tableNameWithoutQuotes.contains(".")) {
    +      val tableNameList = tableNameWithoutQuotes.split('.')
    +      tableNameList.foldLeft("") { (leftStr, rightStr) =>
    --- End diff --
    
    Will address these sometime this week.




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49266387
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -41,6 +41,26 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  override def parseTableName(tableName: String): String = {
    +    val tableName1 = tableName.replace("\"", "").replace("\'", "")
    --- End diff --
    
    Why is it correct to just drop quotes here? What if the schema name also contains a dot? Don't you need to keep track of quotes during the string processing in order to figure out where the schema name ends and the table name begins?
    
    Note that both the schema and table names could contain dots (AFAIK, if not please comment with a citation explaining why not), so simply grabbing the last component after a dot might not be correct.
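    For illustration only, a quote-aware split (tracking quote state instead of stripping quotes up front) might look like this minimal sketch. It is plain Scala with no Spark dependency; the object and method names are assumptions for this example, not part of Spark's API:
    
    ```scala
    // Hypothetical sketch: split a schema-qualified name on dots that fall
    // OUTSIDE quoted regions, so a dot inside "some.table" is not a boundary.
    object QualifiedNameSplitter {
      def split(name: String): List[String] = {
        val parts = scala.collection.mutable.ListBuffer.empty[String]
        val current = new StringBuilder
        var quote: Option[Char] = None  // the quote char we are currently inside, if any
        for (c <- name) c match {
          case '"' | '\'' | '`' if quote.isEmpty => quote = Some(c) // entering a quoted region
          case q if quote.contains(q)            => quote = None   // leaving the quoted region
          case '.' if quote.isEmpty              =>                // unquoted dot: component boundary
            parts += current.toString; current.clear()
          case other                             => current += other
        }
        parts += current.toString
        parts.toList
      }
    }
    ```
    
    With this approach `"some database"."some.table"` splits into two components, while naive quote-stripping followed by `split('.')` would produce three.
    
    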




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by pjfanning <gi...@git.apache.org>.
Github user pjfanning commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r48481306
  
    --- Diff: docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala ---
    @@ -150,4 +150,91 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
         df2.write.jdbc(jdbcUrl, "datescopy", new Properties)
         df3.write.jdbc(jdbcUrl, "stringscopy", new Properties)
       }
    +  
    +  test("Write test with SaveMode set to overwrite") {
    --- End diff --
    
    Would it be possible to add a table or tables to the dataPreparation function that have column names that are key words? For example:
    create table `escaped names` (`key` BIGINT, `long description` TEXT);




[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166525444
  
    BTW, if I understand this correctly, it only quotes table names, not column names? Are column names already quoted?





[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49276239
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
    @@ -219,13 +219,21 @@ object JdbcUtils extends Logging {
         val sb = new StringBuilder()
         val dialect = JdbcDialects.get(url)
         df.schema.fields foreach { field => {
    -      val name = field.name
    +      val name = dialect.quoteIdentifier(field.name)
           val typ: String = getJdbcType(field.dataType, dialect).databaseTypeDefinition
           val nullable = if (field.nullable) "" else "NOT NULL"
           sb.append(s", $name $typ $nullable")
         }}
         if (sb.length < 2) "" else sb.substring(2)
       }
    +  
    +  /**
    +   * Parse the table name string for this RDD.
    --- End diff --
    
    Adjusted as suggested.




[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10403




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by pjfanning <gi...@git.apache.org>.
Github user pjfanning commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-170256145
  
    Hi Naveen - your new tests look good to me. I do notice though that you probably need to resync with master due to conflicts (there is a 'This branch has conflicts that must be resolved' warning).
    https://help.github.com/articles/syncing-a-fork/




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49266355
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -41,6 +41,26 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  override def parseTableName(tableName: String): String = {
    +    val tableName1 = tableName.replace("\"", "").replace("\'", "")
    +    if (tableName1.contains(".")) {
    +      val tableNameList = tableName1.split('.')
    +      tableNameList.foldLeft("") { (leftStr, rightStr) =>
    --- End diff --
    
    Woah, this is kinda dense and confusing. Can you please rewrite this in a simpler fashion and add comments? This is just my drive-by first impression.




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166130675
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49642754
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -40,6 +40,34 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  /**
    +   * Process table name in case of containing special characters like dot seperating database name
    +   * followed by table name (eg "some database"."some-table-name") or 
    +   * in case it contains characters that require quotes (e.g. space).
    +   */
    +  override def schemaQualifiedTableName(tableName: String): String = {
    +    //Removing quotes so that we can add them correctly.
    +    val tableNameWithoutQuotes = tableName.replace("\"", "").replace("\'", "")
    +    
    +    //If block for addressing the case of . (eg "some database"."some-table-name")
    +    if (tableNameWithoutQuotes.contains(".")) {
    +      val tableNameList = tableNameWithoutQuotes.split('.')
    +      tableNameList.foldLeft("") { (leftStr, rightStr) =>
    --- End diff --
    
    I'm still not a fan of this `foldLeft`. Can you please write this in a more clear way?
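    For illustration only, one way to express the same logic more directly — a sketch assuming the semantics of the quoted snippet (drop empty segments, backtick-quote the rest, rejoin with dots), not code from the PR:
    
    ```scala
    // Clearer equivalent of the foldLeft in the quoted diff (illustrative sketch):
    // filter out empty segments produced by stray dots, quote each remaining
    // segment with backticks, then rejoin the segments with dots.
    def quoteSegments(tableNameWithoutQuotes: String): String =
      if (tableNameWithoutQuotes.contains("."))
        tableNameWithoutQuotes
          .split('.')
          .filter(_.trim.nonEmpty)   // skip empty segments (e.g. from "a..b")
          .map(s => s"`$s`")         // backtick-quote each segment
          .mkString(".")
      else
        s"`$tableNameWithoutQuotes`"
    ```
    
    The `filter`/`map`/`mkString` pipeline makes each step explicit, whereas the fold interleaves the empty-segment check with the separator logic.
    
    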




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49266453
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
    @@ -219,13 +219,21 @@ object JdbcUtils extends Logging {
         val sb = new StringBuilder()
         val dialect = JdbcDialects.get(url)
         df.schema.fields foreach { field => {
    -      val name = field.name
    +      val name = dialect.quoteIdentifier(field.name)
           val typ: String = getJdbcType(field.dataType, dialect).databaseTypeDefinition
           val nullable = if (field.nullable) "" else "NOT NULL"
           sb.append(s", $name $typ $nullable")
         }}
         if (sb.length < 2) "" else sb.substring(2)
       }
    +  
    +  /**
    +   * Parse the table name string for this RDD.
    --- End diff --
    
    This is confusing, since the variable `table` already sounds like it would be the table name.
    
    Can you call it `schemaQualifiedTableName` or something more explicit?




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166507361
  
    And +1 on adding a test.





[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166444605
  
    Also, quick question: what if the name is qualified with a schema and also contains special characters? e.g. what if my user-supplied table name is something like `default."table with spaces in name"`? Will this handle that case properly? What about `"some database"."some-table-name"`? My concern is that a naive, unconditional quoting will mishandle these cases.
    
    `spark-redshift` has some fancy logic to handle all of these corner-cases; see https://github.com/databricks/spark-redshift/pull/102 and pay attention to how the `TableName` wrapper works.
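    A quote-aware split along the lines of `spark-redshift`'s `TableName` wrapper could look like the sketch below (hypothetical code, not Spark's or spark-redshift's actual implementation): dots are treated as schema/table separators only when they occur outside double quotes, so `"table with spaces in name"` survives intact.

```scala
// Hypothetical sketch (not Spark's or spark-redshift's actual code) of splitting a
// schema-qualified name on unquoted dots, so that a dot inside double quotes is
// treated as part of the identifier rather than as a schema/table separator.
object TableNameParser {
  def parse(name: String): Seq[String] = {
    val parts = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inQuotes = false
    name.foreach {
      case '"' => inQuotes = !inQuotes  // toggle quoted state; drop the quote itself
      case '.' if !inQuotes =>          // an unquoted dot ends the current segment
        parts += current.toString
        current.clear()
      case other => current += other
    }
    parts += current.toString
    parts.toSeq
  }
}
```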




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166148352
  
    @naveenminchu I tried triggering a build, but it seems that I do not have sufficient rights for this (I do have them for my own PRs).




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r48222531
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
    @@ -311,7 +312,7 @@ final class DataFrameWriter private[sql](df: DataFrame) {
        * This is equivalent to:
        * {{{
        *   format("json").save(path)
    -   * }}}
    +   * }}}$table
    --- End diff --
    
    what's this change for?





[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49266372
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala ---
    @@ -41,6 +41,26 @@ private case object MySQLDialect extends JdbcDialect {
       override def quoteIdentifier(colName: String): String = {
         s"`$colName`"
       }
    +  
    +  override def parseTableName(tableName: String): String = {
    +    val tableName1 = tableName.replace("\"", "").replace("\'", "")
    +    if (tableName1.contains(".")) {
    +      val tableNameList = tableName1.split('.')
    +      tableNameList.foldLeft("") { (leftStr, rightStr) =>
    --- End diff --
    
    Also, can you add unit tests for this function? _Not_ end-to-end tests, but a test which exercises this method in isolation, similar to the ones that I have in `spark-redshift`?
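    An isolated test of that shape could look roughly like this (plain assertions for brevity where the real suite would use ScalaTest; `parseTableName` below is a stand-in reimplementation of the diff's intent, not Spark code):

```scala
// Stand-in for the MySQLDialect.parseTableName in the diff above: strips quote
// characters, then backtick-quotes each dot-separated segment. Illustrative only.
object ParseTableNameSketch {
  def parseTableName(tableName: String): String = {
    val cleaned = tableName.replace("\"", "").replace("'", "")
    cleaned.split('.').map(s => s"`$s`").mkString(".")
  }
}

// Isolated checks exercising the method directly; no database or Docker
// container is required, unlike the end-to-end integration suites.
assert(ParseTableNameSketch.parseTableName("tbl") == "`tbl`")
assert(ParseTableNameSketch.parseTableName("db.tbl") == "`db`.`tbl`")
assert(ParseTableNameSketch.parseTableName("\"my db\".tbl") == "`my db`.`tbl`")
```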




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by naveenminchu <gi...@git.apache.org>.
Github user naveenminchu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49264644
  
    --- Diff: docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala ---
    @@ -150,4 +150,91 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
         df2.write.jdbc(jdbcUrl, "datescopy", new Properties)
         df3.write.jdbc(jdbcUrl, "stringscopy", new Properties)
       }
    +  
    +  test("Write test with SaveMode set to overwrite") {
    --- End diff --
    
    @pjfanning Added test as suggested.
    For example:
    create table `escaped names` (`key` BIGINT, `long description` TEXT);




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166144664
  
    ok to test




[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10403#issuecomment-166151351
  
    Different databases have different quoting syntax. You need to use the JDBCDialect's quoting function to quote the table/column names.
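    For instance, the dialect-specific quoting differs roughly as below (a simplified sketch mirroring the shape of `JdbcDialect.quoteIdentifier`, not the actual Spark classes; escaping embedded quote characters by doubling is an extra step this sketch adds that the real dialects may not perform):

```scala
// Simplified sketch of per-dialect identifier quoting; not Spark's actual
// JdbcDialect hierarchy. Embedded quote characters are escaped by doubling.
trait DialectSketch {
  def quoteIdentifier(name: String): String
}

object MySqlStyle extends DialectSketch {
  // MySQL quotes identifiers with backticks.
  override def quoteIdentifier(name: String): String =
    "`" + name.replace("`", "``") + "`"
}

object AnsiStyle extends DialectSketch {
  // ANSI SQL (PostgreSQL and others) quotes identifiers with double quotes.
  override def quoteIdentifier(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""
}
```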





[GitHub] spark pull request: [SPARK-12437][SQL] Encapsulate the table and c...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10403#discussion_r49266438
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
    @@ -90,6 +90,15 @@ abstract class JdbcDialect extends Serializable {
       }
     
       /**
    +   * Parses a table name that may contain special characters, such as a "." separating
    +   * the database name from the table name (e.g. "some database"."some-table-name"),
    +   * or characters that require quoting (e.g. spaces).
    +   */
    +  def parseTableName(tableName: String): String = {
    --- End diff --
    
    The method contract here is confusing to me: is this supposed to introduce an additional layer of quotes? I find it confusing to couple both the responsibilities of quoting identifiers and parsing table names from a schema-qualified name.
    
    I'd prefer to have this return just the table name, unquoted, and leave it up to later components to escape it. If you _do_ think we should do quoting here, please explain that part of the API contract in the Scaladoc.
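    Keeping the two responsibilities separate, as suggested, might look something like this (illustrative names only, not Spark's API): parsing yields unquoted segments, and a dialect-supplied quoting function is applied afterwards.

```scala
// Illustrative only: parsing and quoting as two independent steps. The naive
// split below ignores dots inside quoted segments; a real implementation
// would need quote-aware splitting as discussed earlier in this thread.
object QualifiedNameSketch {
  // Step 1: parse into unquoted segments.
  def parse(raw: String): Seq[String] =
    raw.split('.').map(_.stripPrefix("\"").stripSuffix("\"")).toSeq

  // Step 2: render with a dialect-supplied quoting function.
  def render(segments: Seq[String], quote: String => String): String =
    segments.map(quote).mkString(".")
}
```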

