Posted to issues@spark.apache.org by "Francisco Miguel Biete Banon (JIRA)" <ji...@apache.org> on 2019/01/31 16:26:00 UTC

[jira] [Updated] (SPARK-26800) JDBC - MySQL nullable option is ignored

     [ https://issues.apache.org/jira/browse/SPARK-26800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francisco Miguel Biete Banon updated SPARK-26800:
-------------------------------------------------
    Description: 
Spark 2.4.0
MySQL 5.7.21 (docker official MySQL image running with default config)

Writing a DataFrame with nullable fields results in a table with NOT NULL columns in MySQL.

{code:java}
import org.apache.spark.sql.types._
import org.apache.spark.sql.{Row, SaveMode}
import java.sql.Timestamp

val data = Seq[Row](Row(1, null, "Boston"), Row(2, null, "New York"))
val schema = StructType(
  StructField("id", IntegerType, true) ::
  StructField("when", TimestampType, true) ::
  StructField("city", StringType, true) :: Nil)
println(schema.toDDL)

val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)

df.write.mode(SaveMode.Overwrite).jdbc(jdbcUrl, "temp_bug", jdbcProperties){code}

Produces

{code}
CREATE TABLE `temp_bug` (
  `id` int(11) DEFAULT NULL,
  `when` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `city` text
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
{code}
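
For context: on MySQL 5.7 with the default configuration (explicit_defaults_for_timestamp=OFF), a TIMESTAMP column declared without an explicit NULL keyword is implicitly NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP. My guess is that the JDBC writer emits NOT NULL only for non-nullable fields and no nullability keyword at all for nullable ones, which leaves MySQL free to apply that implicit rule. A minimal sketch of that assumed logic (the names below are illustrative, not the actual Spark internals):

{code:java}
import org.apache.spark.sql.types._

// Illustrative sketch only: how a JDBC writer could render the column list
// of a CREATE TABLE statement from a DataFrame schema. Nullable fields get
// no keyword at all, so MySQL applies its implicit TIMESTAMP rule to `when`.
def columnDdl(schema: StructType): String =
  schema.fields.map { f =>
    val sqlType = f.dataType match {
      case IntegerType   => "INTEGER"
      case TimestampType => "TIMESTAMP"
      case StringType    => "TEXT"
      case other         => other.sql
    }
    val nullability = if (f.nullable) "" else " NOT NULL"
    s"`${f.name}` $sqlType$nullability"
  }.mkString(",\n")

// For the schema above this yields no NULL keyword on `when`:
//   `id` INTEGER,
//   `when` TIMESTAMP,
//   `city` TEXT
{code}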

I would expect the "when" column to be defined as nullable.
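
A possible workaround in the meantime (a sketch, assuming it is acceptable to pre-create the table): create the table manually with an explicit NULL on the timestamp column and let Spark append into it instead of issuing the CREATE TABLE itself. jdbcUrl and jdbcProperties below are the same values used in the repro above.

{code:java}
import java.sql.DriverManager

// Pre-create the table with an explicit NULL on the `when` column so MySQL
// does not apply its implicit NOT NULL DEFAULT CURRENT_TIMESTAMP rule.
val conn = DriverManager.getConnection(jdbcUrl, jdbcProperties)
try {
  val stmt = conn.createStatement()
  stmt.executeUpdate("DROP TABLE IF EXISTS temp_bug")
  stmt.executeUpdate(
    "CREATE TABLE temp_bug (" +
      "`id` INT NULL, `when` TIMESTAMP NULL, `city` TEXT" +
    ") ENGINE=InnoDB")
  stmt.close()
} finally {
  conn.close()
}

// Append so Spark reuses the existing table definition instead of
// dropping and recreating it with its own generated DDL.
df.write.mode(SaveMode.Append).jdbc(jdbcUrl, "temp_bug", jdbcProperties)
{code}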

  was:
Spark 2.4.0
MySQL 5.7.21 (docker official MySQL image running with default config)

Writing a DataFrame with nullable fields results in a table with NOT NULL columns in MySQL.

{code:java}
import org.apache.spark.sql.types._
import org.apache.spark.sql.{Row, SaveMode}
import java.sql.Timestamp

val data = Seq[Row](Row(1, null, "Boston"), Row(2, null, "New York"))
val schema = StructType(
  StructField("id", IntegerType) ::
  StructField("when", TimestampType, true) ::
  StructField("city", StringType) :: Nil)
println(schema.toDDL)

val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)

df.write.mode(SaveMode.Overwrite).jdbc(jdbcUrl, "temp_bug", jdbcProperties){code}

Produces

{code}
CREATE TABLE `temp_bug` (
  `id` int(11) DEFAULT NULL,
  `when` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `city` text
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
{code}

I would expect the "when" column to be defined as nullable.


> JDBC - MySQL nullable option is ignored
> ---------------------------------------
>
>                 Key: SPARK-26800
>                 URL: https://issues.apache.org/jira/browse/SPARK-26800
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Francisco Miguel Biete Banon
>            Priority: Minor


