Posted to user@spark.apache.org by Pei-Lun Lee <pl...@appier.com> on 2015/03/19 05:20:59 UTC

SparkSQL 1.3.0 JDBC data source issues

Hi,

I am trying the JDBC data source in Spark SQL 1.3.0 and found some issues.

First, a filter of the form "where str_col='value'" fails on both
PostgreSQL and MySQL:

psql> create table foo(id int primary key,name text,age int);
bash> SPARK_CLASSPATH=postgresql-9.4-1201-jdbc41.jar spark/bin/spark-shell
scala> sqlContext.load("jdbc",Map("url"->"jdbc:postgresql://XXX","dbtable"->"foo")).registerTempTable("foo")
scala> sql("select * from foo where name='bar'").collect
org.postgresql.util.PSQLException: ERROR: operator does not exist: text = bar
  Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
  Position: 40
scala> sql("select * from foo where name like '%foo'").collect

bash> SPARK_CLASSPATH=mysql-connector-java-5.1.34.jar spark/bin/spark-shell
scala> sqlContext.load("jdbc",Map("url"->"jdbc:mysql://XXX","dbtable"->"foo")).registerTempTable("foo")
scala> sql("select * from foo where name='bar'").collect
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'bar' in 'where clause'
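
Both error messages show the literal arriving at the database as a bare
identifier (text = bar, unknown column 'bar'), which suggests the filter is
pushed down without quoting the string value. The following is a minimal
sketch of the quoting step that appears to be missing. FilterSketch,
compileValue and equalTo are hypothetical names used for illustration only;
this is not Spark's actual code.

```scala
// Hypothetical helper illustrating literal quoting; not Spark's implementation.
object FilterSketch {
  // Render a literal so that it is valid inside a SQL WHERE clause.
  // Null handling is deliberately left out of this sketch.
  def compileValue(value: Any): String = value match {
    case s: String => "'" + s.replace("'", "''") + "'" // quote, and double embedded quotes
    case other     => other.toString                   // numbers pass through unchanged
  }

  // Build "column = literal" for an equality filter.
  def equalTo(column: String, value: Any): String =
    s"$column = ${compileValue(value)}"

  def main(args: Array[String]): Unit = {
    println(equalTo("name", "bar"))     // name = 'bar'
    println(equalTo("name", "O'Brien")) // name = 'O''Brien'
    println(equalTo("age", 30))         // age = 30
  }
}
```

Without the String case, the first call would produce name = bar, which
matches what PostgreSQL and MySQL report above.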



Second, a PostgreSQL table with a json column cannot be loaded:

psql> create table foo(id int primary key, data json);
scala> sqlContext.load("jdbc",Map("url"->"jdbc:postgresql://XXX","dbtable"->"foo")).registerTempTable("foo")
java.sql.SQLException: Unsupported type 1111
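
The code 1111 in this exception is the JDBC constant java.sql.Types.OTHER,
which is how the json column is being reported here. One possible workaround,
sketched below and not verified against 1.3.0, is to pass a subquery as
"dbtable" that casts the column to text so it is read as a string.
JsonColumnSketch, options and the alias foo_text are hypothetical names; the
URL is the same placeholder used above.

```scala
import java.sql.Types

// Hypothetical sketch of a workaround; not verified against Spark 1.3.0.
object JsonColumnSketch {
  // The number in "Unsupported type 1111" is this JDBC type code.
  val unsupportedCode: Int = Types.OTHER

  // Assumption: "dbtable" accepts a parenthesised subquery with an alias,
  // so the json column can be cast to text on the PostgreSQL side.
  def options(url: String): Map[String, String] = Map(
    "url"     -> url,
    "dbtable" -> "(select id, data::text as data from foo) as foo_text"
  )

  def main(args: Array[String]): Unit = {
    println(unsupportedCode) // 1111
    // In spark-shell, the options would be used like this:
    // sqlContext.load("jdbc", options("jdbc:postgresql://XXX")).registerTempTable("foo")
  }
}
```

The data column then arrives as a plain string, so any JSON parsing has to
happen on the Spark side.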



I am not sure whether these are bugs in Spark SQL or in the JDBC drivers.
I can file a JIRA ticket if needed.

Thanks,
--
Pei-Lun

Re: SparkSQL 1.3.0 JDBC data source issues

Posted by Pei-Lun Lee <pl...@appier.com>.
JIRA and PR for the first issue:
https://issues.apache.org/jira/browse/SPARK-6408
https://github.com/apache/spark/pull/5087

On Thu, Mar 19, 2015 at 12:20 PM, Pei-Lun Lee <pl...@appier.com> wrote:

> Hi,
>
> I am trying jdbc data source in spark sql 1.3.0 and found some issues.
>
> First, the syntax "where str_col='value'" will give error for both
> postgresql and mysql:
>
> psql> create table foo(id int primary key,name text,age int);
> bash> SPARK_CLASSPATH=postgresql-9.4-1201-jdbc41.jar spark/bin/spark-shell
> scala>
> sqlContext.load("jdbc",Map("url"->"jdbc:postgresql://XXX","dbtable"->"foo")).registerTempTable("foo")
> scala> sql("select * from foo where name='bar'").collect
> org.postgresql.util.PSQLException: ERROR: operator does not exist: text =
> bar
>   Hint: No operator matches the given name and argument type(s). You might
> need to add explicit type casts.
>   Position: 40
> scala> sql("select * from foo where name like '%foo'").collect
>
> bash> SPARK_CLASSPATH=mysql-connector-java-5.1.34.jar spark/bin/spark-shell
> scala>
> sqlContext.load("jdbc",Map("url"->"jdbc:mysql://XXX","dbtable"->"foo")).registerTempTable("foo")
> scala> sql("select * from foo where name='bar'").collect
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column
> 'bar' in 'where clause'
>
>
>
> Second, postgresql table with json data type does not work:
>
> psql> create table foo(id int primary key, data json);
> scala>
> sqlContext.load("jdbc",Map("url"->"jdbc:mysql://XXX","dbtable"->"foo")).registerTempTable("foo")
> java.sql.SQLException: Unsupported type 1111
>
>
>
> Not sure these are bug in spark sql or jdbc. I can file JIRA ticket if
> needed.
>
> Thanks,
> --
> Pei-Lun
>
>
