Posted to dev@spark.apache.org by Nico Pappagianis <ni...@salesforce.com> on 2016/10/10 19:49:59 UTC

Quotes within a table name (phoenix table) getting failure: identifier expected at Spark level parsing

Hello,

*Some context:*
I have a Phoenix tenant-specific view named CUSTOM_ENTITY."z02" (Phoenix
table names can be double-quoted to make them case-sensitive). I am
attempting to write to this view using Spark via a Scala script. The
following read succeeds:

val table = """CUSTOM_ENTITY."z02""""
val tenantId = "myTenantId"
val urlWithTenant =
  s"jdbc:phoenix:myZKHost1,myZKHost2,myZKHost3:2181;TenantId=$tenantId"
val driver = "org.apache.phoenix.jdbc.PhoenixDriver"

val readOptions =
  Map("driver" -> driver, "url" -> urlWithTenant, "dbtable" -> table)

val df = sqlContext.read.format("jdbc").options(readOptions).load

This gives me the dataframe with data successfully read from my tenant view.
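
(I assume the read works because the JDBC data source embeds the dbtable
string verbatim in the SQL it generates to resolve the schema, so Phoenix
parses its own double-quote syntax and Spark's parser never sees it.
Roughly:)

// Roughly the probe query the JDBC relation issues at load time;
// the table string is interpolated as-is, so Spark's SQL parser
// never sees the double quotes:
val probe = s"SELECT * FROM $table WHERE 1=0"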

Now when I try to write back with this dataframe:

df.write.format("jdbc").insertInto(table)


I am getting the following exception:

java.lang.RuntimeException: [1.15] failure: identifier expected

CUSTOM_ENTITY."z02"

                              ^

(caret is pointing under the '.' before "z02")

at scala.sys.package$.error(package.scala:27)

at org.apache.spark.sql.catalyst.SqlParser$.parseTableIdentifier(SqlParser.scala:56)

at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:164)

Looking at the stack trace, it appears that Spark's table-identifier parser
doesn't know what to do with the double quotes around z02. I've tried
escaping them in every way I could think of, but to no avail.
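
To give a sense of "every way": a few of the variants I tried
(illustrative, not exhaustive), all of which hit the same
identifier-expected failure:

// Each of these still dies in SqlParser.parseTableIdentifier;
// the quoting style, not the escaping, is what Spark objects to:
df.write.format("jdbc").insertInto("""CUSTOM_ENTITY."z02"""")
df.write.format("jdbc").insertInto("CUSTOM_ENTITY.\"z02\"")
df.write.format("jdbc").insertInto("\"CUSTOM_ENTITY\".\"z02\"")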

Is there a way to have Spark not complain about the quotes and correctly
pass them along?

Thanks

Re: Quotes within a table name (phoenix table) getting failure: identifier expected at Spark level parsing

Posted by Xiao Li <ga...@gmail.com>.
Hi, Nico,

It sounds like you hit a bug in the Phoenix connector. I believe our
general JDBC connector has already fixed this.
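
If it is an option, the generic JDBC write path takes the table name as an
opaque string, so Spark's identifier parser never sees it. A sketch
(untested against Phoenix, which speaks UPSERT rather than INSERT, so the
write itself may still need the Phoenix integration):

import java.util.Properties

// DataFrameWriter.jdbc embeds the table string verbatim in the SQL it
// generates, bypassing parseTableIdentifier entirely:
val props = new Properties()
props.setProperty("driver", driver)
df.write.mode("append").jdbc(urlWithTenant, table, props)

Alternatively, the phoenix-spark integration writes through the Phoenix
client rather than SQL text, sidestepping both parsers. A sketch, assuming
phoenix-spark is on the classpath (it only supports SaveMode.Overwrite for
DataFrame writes, and I have not checked its handling of tenant-specific
views):

import org.apache.spark.sql.SaveMode

df.write
  .format("org.apache.phoenix.spark")
  .mode(SaveMode.Overwrite)
  .option("table", "CUSTOM_ENTITY.\"z02\"")
  .option("zkUrl", "myZKHost1:2181")
  .save()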

Thanks,

Xiao

2016-10-10 15:29 GMT-07:00 Nico Pappagianis <nico.pappagianis@salesforce.com>:

> Hi Xiao, when I try that it gets past Spark's SQL parser, but then errors
> out in the Phoenix SQL parser:
>
> org.apache.phoenix.exception.PhoenixParserException: ERROR 601 (42P00):
> Syntax error. Unexpected char: '`'
>
> at org.apache.phoenix.exception.PhoenixParserException.newException(PhoenixParserException.java:33)
>
> at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:118)
>
> at org.apache.phoenix.jdbc.PhoenixStatement$PhoenixStatementParser.parseStatement(PhoenixStatement.java:1228)
>
> at org.apache.phoenix.jdbc.PhoenixStatement.parseStatement(PhoenixStatement.java:1311)
>
> at org.apache.phoenix.jdbc.PhoenixPreparedStatement.<init>(PhoenixPreparedStatement.java:94)
>
> at org.apache.phoenix.jdbc.PhoenixConnection.prepareStatement(PhoenixConnection.java:714)
>
>
> It appears that Spark's and Phoenix's SQL parsers disagree: Spark wants
> backticks, while Phoenix wants double quotes.
>
> Any ideas?
>
>
> Thanks!
>
> On Mon, Oct 10, 2016 at 3:10 PM, Xiao Li <ga...@gmail.com> wrote:
>
>> Hi, Nico,
>>
>> We use backticks to quote identifiers. For example,
>>
>> CUSTOM_ENTITY.`z02`
>>
>> Thanks,
>>
>> Xiao Li
>>
>> 2016-10-10 12:49 GMT-07:00 Nico Pappagianis <nico.pappagianis@salesforce.com>:
>>
>>> [original message snipped; quoted in full at the top of this thread]
>>
>

Re: Quotes within a table name (phoenix table) getting failure: identifier expected at Spark level parsing

Posted by Xiao Li <ga...@gmail.com>.
Hi, Nico,

We use backticks to quote identifiers. For example,

CUSTOM_ENTITY.`z02`
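
In your Scala snippet that would be, e.g.:

// Backticks are Spark SQL's identifier quoting, so Spark's table
// parser accepts this (whether Phoenix accepts the backticks
// downstream is a separate question):
val table = "CUSTOM_ENTITY.`z02`"
df.write.format("jdbc").insertInto(table)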

Thanks,

Xiao Li

2016-10-10 12:49 GMT-07:00 Nico Pappagianis <nico.pappagianis@salesforce.com>:

> [original message snipped; quoted in full at the top of this thread]