Posted to user@spark.apache.org by "Vieira, Thiago" <Th...@adidas.com.INVALID> on 2023/01/04 10:41:35 UTC

[BUG?] How to handle or escape special characters in Spark version 3.3.0?

Hello everyone,

I’ve already raised this question on Stack Overflow, but I believe this is a bug in the new Spark version, so I am also sending this email.

Previously I was using Spark 3.2.1 to read data from an SAP database through the JDBC connector, and I had no issues performing the following steps:

df_1 = spark.read.format("jdbc") \
    .option("url", "URL_LINK") \
    .option("dbtable", 'DATABASE."/ABC/TABLE"') \
    .option("user", "USER_HERE") \
    .option("password", "PW_HERE") \
    .option("driver", "com.sap.db.jdbc.Driver") \
    .load()

display(df_1)

df_2 = df_1.filter("`/ABC/COLUMN` = 'ID_HERE'")

display(df_2)


The code above runs as it should and returns the expected rows.

After I upgraded to Spark 3.3.0, because I need the new 'availableNow' trigger (a trigger for streaming queries), the process above started to fail and no longer runs at all.

Please see the error message below.


---------------------------------------------------------------------------
ParseException                            Traceback (most recent call last)
<command-963568451378752> in <cell line: 3>()
      1 df_2 = df_1.filter("`/ABC/COLUMN` = 'ID_HERE'")
      2
----> 3 display(df_2)

/databricks/python_shell/dbruntime/display.py in display(self, input, *args, **kwargs)
     81                     raise Exception('Triggers can only be set for streaming queries.')
     82
---> 83                 self.add_custom_display_data("table", input._jdf)
     84
     85         elif isinstance(input, list):

/databricks/python_shell/dbruntime/display.py in add_custom_display_data(self, data_type, data)
     34     def add_custom_display_data(self, data_type, data):
     35         custom_display_key = str(uuid.uuid4())
---> 36         return_code = self.entry_point.addCustomDisplayData(custom_display_key, data_type, data)
     37         ip_display({
     38             "application/vnd.databricks.v1+display": custom_display_key,

/databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1319
   1320         answer = self.gateway_client.send_command(command)
-> 1321         return_value = get_return_value(
   1322             answer, self.gateway_client, self.target_id, self.name)
   1323

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    200                 # Hide where the exception came from that shows a non-Pythonic
    201                 # JVM exception message.
--> 202                 raise converted from None
    203             else:
    204                 raise

ParseException:
[PARSE_SYNTAX_ERROR] Syntax error at or near '/': extra input '/'(line 1, pos 0)

== SQL ==
/ABC/COLUMN
^^^

I have already tried many different ways of formatting the expression, following the instructions at https://spark.apache.org/docs/latest/sql-ref-literals.html. I also tried formatting the string beforehand and tried a raw string, but nothing works as expected.
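
For reference, the backtick-quoted form used in the filter can be generated instead of typed by hand. The sketch below is only illustrative: quote_identifier and equals_filter are hypothetical helper names, not part of PySpark, and the value is assumed to contain no quotes or backslashes. It only builds the condition string and is not expected to avoid the ParseException by itself.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def equals_filter(column: str, value: str) -> str:
    """Build a filter string comparing a quoted column to a string literal.

    Assumption: `value` contains no quotes or backslashes, so it is
    inserted into the literal without any escaping.
    """
    return f"{quote_identifier(column)} = '{value}'"


condition = equals_filter("/ABC/COLUMN", "ID_HERE")
print(condition)  # prints: `/ABC/COLUMN` = 'ID_HERE'
```

The resulting string is the same one passed to df_1.filter(...) in the failing cell, which suggests the quoting of the filter expression itself is not what the parser is rejecting.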

Another important detail: I tried to create a dummy example so that you could replicate the issue, but when I create tables with slashes in the name ('/ABC/TABLE') containing columns with slashes in the name ('/ABC/COLUMN') directly in PySpark, instead of using the JDBC connector, it actually works and I am able to filter. So I believe this error is related to SQL / JDBC: I am no longer able to escape special characters in Spark 3.3.0.
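
One workaround that may be worth testing is to alias the problematic columns on the database side, so that Spark only ever sees names without slashes. The JDBC "dbtable" option accepts a parenthesized subquery with an alias, and the sketch below builds such a string. The helper names safe_alias and aliased_subquery are hypothetical, the double-quote identifier quoting is assumed to match what the SAP database expects, and none of this has been verified against Spark 3.3.0.

```python
import re
from typing import Iterable


def safe_alias(column: str) -> str:
    """Replace runs of non-alphanumeric characters with '_' and trim the ends.

    Note: different columns could map to the same alias; that case is not
    handled in this sketch.
    """
    return re.sub(r"[^0-9A-Za-z]+", "_", column).strip("_")


def aliased_subquery(table: str, columns: Iterable[str], alias: str = "t") -> str:
    """Build a subquery that renames each column to a slash-free alias."""
    parts = []
    for column in columns:
        # Standard SQL quoting: double any embedded double quote.
        quoted = '"' + column.replace('"', '""') + '"'
        parts.append(f"{quoted} AS {safe_alias(column)}")
    return f"(SELECT {', '.join(parts)} FROM {table}) {alias}"


dbtable = aliased_subquery('DATABASE."/ABC/TABLE"', ["/ABC/COLUMN"])
print(dbtable)
# prints: (SELECT "/ABC/COLUMN" AS ABC_COLUMN FROM DATABASE."/ABC/TABLE") t
```

The string would replace the current value of the "dbtable" option, and the filter would then be written against ABC_COLUMN instead of the backtick-quoted name.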


Regards,

Thiago Vieira
Data Engineer
