Posted to issues@beam.apache.org by "Yueyang Qiu (Jira)" <ji...@apache.org> on 2020/01/25 00:33:00 UTC
[jira] [Assigned] (BEAM-9180) [ZetaSQL] Support 4-byte unicode in literal string unparsing
[ https://issues.apache.org/jira/browse/BEAM-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yueyang Qiu reassigned BEAM-9180:
---------------------------------
Assignee: Yueyang Qiu
> [ZetaSQL] Support 4-byte unicode in literal string unparsing
> ------------------------------------------------------------
>
> Key: BEAM-9180
> URL: https://issues.apache.org/jira/browse/BEAM-9180
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql-zetasql
> Reporter: Kirill Kozlov
> Assignee: Yueyang Qiu
> Priority: Major
>
> When unparsing literal strings we need to escape special symbols (e.g. `\n`, `\r`, `\u0012`).
> ZetaSQL supports some 4-byte (8 hex digit) Unicode escapes via `\Uhhhhhhhh`.
> As of now (https://github.com/apache/beam/blob/8a35f408f640d04c38ad6e2a497d30410b3bff32/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BeamSqlUnparseContext.java#L59), only 2-byte (4 hex digit) Unicode is supported, by escaping it via `\u`.
>
> More about escape sequences here (need to scroll down a little):
> https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical
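
For illustration only, here is a minimal Java sketch of the escaping the issue asks for. It is not the code in BeamSqlUnparseContext: the class name ZetaSqlStringEscaper and the method escape are hypothetical, and the choice to escape the single quote and to pass printable ASCII through unchanged is an assumption. The idea is to walk the string by code point rather than by char, so that a surrogate pair is emitted as one 8-hex-digit `\U` escape instead of two 4-hex-digit escapes.

```java
// Hypothetical helper, not part of Beam. It sketches one way to escape a
// string for a ZetaSQL literal, including code points above U+FFFF.
public final class ZetaSqlStringEscaper {
    private ZetaSqlStringEscaper() {}

    public static String escape(String input) {
        StringBuilder sb = new StringBuilder();
        // Iterate by code point so surrogate pairs are handled as one unit.
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                case '\\': sb.append("\\\\"); break;
                case '\'': sb.append("\\'"); break; // assumption: single-quoted literal
                default:
                    if (cp >= 0x20 && cp < 0x7f) {
                        // Printable ASCII is copied through unchanged.
                        sb.append((char) cp);
                    } else if (cp <= 0xFFFF) {
                        // Basic Multilingual Plane: lowercase u, 4 hex digits.
                        sb.append(String.format("\\u%04x", cp));
                    } else {
                        // Supplementary planes: uppercase U, 8 hex digits.
                        sb.append(String.format("\\U%08x", cp));
                    }
            }
        }
        return sb.toString();
    }
}
```

Under these assumptions, a string holding the single code point U+1F600 would be unparsed as a backslash followed by U0001f600, while a control character such as U+0012 would still use the 4-digit form.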
--
This message was sent by Atlassian Jira
(v8.3.4#803005)