You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "James Starr (Jira)" <ji...@apache.org> on 2021/04/17 05:04:00 UTC
[jira] [Updated] (CALCITE-4493) Postgres String Literal Equality
Creates incorrect results.
[ https://issues.apache.org/jira/browse/CALCITE-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Starr updated CALCITE-4493:
---------------------------------
Description:
Calcite always interprets String Literals as CHAR type. This is normally not an issue since if are literal is being compared to VARCHAR column, then the literal is cast to a VARCHAR. VARCHAR and CHAR equality have slight different behavior. CHAR ignores trailing whitespace when comparing, where VARCHAR does not. Postgres defaults the a strings literals type as unkown in a given expression, but will default them to text if there evaluated to row.
Postgres treats string literals types as unknown or TEXT. Postgres TEXT follows VARCHAR semantics.
RelToSqlConverterTest.java
{code:java}
@Test void testDefaultTypeStringLiteralType() {
RuleSet rules = RuleSets.ofList(CoreRules.PROJECT_VALUES_MERGE, CoreRules.PROJECT_REDUCE_EXPRESSIONS);
HepProgramBuilder builder = new HepProgramBuilder();
builder.addRuleClass(ProjectToWindowRule.class);
HepPlanner hepPlanner = new HepPlanner(builder.build());
sql("SELECT 'foo ' = 'foo'")
.optimize(rules, hepPlanner)
.withPostgresql().ok("SELECT *\nFROM (VALUES (TRUE)) AS \"t\" (\"EXPR$0\")");
}
{code}
This is generating incorrect results since postgres would treat the literals TEXT which follow the VARCHAR semantics and treats trailing whitespace as significant.
{code:sql}
postgres=# SELECT pg_typeof(a), pg_typeof('b') FROM (VALUES('foo')) AS t(a);
pg_typeof | pg_typeof
-----------+-----------
text | unknown
(1 row)
{code}
{code:sql}
postgres=# SELECT 'foo ' = 'foo';
?column?
----------
f
(1 row)
{code}
Concerning other dialects, I ran the following query:
{code:sql}
SELECT
case when t.a = t.b then 'true' else 'false' end as example1, --example 1 No Type Value literal equality
case when 'foo' = 'foo ' then 'true' else 'false' end as example2, --example 2 No Type literal comparison
case when t.a = 'foo ' then 'true' else 'false' end as example3, --example 3 No Type Value to No Type Literal
case when t.c = 'foo ' then 'true' else 'false' end as example4, --example 4 Typed Value to No Type Literal
CASE WHEN t.b = CAST('foo' AS CHAR(3)) then 'true' else 'false' end as example5, --example 5 No Type Value to Typed Literal
'|' || CAST('a ' AS VARCHAR(2)) || '|' as example6
FROM (
select 'foo' a, 'foo ' b, CAST('foo' AS CHAR(3)) c --FROM DUAL
) t
;
{code}
DB
a = b
‘foo’ = ‘foo ‘
a = ‘foo ‘
c = ‘foo ‘
b = CAST(‘foo’ AS CHAR(3))
CAST(‘a ‘ TO VARCHAR(2)
Dremio
false
false
false
false
false
|a |
Postgres
false
false
false
true
false
|a |
Oracle
false
false
false
true
true
|a |
MySQL
false
false
false
false
false
|a |
SQL Server
true
true
true
true
true
|a |
Snowflake
false
false
false
false
false
|a |
ISO SQL
true
true
true
true
true
|a|
was:
Calcite always interprets String Literals as CHAR type. This is normally not an issue since if are literal is being compared to VARCHAR column, then the literal is cast to a VARCHAR. VARCHAR and CHAR equality have slight different behavior. CHAR ignores trailing whitespace when comparing, where VARCHAR does not. Postgres defaults the a strings literals type as unkown in a given expression, but will default them to text if there evaluated to row.
Postgres treats string literals types as unknown or TEXT. Postgres TEXT follows VARCHAR semantics.
RelToSqlConverterTest.java
{code:java}
@Test void testDefaultTypeStringLiteralType() {
RuleSet rules = RuleSets.ofList(CoreRules.PROJECT_VALUES_MERGE, CoreRules.PROJECT_REDUCE_EXPRESSIONS);
HepProgramBuilder builder = new HepProgramBuilder();
builder.addRuleClass(ProjectToWindowRule.class);
HepPlanner hepPlanner = new HepPlanner(builder.build());
sql("SELECT 'foo ' = 'foo'")
.optimize(rules, hepPlanner)
.withPostgresql().ok("SELECT *\nFROM (VALUES (TRUE)) AS \"t\" (\"EXPR$0\")");
}
{code}
This is generating incorrect results since postgres would treat the literals TEXT which follow the VARCHAR semantics and treats trailing whitespace as significant.
{code:sql}
postgres=# SELECT pg_typeof(a), pg_typeof('b') FROM (VALUES('foo')) AS t(a);
pg_typeof | pg_typeof
-----------+-----------
text | unknown
(1 row)
{code}
{code:sql}
postgres=# SELECT 'foo ' = 'foo';
?column?
----------
f
(1 row)
{code}
> Postgres String Literal Equality Creates incorrect results.
> -----------------------------------------------------------
>
> Key: CALCITE-4493
> URL: https://issues.apache.org/jira/browse/CALCITE-4493
> Project: Calcite
> Issue Type: Bug
> Reporter: James Starr
> Priority: Major
>
> Calcite always interprets String Literals as CHAR type. This is normally not an issue since if are literal is being compared to VARCHAR column, then the literal is cast to a VARCHAR. VARCHAR and CHAR equality have slight different behavior. CHAR ignores trailing whitespace when comparing, where VARCHAR does not. Postgres defaults the a strings literals type as unkown in a given expression, but will default them to text if there evaluated to row.
> Postgres treats string literals types as unknown or TEXT. Postgres TEXT follows VARCHAR semantics.
> RelToSqlConverterTest.java
> {code:java}
> @Test void testDefaultTypeStringLiteralType() {
> RuleSet rules = RuleSets.ofList(CoreRules.PROJECT_VALUES_MERGE, CoreRules.PROJECT_REDUCE_EXPRESSIONS);
> HepProgramBuilder builder = new HepProgramBuilder();
> builder.addRuleClass(ProjectToWindowRule.class);
> HepPlanner hepPlanner = new HepPlanner(builder.build());
> sql("SELECT 'foo ' = 'foo'")
> .optimize(rules, hepPlanner)
> .withPostgresql().ok("SELECT *\nFROM (VALUES (TRUE)) AS \"t\" (\"EXPR$0\")");
> }
> {code}
> This is generating incorrect results since postgres would treat the literals TEXT which follow the VARCHAR semantics and treats trailing whitespace as significant.
> {code:sql}
> postgres=# SELECT pg_typeof(a), pg_typeof('b') FROM (VALUES('foo')) AS t(a);
> pg_typeof | pg_typeof
> -----------+-----------
> text | unknown
> (1 row)
> {code}
>
> {code:sql}
> postgres=# SELECT 'foo ' = 'foo';
> ?column?
> ----------
> f
> (1 row)
> {code}
> Concerning other dialects, I ran the following query:
> {code:sql}
> SELECT
> case when t.a = t.b then 'true' else 'false' end as example1, --example 1 No Type Value literal equality
> case when 'foo' = 'foo ' then 'true' else 'false' end as example2, --example 2 No Type literal comparison
> case when t.a = 'foo ' then 'true' else 'false' end as example3, --example 3 No Type Value to No Type Literal
> case when t.c = 'foo ' then 'true' else 'false' end as example4, --example 4 Typed Value to No Type Literal
> CASE WHEN t.b = CAST('foo' AS CHAR(3)) then 'true' else 'false' end as example5, --example 5 No Type Value to Typed Literal
> '|' || CAST('a ' AS VARCHAR(2)) || '|' as example6
> FROM (
> select 'foo' a, 'foo ' b, CAST('foo' AS CHAR(3)) c --FROM DUAL
> ) t
> ;
> {code}
> DB
> a = b
> ‘foo’ = ‘foo ‘
> a = ‘foo ‘
> c = ‘foo ‘
> b = CAST(‘foo’ AS CHAR(3))
> CAST(‘a ‘ TO VARCHAR(2)
> Dremio
> false
> false
> false
> false
> false
> |a |
> Postgres
> false
> false
> false
> true
> false
> |a |
> Oracle
> false
> false
> false
> true
> true
> |a |
> MySQL
> false
> false
> false
> false
> false
> |a |
> SQL Server
> true
> true
> true
> true
> true
> |a |
> Snowflake
> false
> false
> false
> false
> false
> |a |
> ISO SQL
> true
> true
> true
> true
> true
> |a|
--
This message was sent by Atlassian Jira
(v8.3.4#803005)