You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/10 11:55:49 UTC
[GitHub] [arrow-datafusion] alamb opened a new issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
alamb opened a new issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432
**Describe the bug**
Field names containing period such as `f.c1` cannot be named in SQL query
**To Reproduce**
Adapted from https://github.com/influxdata/influxdb_iox/issues/3351 from @jacobmarble
```shell
echo "1" > /tmp/foo.csv
cargo run --bin datafusion-cli
```
```sql
❯ create external table foo("f.c1" int) stored as CSV location '/tmp/foo.csv';
0 rows in set. Query took 0.001 seconds.
❯ select * from foo;
+------+
| f.c1 |
+------+
| 1 |
+------+
1 row in set. Query took 0.009 seconds.
❯ select f.c1 from foo;
Plan("No field named 'f.c1'. Valid fields are 'foo.f.c1'.")
❯ select "f.c1" from foo;
Plan("No field named 'f.c1'. Valid fields are 'foo.f.c1'.")
❯ select "foo.f.c1" from foo;
Plan("Invalid identifier '#foo.f.c1' for schema foo.f.c1")
```
**Expected behavior**
I expect the query `select "f.c1" from foo;` to return `1` (note that "f.c1" is quoted)
**Additional context**
Found during testing of IOx https://github.com/influxdata/influxdb_iox/issues/3351
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] liukun4515 edited a comment on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
liukun4515 edited a comment on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991051157
In my opinion, this `f1.c1` style is illegal for SQL syntax.
For example in Mysql and PG
```
mysql> create table t5("f.c1" int);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"f.c1" int)' at line 1
postgres=# create table t5("f.c1" int);
CREATE TABLE
```
In the SparkSQL
```
spark-sql> create table t5("f.c1" int);
Error in query:
no viable alternative at input '("f.c1"'(line 1, pos 16)
== SQL ==
create table t5("f.c1" int)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-994115896
Proposed fix in https://github.com/apache/arrow-datafusion/pull/1449
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-992373241
Thanks @houqp -- I'll take a shot at this later today
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] liukun4515 commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991051157
In my opinion, this `f1.c1` style is illegal for SQL syntax.
For example in Mysql and PG
```
mysql> create table t4(f.c1 int);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.c1 int)' at line 1
postgres=# create table t4(f.c1 int);
ERROR: syntax error at or near "."
LINE 1: create table t4(f.c1 int);
```
In the SparkSQL
```
spark-sql> create table t4(f.c1 int);
Error in query:
extraneous input '.' expecting {'ADD', 'AFTER', 'ALL', 'ALTER', 'ANALYZE', 'AND', 'ANTI', 'ANY', 'ARCHIVE', 'ARRAY', 'AS', 'ASC', 'AT', 'AUTHORIZATION', 'BETWEEN', 'BOTH', 'BUCKET', 'BUCKETS', 'BY', 'CACHE', 'CASCADE', 'CASE', 'CAST', 'CHANGE', 'CHECK', 'CLEAR', 'CLUSTER', 'CLUSTERED', 'CODEGEN', 'COLLATE', 'COLLECTION', 'COLUMN', 'COLUMNS', 'COMMENT', 'COMMIT', 'COMPACT', 'COMPACTIONS', 'COMPUTE', 'CONCATENATE', 'CONSTRAINT', 'COST', 'CREATE', 'CROSS', 'CUBE', 'CURRENT', 'CURRENT_DATE', 'CURRENT_TIME', 'CURRENT_TIMESTAMP', 'CURRENT_USER', 'DAY', 'DATA', 'DATABASE', DATABASES, 'DBPROPERTIES', 'DEFINED', 'DELETE', 'DELIMITED', 'DESC', 'DESCRIBE', 'DFS', 'DIRECTORIES', 'DIRECTORY', 'DISTINCT', 'DISTRIBUTE', 'DIV', 'DROP', 'ELSE', 'END', 'ESCAPE', 'ESCAPED', 'EXCEPT', 'EXCHANGE', 'EXISTS', 'EXPLAIN', 'EXPORT', 'EXTENDED', 'EXTERNAL', 'EXTRACT', 'FALSE', 'FETCH', 'FIELDS', 'FILTER', 'FILEFORMAT', 'FIRST', 'FOLLOWING', 'FOR', 'FOREIGN', 'FORMAT', 'FORMATTED', 'FROM', 'FULL', 'FUNCTION'
, 'FUNCTIONS', 'GLOBAL', 'GRANT', 'GROUP', 'GROUPING', 'HAVING', 'HOUR', 'IF', 'IGNORE', 'IMPORT', 'IN', 'INDEX', 'INDEXES', 'INNER', 'INPATH', 'INPUTFORMAT', 'INSERT', 'INTERSECT', 'INTERVAL', 'INTO', 'IS', 'ITEMS', 'JOIN', 'KEYS', 'LAST', 'LATERAL', 'LAZY', 'LEADING', 'LEFT', 'LIKE', 'LIMIT', 'LINES', 'LIST', 'LOAD', 'LOCAL', 'LOCATION', 'LOCK', 'LOCKS', 'LOGICAL', 'MACRO', 'MAP', 'MATCHED', 'MERGE', 'MINUTE', 'MONTH', 'MSCK', 'NAMESPACE', 'NAMESPACES', 'NATURAL', 'NO', NOT, 'NULL', 'NULLS', 'OF', 'ON', 'ONLY', 'OPTION', 'OPTIONS', 'OR', 'ORDER', 'OUT', 'OUTER', 'OUTPUTFORMAT', 'OVER', 'OVERLAPS', 'OVERLAY', 'OVERWRITE', 'PARTITION', 'PARTITIONED', 'PARTITIONS', 'PERCENT', 'PIVOT', 'PLACING', 'POSITION', 'PRECEDING', 'PRIMARY', 'PRINCIPALS', 'PROPERTIES', 'PURGE', 'QUERY', 'RANGE', 'RECORDREADER', 'RECORDWRITER', 'RECOVER', 'REDUCE', 'REFERENCES', 'REFRESH', 'RENAME', 'REPAIR', 'REPLACE', 'RESET', 'RESPECT', 'RESTRICT', 'REVOKE', 'RIGHT', RLIKE, 'ROLE', 'ROLES', 'ROLLBACK', 'ROLLU
P', 'ROW', 'ROWS', 'SECOND', 'SCHEMA', 'SELECT', 'SEMI', 'SEPARATED', 'SERDE', 'SERDEPROPERTIES', 'SESSION_USER', 'SET', 'MINUS', 'SETS', 'SHOW', 'SKEWED', 'SOME', 'SORT', 'SORTED', 'START', 'STATISTICS', 'STORED', 'STRATIFY', 'STRUCT', 'SUBSTR', 'SUBSTRING', 'SYNC', 'TABLE', 'TABLES', 'TABLESAMPLE', 'TBLPROPERTIES', TEMPORARY, 'TERMINATED', 'THEN', 'TIME', 'TO', 'TOUCH', 'TRAILING', 'TRANSACTION', 'TRANSACTIONS', 'TRANSFORM', 'TRIM', 'TRUE', 'TRUNCATE', 'TRY_CAST', 'TYPE', 'UNARCHIVE', 'UNBOUNDED', 'UNCACHE', 'UNION', 'UNIQUE', 'UNKNOWN', 'UNLOCK', 'UNSET', 'UPDATE', 'USE', 'USER', 'USING', 'VALUES', 'VIEW', 'VIEWS', 'WHEN', 'WHERE', 'WINDOW', 'WITH', 'YEAR', 'ZONE', IDENTIFIER, BACKQUOTED_IDENTIFIER}(line 1, pos 17)
== SQL ==
create table t4(f.c1 int)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991073111
And indeed postgres supports such columns:
```
alamb=# create table foo ("f1.c1" int);
CREATE TABLE
alamb=# insert into foo values (1);
INSERT 0 1
alamb=# select * from foo;
f1.c1
-------
1
(1 row)
alamb=# select "f1.c1" from foo;
f1.c1
-------
1
(1 row)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] Dandandan commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
Dandandan commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991055788
Usually you can use quotes to achieve this.
In SparkSQL:
```
create table t1(`f.c1` int);
select "f.c1" from t1;
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] liukun4515 commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-992062791
> indeed
agree, and we should add more test case in datafusion if we want to support this style of column identifier.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] houqp commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
houqp commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-992237763
The code was written under the incorrect assumption that column name with dot in it is considered invalid, I should have checked postgres behavior :P
This should be a simple fix if we can distinguish quoted and unquoted column reference in: https://github.com/apache/arrow-datafusion/blob/dc80c11954fa7855f79806cb29c9ea5283a98d01/datafusion/src/logical_plan/expr.rs#L60-L81
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb closed issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
alamb closed issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-990910911
FYI @houqp
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org