You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/10 11:55:49 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

alamb opened a new issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432


   **Describe the bug**
   Field names containing period such as `f.c1` cannot be named in SQL query
   
   **To Reproduce**
   Adapted from https://github.com/influxdata/influxdb_iox/issues/3351 from @jacobmarble
   
   
   ```shell
   echo "1" > /tmp/foo.csv
   cargo run --bin datafusion-cli
   ```
   
   
   ```sql
   ❯ create external table foo("f.c1" int) stored as CSV location '/tmp/foo.csv';
   0 rows in set. Query took 0.001 seconds.
   ❯ select * from foo;
   +------+
   | f.c1 |
   +------+
   | 1    |
   +------+
   1 row in set. Query took 0.009 seconds.
   ❯ select f.c1 from foo;
   Plan("No field named 'f.c1'. Valid fields are 'foo.f.c1'.")
   ❯ select "f.c1" from foo;
   Plan("No field named 'f.c1'. Valid fields are 'foo.f.c1'.")
   ❯ select "foo.f.c1" from foo;
   Plan("Invalid identifier '#foo.f.c1' for schema foo.f.c1")
   ```
   
   
   **Expected behavior**
   
   I expect the query `select "f.c1" from foo;` to return `1` (note that "f.c1" is quoted)
   
   **Additional context**
   Found during testing of IOx https://github.com/influxdata/influxdb_iox/issues/3351


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] liukun4515 edited a comment on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
liukun4515 edited a comment on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991051157


   In my opinion, this `f1.c1` style is illegal for SQL syntax.
   For example in Mysql and PG
   ```
   mysql>  create table t5("f.c1" int);
   ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"f.c1" int)' at line 1
   
   
   postgres=# create table t5("f.c1" int);
   CREATE TABLE
   ```
   In the SparkSQL 
   ```
   spark-sql> create table t5("f.c1" int);
   Error in query:
   no viable alternative at input '("f.c1"'(line 1, pos 16)
   
   == SQL ==
   create table t5("f.c1" int)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-994115896


   Proposed fix in https://github.com/apache/arrow-datafusion/pull/1449


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-992373241


   Thanks @houqp  -- I'll take a shot at this later today


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] liukun4515 commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991051157


   In my opinion, this `f1.c1` style is illegal for SQL syntax.
   For example in Mysql and PG
   ```
   mysql> create table t4(f.c1 int);
   ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.c1 int)' at line 1
   
   postgres=# create table t4(f.c1 int);
   ERROR:  syntax error at or near "."
   LINE 1: create table t4(f.c1 int);
   ```
   In the SparkSQL 
   ```
   spark-sql> create table t4(f.c1 int);
   Error in query:
   extraneous input '.' expecting {'ADD', 'AFTER', 'ALL', 'ALTER', 'ANALYZE', 'AND', 'ANTI', 'ANY', 'ARCHIVE', 'ARRAY', 'AS', 'ASC', 'AT', 'AUTHORIZATION', 'BETWEEN', 'BOTH', 'BUCKET', 'BUCKETS', 'BY', 'CACHE', 'CASCADE', 'CASE', 'CAST', 'CHANGE', 'CHECK', 'CLEAR', 'CLUSTER', 'CLUSTERED', 'CODEGEN', 'COLLATE', 'COLLECTION', 'COLUMN', 'COLUMNS', 'COMMENT', 'COMMIT', 'COMPACT', 'COMPACTIONS', 'COMPUTE', 'CONCATENATE', 'CONSTRAINT', 'COST', 'CREATE', 'CROSS', 'CUBE', 'CURRENT', 'CURRENT_DATE', 'CURRENT_TIME', 'CURRENT_TIMESTAMP', 'CURRENT_USER', 'DAY', 'DATA', 'DATABASE', DATABASES, 'DBPROPERTIES', 'DEFINED', 'DELETE', 'DELIMITED', 'DESC', 'DESCRIBE', 'DFS', 'DIRECTORIES', 'DIRECTORY', 'DISTINCT', 'DISTRIBUTE', 'DIV', 'DROP', 'ELSE', 'END', 'ESCAPE', 'ESCAPED', 'EXCEPT', 'EXCHANGE', 'EXISTS', 'EXPLAIN', 'EXPORT', 'EXTENDED', 'EXTERNAL', 'EXTRACT', 'FALSE', 'FETCH', 'FIELDS', 'FILTER', 'FILEFORMAT', 'FIRST', 'FOLLOWING', 'FOR', 'FOREIGN', 'FORMAT', 'FORMATTED', 'FROM', 'FULL', 'FUNCTION'
 , 'FUNCTIONS', 'GLOBAL', 'GRANT', 'GROUP', 'GROUPING', 'HAVING', 'HOUR', 'IF', 'IGNORE', 'IMPORT', 'IN', 'INDEX', 'INDEXES', 'INNER', 'INPATH', 'INPUTFORMAT', 'INSERT', 'INTERSECT', 'INTERVAL', 'INTO', 'IS', 'ITEMS', 'JOIN', 'KEYS', 'LAST', 'LATERAL', 'LAZY', 'LEADING', 'LEFT', 'LIKE', 'LIMIT', 'LINES', 'LIST', 'LOAD', 'LOCAL', 'LOCATION', 'LOCK', 'LOCKS', 'LOGICAL', 'MACRO', 'MAP', 'MATCHED', 'MERGE', 'MINUTE', 'MONTH', 'MSCK', 'NAMESPACE', 'NAMESPACES', 'NATURAL', 'NO', NOT, 'NULL', 'NULLS', 'OF', 'ON', 'ONLY', 'OPTION', 'OPTIONS', 'OR', 'ORDER', 'OUT', 'OUTER', 'OUTPUTFORMAT', 'OVER', 'OVERLAPS', 'OVERLAY', 'OVERWRITE', 'PARTITION', 'PARTITIONED', 'PARTITIONS', 'PERCENT', 'PIVOT', 'PLACING', 'POSITION', 'PRECEDING', 'PRIMARY', 'PRINCIPALS', 'PROPERTIES', 'PURGE', 'QUERY', 'RANGE', 'RECORDREADER', 'RECORDWRITER', 'RECOVER', 'REDUCE', 'REFERENCES', 'REFRESH', 'RENAME', 'REPAIR', 'REPLACE', 'RESET', 'RESPECT', 'RESTRICT', 'REVOKE', 'RIGHT', RLIKE, 'ROLE', 'ROLES', 'ROLLBACK', 'ROLLU
 P', 'ROW', 'ROWS', 'SECOND', 'SCHEMA', 'SELECT', 'SEMI', 'SEPARATED', 'SERDE', 'SERDEPROPERTIES', 'SESSION_USER', 'SET', 'MINUS', 'SETS', 'SHOW', 'SKEWED', 'SOME', 'SORT', 'SORTED', 'START', 'STATISTICS', 'STORED', 'STRATIFY', 'STRUCT', 'SUBSTR', 'SUBSTRING', 'SYNC', 'TABLE', 'TABLES', 'TABLESAMPLE', 'TBLPROPERTIES', TEMPORARY, 'TERMINATED', 'THEN', 'TIME', 'TO', 'TOUCH', 'TRAILING', 'TRANSACTION', 'TRANSACTIONS', 'TRANSFORM', 'TRIM', 'TRUE', 'TRUNCATE', 'TRY_CAST', 'TYPE', 'UNARCHIVE', 'UNBOUNDED', 'UNCACHE', 'UNION', 'UNIQUE', 'UNKNOWN', 'UNLOCK', 'UNSET', 'UPDATE', 'USE', 'USER', 'USING', 'VALUES', 'VIEW', 'VIEWS', 'WHEN', 'WHERE', 'WINDOW', 'WITH', 'YEAR', 'ZONE', IDENTIFIER, BACKQUOTED_IDENTIFIER}(line 1, pos 17)
   
   == SQL ==
   create table t4(f.c1 int)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991073111


   And indeed postgres supports such columns:
   
   ```
   alamb=# create table foo ("f1.c1" int);
   CREATE TABLE
   alamb=# insert into foo values (1);
   INSERT 0 1
   alamb=# select * from foo;
    f1.c1 
   -------
        1
   (1 row)
   
   alamb=# select "f1.c1" from foo;
    f1.c1 
   -------
        1
   (1 row)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
Dandandan commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991055788


   Usually you can use quotes to achieve this.
   
   In SparkSQL:
   
   ```
   create table t1(`f.c1` int);
   select "f.c1" from t1;
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] liukun4515 commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
liukun4515 commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-992062791


   > indeed
   
   agree, and we should add more test case in datafusion if we want to support this style  of  column identifier.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] houqp commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
houqp commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-992237763


   The code was written under the incorrect assumption that column name with dot in it is considered invalid, I should have checked postgres behavior :P
   
   This should be a simple fix if we can distinguish quoted and unquoted column reference in: https://github.com/apache/arrow-datafusion/blob/dc80c11954fa7855f79806cb29c9ea5283a98d01/datafusion/src/logical_plan/expr.rs#L60-L81


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb closed issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
alamb closed issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1432:
URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-990910911


   FYI @houqp 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org