Posted to github@arrow.apache.org by "andygrove (via GitHub)" <gi...@apache.org> on 2023/04/28 03:17:54 UTC

[GitHub] [arrow-datafusion] andygrove opened a new issue, #6147: Cannot query CSV file with non-standard extension from DataFusion CLI / Python bindings

andygrove opened a new issue, #6147:
URL: https://github.com/apache/arrow-datafusion/issues/6147

   ### Describe the bug
   
   I have CSV files with extension `.tbl` and it is not possible to query them through SQL. This is quite the barrier to running TPC-H benchmarks with the generated files.
   
   ### To Reproduce
   
   ## Querying a CSV file with `.csv` extension works
   
   ```
   $ echo bob > customer.csv
   $ datafusion-cli
   
   DataFusion CLI v23.0.0
   ❯ CREATE EXTERNAL TABLE customer STORED AS CSV LOCATION 'customer.csv';
   0 rows in set. Query took 0.008 seconds.
   ❯ SELECT * FROM customer;
   +----------+
   | column_1 |
   +----------+
   | bob      |
   +----------+
   1 row in set. Query took 0.001 seconds.
   ```
   
   :smile: 
   
   ## Cannot query CSV file with `.tbl` extension
   
   ```
   $ echo bob > customer.tbl
   $ datafusion-cli
   
   DataFusion CLI v23.0.0
   ❯ CREATE EXTERNAL TABLE customer STORED AS CSV LOCATION 'customer.tbl';
   0 rows in set. Query took 0.001 seconds.
   ❯ SELECT * FROM customer;
   0 rows in set. Query took 0.001 seconds.
   ```
   
   :cry: 
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on issue #6147: Cannot query CSV file with non-standard extension from DataFusion CLI / Python bindings

Posted by "andygrove (via GitHub)" <gi...@apache.org>.
andygrove commented on issue #6147:
URL: https://github.com/apache/arrow-datafusion/issues/6147#issuecomment-1526927743

   @alamb FYI, it would be great if we could resolve this to help with benchmarking. I am not sure how to work around this from Python right now.
   
   Using the low-level Rust APIs, it is possible to work around this by registering a file extension (with the object store, IIRC).
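
Until the extension can be set from SQL or Python, one possible stopgap is to give each `.tbl` file a `.csv`-named twin, since the reproduction above shows that the `.csv` extension is picked up. This workaround is not proposed anywhere in the thread; it is a minimal sketch that uses only the Python standard library, and `expose_tbl_as_csv` is a hypothetical helper name, not a DataFusion API.

```python
import os
import shutil
from pathlib import Path


def expose_tbl_as_csv(src_dir, dst_dir):
    """Give every `*.tbl` file in src_dir a `.csv`-named twin in dst_dir.

    Hypothetical helper for illustration; not part of DataFusion.
    Returns the list of `.csv` paths that were created.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    created = []
    for tbl in sorted(src.glob("*.tbl")):
        target = dst / (tbl.stem + ".csv")
        if target.exists():
            target.unlink()  # replace a stale twin left by an earlier run
        try:
            os.link(tbl, target)  # hard link: no extra disk space used
        except OSError:
            shutil.copyfile(tbl, target)  # fall back, e.g. across filesystems
        created.append(target)
    return created
```

The resulting paths could then be used as the `LOCATION` in `CREATE EXTERNAL TABLE ... STORED AS CSV`, on the assumption that the engine treats them like the `customer.csv` file in the working example. Hard links are tried first so that large generated files are not duplicated on disk.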




[GitHub] [arrow-datafusion] tustvold commented on issue #6147: Cannot query CSV file with non-standard extension from DataFusion CLI / Python bindings

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #6147:
URL: https://github.com/apache/arrow-datafusion/issues/6147#issuecomment-1527516009

   I think this may be a duplicate of #1736




[GitHub] [arrow-datafusion] alamb commented on issue #6147: Cannot query CSV file with non-standard extension from DataFusion CLI / Python bindings

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6147:
URL: https://github.com/apache/arrow-datafusion/issues/6147#issuecomment-1529690523

   Marking this as a good first issue. I agree it is likely the same as https://github.com/apache/arrow-datafusion/issues/1736, though I think this description is cleaner.
   
   To anyone who finds this ticket, here is the suggested fix: https://github.com/apache/arrow-datafusion/issues/1736#issuecomment-1527530466




[GitHub] [arrow-datafusion] alamb closed issue #6147: Cannot query CSV file with non-standard extension from DataFusion CLI / Python bindings

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #6147: Cannot query CSV file with non-standard extension from DataFusion CLI / Python bindings
URL: https://github.com/apache/arrow-datafusion/issues/6147

