Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/05/05 12:56:17 UTC
[GitHub] [arrow-datafusion] alamb opened a new issue, #6251: Support `CREATE TABLE` via SQL for infinite streams
alamb opened a new issue, #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251
### Is your feature request related to a problem or challenge?
Right now there is no (good) SQL syntax for creating a table that is marked as an infinite stream. The only options are to create it programmatically or (after https://github.com/apache/arrow-datafusion/pull/6235 is merged) to use a non-user-friendly option syntax that is also purposely undocumented.
I would like a real, user-friendly way to create such a table.
### Describe the solution you'd like
Instead of
```sql
CREATE external table t ...
OPTIONS('infinite_source' 'true')
LOCATION 'tests/data/empty.csv';
```
I would like some sort of explicit syntax like
```sql
CREATE external table t ...
INFINITE
LOCATION 'tests/data/empty.csv';
```
### Describe alternatives you've considered
It would be good to research what other systems that support streaming SQL (like Flink) call this concept and how they declare such tables.
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] alamb commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545695556
Those are some great suggestions. Thank you @ozankabak
I agree that
```sql
CREATE UNBOUNDED EXTERNAL TABLE ...
```
Looks very nice 👍
[GitHub] [arrow-datafusion] aprimadi commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams
Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545537576
Otherwise I'll just open a pull request to add an `INFINITE` keyword to sqlparser and wait until sqlparser is updated before working on this.
[GitHub] [arrow-datafusion] ozankabak commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams
Posted by "ozankabak (via GitHub)" <gi...@apache.org>.
ozankabak commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545663982
What do you think about a syntax like `CREATE [UNBOUNDED] EXTERNAL TABLE ...` (or the same with `INFINITE`)? For some reason having a lonely INFINITE or UNBOUNDED keyword bothers me.
To summarize, my order of preference is:
1. `CREATE [UNBOUNDED/INFINITE] EXTERNAL TABLE ...`
2. `CREATE EXTERNAL TABLE ... OPTIONS('UNBOUNDED/INFINITE' 'TRUE')`
3. `CREATE EXTERNAL TABLE ... UNBOUNDED/INFINITE ... LOCATION 'tests/data/empty.csv'`
In terms of the UNBOUNDED vs INFINITE distinction, I think both are fine, and we use them interchangeably in code. However, in the API, we should stick to one and use it uniformly. At the SQL API level, I lean towards UNBOUNDED.
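To make option 1 concrete, here is a minimal sketch of recognizing an optional `UNBOUNDED` keyword between `CREATE` and `EXTERNAL TABLE`. It is a hand-rolled, whitespace-splitting illustration only: it does not use sqlparser or DataFusion, and the names `CreateExternalTablePrefix` and `parse_prefix` are hypothetical.

```rust
/// Hypothetical result of inspecting the start of a CREATE statement.
#[derive(Debug, PartialEq)]
struct CreateExternalTablePrefix {
    /// True when the optional UNBOUNDED keyword follows CREATE.
    unbounded: bool,
}

/// Returns `Some(..)` if `sql` begins with `CREATE [UNBOUNDED] EXTERNAL TABLE`
/// (case-insensitive), and `None` otherwise. The rest of the statement is
/// ignored; a real parser would work on tokens, not whitespace-split words.
fn parse_prefix(sql: &str) -> Option<CreateExternalTablePrefix> {
    let mut words = sql.split_whitespace().map(|w| w.to_ascii_uppercase());
    if words.next()? != "CREATE" {
        return None;
    }
    let mut next = words.next()?;
    // The keyword is optional: consume it only when it is present.
    let unbounded = next == "UNBOUNDED";
    if unbounded {
        next = words.next()?;
    }
    if next != "EXTERNAL" || words.next()? != "TABLE" {
        return None;
    }
    Some(CreateExternalTablePrefix { unbounded })
}
```

Calling `parse_prefix("CREATE UNBOUNDED EXTERNAL TABLE t ...")` yields a value with `unbounded` set to `true`, while the same statement without the keyword yields `false`. Placing the keyword directly after `CREATE` keeps the check local to the statement prefix, which is one possible reason to prefer it over a free-standing keyword later in the statement.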
[GitHub] [arrow-datafusion] alamb commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams
Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545650626
@mustafasrepo or @ozankabak do you have any suggestions on naming? The DataFusion code uses the term "infinite". Is that the right term to expose to users, or should we use a different one?
[GitHub] [arrow-datafusion] aprimadi commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams
Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545533052
@alamb what do you think about using the [UNBOUNDED](https://docs.rs/sqlparser/0.33.0/sqlparser/keywords/enum.Keyword.html#variant.UNBOUNDED) keyword?
That's the closest thing I can think of.
I can work on this on the weekend.
[GitHub] [arrow-datafusion] mustafasrepo closed issue #6251: Support `CREATE TABLE` via SQL for infinite streams
Posted by "mustafasrepo (via GitHub)" <gi...@apache.org>.
mustafasrepo closed issue #6251: Support `CREATE TABLE` via SQL for infinite streams
URL: https://github.com/apache/arrow-datafusion/issues/6251