Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/05/05 12:56:17 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue, #6251: Support `CREATE TABLE` via SQL for infinite streams

alamb opened a new issue, #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251

   ### Is your feature request related to a problem or challenge?
   
   Right now there is no (good) SQL syntax for creating a table that is marked as an infinite stream. The only ways are to create it programmatically or (after https://github.com/apache/arrow-datafusion/pull/6235 is merged) to use a non-user-friendly option syntax (that is also purposely not documented).
   
   I would like a real, user-friendly way to create such a table.
   
   ### Describe the solution you'd like
   
   Instead of 
   
   ```sql
   CREATE external table t ...
   OPTIONS('infinite_source' 'true')
   LOCATION 'tests/data/empty.csv';
   ```
   
   I would like some sort of explicit syntax like 
   
   ```sql
   CREATE external table t ...
   INFINITE
   LOCATION 'tests/data/empty.csv';
   ```
   
   
   
   ### Describe alternatives you've considered
   
   It would be good to research what other systems that support streaming SQL (like Flink) call this concept and how they declare such tables.
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545695556

   Those are some great suggestions. Thank you @ozankabak 
   
   I agree that 
   
   ```sql
   CREATE UNBOUNDED EXTERNAL TABLE ...
   ```
   
   looks very nice 👍


[GitHub] [arrow-datafusion] aprimadi commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545537576

   Otherwise, I'll just open a pull request to add an `INFINITE` keyword to sqlparser and wait until sqlparser is updated before working on this.


[GitHub] [arrow-datafusion] ozankabak commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams

Posted by "ozankabak (via GitHub)" <gi...@apache.org>.
ozankabak commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545663982

   What do you think about a syntax like `CREATE [UNBOUNDED] EXTERNAL TABLE ...` (or the same with `INFINITE`)? For some reason having a lonely INFINITE or UNBOUNDED keyword bothers me.
   
   To summarize, my order of preference is:
   1. `CREATE [UNBOUNDED/INFINITE] EXTERNAL TABLE ...`
   2. `CREATE EXTERNAL TABLE ... OPTIONS('UNBOUNDED/INFINITE' 'TRUE')`
   3. `CREATE EXTERNAL TABLE ... UNBOUNDED/INFINITE ... LOCATION 'tests/data/empty.csv'`
   
   In terms of the UNBOUNDED vs INFINITE distinction, I think both are fine, and we use them interchangeably in code. However, in the API, we should stick to one and use it uniformly. At the SQL API level, I lean towards UNBOUNDED.
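   
   For concreteness, the three options could be spelled out roughly as below. This is only a sketch of the proposals: it reuses the table name `t` and the location from the issue description, assumes `UNBOUNDED` as the keyword (`INFINITE` would slot into the same positions), and leaves the elided parts (`...`) as they are. None of these forms is confirmed syntax at this point in the discussion.
   
   ```sql
   -- Option 1 (proposed): modifier directly after CREATE
   CREATE UNBOUNDED EXTERNAL TABLE t ...
   LOCATION 'tests/data/empty.csv';
   
   -- Option 2 (proposed): key/value pair in OPTIONS
   CREATE EXTERNAL TABLE t ...
   OPTIONS('UNBOUNDED' 'TRUE')
   LOCATION 'tests/data/empty.csv';
   
   -- Option 3 (proposed): standalone keyword before LOCATION
   CREATE EXTERNAL TABLE t ...
   UNBOUNDED
   LOCATION 'tests/data/empty.csv';
   ```
   
   Option 1 keeps the marker next to the other table modifiers, whereas option 3 is the "lonely keyword" form mentioned above.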


[GitHub] [arrow-datafusion] alamb commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545650626

   @mustafasrepo or @ozankabak do you have any suggestions on naming? The DataFusion code uses the term infinite - is that the right term to expose to users? Or should we use a different term?


[GitHub] [arrow-datafusion] aprimadi commented on issue #6251: Support `CREATE TABLE` via SQL for infinite streams

Posted by "aprimadi (via GitHub)" <gi...@apache.org>.
aprimadi commented on issue #6251:
URL: https://github.com/apache/arrow-datafusion/issues/6251#issuecomment-1545533052

   @alamb what do you think about using the [UNBOUNDED](https://docs.rs/sqlparser/0.33.0/sqlparser/keywords/enum.Keyword.html#variant.UNBOUNDED) keyword?
   
   That's the closest thing I can think of.
   
   I can work on this on the weekend.
   


[GitHub] [arrow-datafusion] mustafasrepo closed issue #6251: Support `CREATE TABLE` via SQL for infinite streams

Posted by "mustafasrepo (via GitHub)" <gi...@apache.org>.
mustafasrepo closed issue #6251: Support `CREATE TABLE` via SQL for infinite streams
URL: https://github.com/apache/arrow-datafusion/issues/6251

