You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "berkaysynnada (via GitHub)" <gi...@apache.org> on 2023/05/03 21:15:51 UTC

[GitHub] [arrow-datafusion] berkaysynnada opened a new pull request, #6219: Select Into support

berkaysynnada opened a new pull request, #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219

   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #6148.
   
   # Rationale for this change
   
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   We can support the queries in the form of
   
   ```
   SELECT ts
   INTO table2
   FROM table1
   ```
   
   # What changes are included in this PR?
   
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   `select_to_plan()` function that builds the select plans is extended by one step. At the end of the build, since the `CreateMemoryTable` plans are placed at the top of the other plans, `into` option is checked. If it exists, CreateMemoryTable logical plan is added.
   
   SELECT INTO statement does not copy constraints such as primary key; therefore, primary_key input is given empty to the constructor. SELECT INTO also should give error in case of creating a table having the same name with an existing table, so if_not_exists and or_replace option is given false.
   
   # Are these changes tested?
   
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   Yes
   
   # Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] berkaysynnada commented on a diff in pull request #6219: Select Into support

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada commented on code in PR #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219#discussion_r1185277821


##########
datafusion/core/tests/sqllogictests/test_files/ddl.slt:
##########
@@ -476,6 +511,15 @@ SELECT * FROM abc
 1 3 3
 5 5 6
 
+statement ok
+SELECT * INTO fgh FROM (VALUES (1,2,3), (4,5,6));

Review Comment:
   You are right, this test does the same. Thank you 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] berkaysynnada commented on a diff in pull request #6219: Select Into support

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada commented on code in PR #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219#discussion_r1185275251


##########
datafusion/core/tests/sqllogictests/test_files/ddl.slt:
##########
@@ -296,9 +320,20 @@ SELECT * FROM my_table;
 ----
 1 2 hello
 
+# select_into_with_schema_as_values
+statement ok
+SELECT * INTO new_table FROM my_table

Review Comment:
   Actually, my intention was to show that the datatypes are preserved. I renamed the test, but we can also delete it if you wish.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] ozankabak commented on pull request #6219: Select Into support

Posted by "ozankabak (via GitHub)" <gi...@apache.org>.
ozankabak commented on PR #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219#issuecomment-1535155293

   @berkaysynnada, seems like we have a minor Cargo.lock issue making CI fail


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] mustafasrepo merged pull request #6219: Select Into support

Posted by "mustafasrepo (via GitHub)" <gi...@apache.org>.
mustafasrepo merged PR #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6219: Select Into support

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219#discussion_r1185058357


##########
datafusion/sql/src/select.rs:
##########
@@ -221,7 +219,23 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
                 .collect::<Result<Vec<_>>>()?;
             LogicalPlanBuilder::from(plan)
                 .repartition(Partitioning::DistributeBy(x))?
-                .build()
+                .build()?
+        } else {
+            plan
+        };
+
+        if let Some(select_into) = select.into {

Review Comment:
   I verified that this is consistent with postgres:
   
   ```sql
   
   postgres=# select * from test;
    column1 | column2
   ---------+---------
          1 |      10
          2 |      20
          3 |      30
   (3 rows)
   
   postgres=# select * into test2 from test;
   SELECT 3
   postgres=# select * into test2 from test;
   ERROR:  relation "test2" already exists
   postgres=# select * from test2;
    column1 | column2
   ---------+---------
          1 |      10
          2 |      20
          3 |      30
   (3 rows)
   ```



##########
datafusion/core/tests/sqllogictests/test_files/ddl.slt:
##########
@@ -296,9 +320,20 @@ SELECT * FROM my_table;
 ----
 1 2 hello
 
+# select_into_with_schema_as_values
+statement ok
+SELECT * INTO new_table FROM my_table

Review Comment:
   I don't understand how this test is different than the ones above. Is the intent to insert into a table in a schema? Like `SELECT * into my_schema.new_table from my_table`?



##########
datafusion/core/tests/sqllogictests/test_files/ddl.slt:
##########
@@ -476,6 +511,15 @@ SELECT * FROM abc
 1 3 3
 5 5 6
 
+statement ok
+SELECT * INTO fgh FROM (VALUES (1,2,3), (4,5,6));

Review Comment:
   likewise it isn't clear to me what other case this is testing



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6219: Select Into support

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219#discussion_r1185066134


##########
datafusion/core/tests/sqllogictests/test_files/ddl.slt:
##########
@@ -28,6 +28,18 @@ select * from users;
 1 2
 2 3
 
+statement ok
+drop table users
+
+statement ok

Review Comment:
   Can you also please add a negative test (aka that calling `select * INTO users ...` will error if `users` already exists?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6219: Select Into support

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #6219:
URL: https://github.com/apache/arrow-datafusion/pull/6219#discussion_r1185378729


##########
datafusion/core/tests/sqllogictests/test_files/aggregate.slt:
##########
@@ -371,12 +371,29 @@ select host, median(usage) from cpu group by host;
 host0 90.1
 host1 90.3
 
+statement ok
+drop table cpu;
+
+# this test is to show create table as and select into works in the same way
+statement ok

Review Comment:
   👍 



##########
datafusion/core/tests/sqllogictests/test_files/ddl.slt:
##########
@@ -296,9 +320,20 @@ SELECT * FROM my_table;
 ----
 1 2 hello
 
+# select_into_with_schema_as_values
+statement ok
+SELECT * INTO new_table FROM my_table

Review Comment:
   I think the comments have made the intent clear -- thank you



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org