Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/29 06:44:06 UTC

[GitHub] [arrow-datafusion] LiuYuHui opened a new pull request, #2642: Implement DESCRIBE
LiuYuHui opened a new pull request, #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642

   # Which issue does this PR close?
   Closes #2606.
   
   # Rationale for this change
   
   # What changes are included in this PR?
   When the user executes `DESCRIBE <table>`, the output lists each column's name, data type, and nullability:
   ```sql
   ❯ create table foo as select * from (values (1), (3), (2), (10), (8));
   +---------+
   | column1 |
   +---------+
   | 1       |
   | 3       |
   | 2       |
   | 10      |
   | 8       |
   +---------+
   5 rows in set. Query took 0.013 seconds.
   ❯ describe foo;
   +-------------+-----------+-------------+
   | column_name | data_type | is_nullable |
   +-------------+-----------+-------------+
   | column1     | Int64     | YES         |
   +-------------+-----------+-------------+
   1 row in set. Query took 0.016 seconds.
   ```
   
   # Are there any user-facing changes?
   
   # Does this PR break compatibility with Ballista?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2642: Implement DESCRIBE

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642#discussion_r884315275


##########
datafusion/sql/src/planner.rs:
##########
@@ -139,6 +139,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         match statement {
             DFStatement::CreateExternalTable(s) => self.external_table_to_plan(s),
             DFStatement::Statement(s) => self.sql_statement_to_plan(*s),
+            DFStatement::DescribeTable(s) => self.decrible_table_to_plan(s),

Review Comment:
   ```suggestion
               DFStatement::DescribeTable(s) => self.describe_table_to_plan(s),
   ```





[GitHub] [arrow-datafusion] LiuYuHui commented on pull request #2642: Implement DESCRIBE

Posted by GitBox <gi...@apache.org>.
LiuYuHui commented on PR #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642#issuecomment-1141836387

   @alamb, I have fixed the nonexistent table case and added a test. Please take a look.




[GitHub] [arrow-datafusion] alamb commented on pull request #2642: Implement DESCRIBE

Posted by GitBox <gi...@apache.org>.
alamb commented on PR #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642#issuecomment-1141107090

   >  Hi @alamb, thanks for your review. Currently this PR only implements `describe table_name`; `describe table table_name` is not implemented, which is why your test case produced an error. Should I also implement `describe table table_name`?
   
   🤦  -- no sorry @LiuYuHui  I was mistaken on the syntax of `describe` -- no need to implement `describe table foo`
   
   I tried the correct syntax and it works great 👍 
   ```sql
   ❯ describe y;
   +-------------+-----------+-------------+
   | column_name | data_type | is_nullable |
   +-------------+-----------+-------------+
   | column1     | Int64     | YES         |
   | column2     | Int64     | YES         |
   +-------------+-----------+-------------+
   2 rows in set. Query took 0.006 seconds.
   ```
   
   However, I do think this PR needs a test. 
   
   Also, would it be possible to generate an error if the table didn't exist?
   
   For example, MySQL does the following for a nonexistent table:
   
   ```sql
   mysql> describe ff;
   ERROR 1146 (42S02): Table 'foo.ff' doesn't exist
   ```
   
   But this PR returns an empty result (zero rows):
   
   
   ```sql
   ❯ describe table;
   0 rows in set. Query took 0.020 seconds.
   ```
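
A minimal sketch of the error-on-missing-table behaviour requested above. It assumes a hypothetical in-memory catalog (a `HashMap` from table name to column rows) and a hypothetical `DescribeError` type; neither is DataFusion's actual API, and the message wording simply mirrors the MySQL example for illustration.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type for this sketch (not DataFusion's error type).
#[derive(Debug, PartialEq)]
enum DescribeError {
    TableNotFound(String),
}

impl fmt::Display for DescribeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DescribeError::TableNotFound(name) => {
                write!(f, "Table '{}' doesn't exist", name)
            }
        }
    }
}

/// One row of DESCRIBE output: (column_name, data_type, is_nullable).
type ColumnRow = (String, String, String);

/// Looks the table up in the (hypothetical) catalog and returns its column
/// rows, or an error instead of an empty result when the table is unknown.
fn describe(
    catalog: &HashMap<String, Vec<ColumnRow>>,
    table: &str,
) -> Result<Vec<ColumnRow>, DescribeError> {
    catalog
        .get(table)
        .cloned()
        .ok_or_else(|| DescribeError::TableNotFound(table.to_string()))
}
```

The point of the design is that "table has no columns" and "table does not exist" become distinguishable: the former is `Ok(vec![])`, the latter is an `Err`.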
   
   




[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2642: Implement DESCRIBE

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642#discussion_r884315197


##########
datafusion/sql/src/planner.rs:
##########
@@ -353,6 +354,24 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         }
     }
 
+    pub fn decrible_table_to_plan(
+        &self,
+        statement: DescribeTable,
+    ) -> Result<LogicalPlan> {
+        if self.has_table("information_schema", "tables") {
+            let table_name = statement.table_name;
+            let sql = format!("SELECT column_name, data_type, is_nullable \
+                                FROM information_schema.columns WHERE table_name = '{table_name}';");
+            let mut rewrite = DFParser::parse_sql(&sql[..])?;
+            self.statement_to_plan(rewrite.pop_front().unwrap())
+        } else {
+            Err(DataFusionError::Plan(
+                "SHOW TABLES is not supported unless information_schema is enabled"

Review Comment:
   ```suggestion
                   "DESCRIBE TABLE is not supported unless information_schema is enabled"
   ```
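
The hunk above plans `DESCRIBE <table>` by rewriting it into a `SELECT` over `information_schema.columns`. A standalone sketch of just the string-building step follows, using a hypothetical `describe_query` helper (not a DataFusion function). As an assumption that goes beyond the PR, it doubles single quotes in the table name so the name cannot end the SQL string literal early.

```rust
/// Builds the `information_schema.columns` query that a `DESCRIBE <table>`
/// statement could be rewritten into.
///
/// `describe_query` is a hypothetical helper for illustration only.
fn describe_query(table_name: &str) -> String {
    // Escape embedded single quotes by doubling them (standard SQL escaping).
    let escaped = table_name.replace('\'', "''");
    format!(
        "SELECT column_name, data_type, is_nullable \
         FROM information_schema.columns WHERE table_name = '{}';",
        escaped
    )
}
```

For a table named `foo`, this yields the same query text as the `format!` call in the hunk; the only difference in behaviour is for names containing a single quote.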



##########
datafusion/sql/src/planner.rs:
##########
@@ -353,6 +354,24 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         }
     }
 
+    pub fn decrible_table_to_plan(

Review Comment:
   ```suggestion
       pub fn decribe_table_to_plan(
   ```





[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2642: Implement DESCRIBE

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642#discussion_r884315207


##########
datafusion/sql/src/planner.rs:
##########
@@ -353,6 +354,24 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         }
     }
 
+    pub fn decrible_table_to_plan(

Review Comment:
   ```suggestion
       pub fn describe_table_to_plan(
   ```





[GitHub] [arrow-datafusion] LiuYuHui commented on pull request #2642: Implement DESCRIBE

Posted by GitBox <gi...@apache.org>.
LiuYuHui commented on PR #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642#issuecomment-1141021530

   Hi @alamb, thanks for your review. Currently this PR only implements `describe table_name`; `describe table table_name` is not implemented, which is why your test case produced an error. Should I also implement `describe table table_name`?




[GitHub] [arrow-datafusion] alamb merged pull request #2642: Implement DESCRIBE

Posted by GitBox <gi...@apache.org>.
alamb merged PR #2642:
URL: https://github.com/apache/arrow-datafusion/pull/2642

