You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/06 20:50:55 UTC

[GitHub] [arrow-datafusion] Dandandan opened a new pull request #280: Improve column naming by aliasing with expression name

Dandandan opened a new pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280


   # Which issue does this PR close?
   
   
   Closes #279
   
    # Rationale for this change
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   
   # What changes are included in this PR?
   
   * Wrap unnamed expressions in a `alias`.
   * Don't display alias in logical plan (as it otherwise will be added to every expression) - still not so sure about this one.
   
   # Are there any user-facing changes?
   
   Yes, column names will be different.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on a change in pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280#discussion_r628526892



##########
File path: datafusion/src/sql/planner.rs
##########
@@ -1517,7 +1520,7 @@ mod tests {
     fn select_no_relation() {
         quick_test(
             "SELECT 1",
-            "Projection: Int64(1)\
+            "Projection: Int64(1) AS 1\

Review comment:
       This seems like an improvement to me.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280#issuecomment-834254447


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/280?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#280](https://codecov.io/gh/apache/arrow-datafusion/pull/280?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8522633) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/b8805d4f44d4da8f16069d93ab342dc6f082ca07?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b8805d4) will **decrease** coverage by `0.05%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/280/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/280?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #280      +/-   ##
   ==========================================
   - Coverage   76.19%   76.13%   -0.06%     
   ==========================================
     Files         140      140              
     Lines       23595    23607      +12     
   ==========================================
   - Hits        17978    17974       -4     
   - Misses       5617     5633      +16     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/280?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [datafusion/src/execution/context.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/280/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvZXhlY3V0aW9uL2NvbnRleHQucnM=) | `92.66% <100.00%> (+0.02%)` | :arrow_up: |
   | [datafusion/src/execution/dataframe\_impl.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/280/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvZXhlY3V0aW9uL2RhdGFmcmFtZV9pbXBsLnJz) | `89.10% <100.00%> (+0.21%)` | :arrow_up: |
   | [datafusion/src/logical\_plan/expr.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/280/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvbG9naWNhbF9wbGFuL2V4cHIucnM=) | `81.68% <100.00%> (-2.72%)` | :arrow_down: |
   | [datafusion/src/sql/planner.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/280/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvc3FsL3BsYW5uZXIucnM=) | `83.65% <100.00%> (+0.02%)` | :arrow_up: |
   | [datafusion/src/sql/utils.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/280/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-ZGF0YWZ1c2lvbi9zcmMvc3FsL3V0aWxzLnJz) | `48.12% <0.00%> (-0.63%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/280?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/280?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [b8805d4...8522633](https://codecov.io/gh/apache/arrow-datafusion/pull/280?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280#issuecomment-834779967


   This is an interesting idea -- and seems similar to what @wqc200  was working on in https://github.com/apache/arrow/pull/9600  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan closed pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
Dandandan closed pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] houqp commented on pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
houqp commented on pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280#issuecomment-835863662


   > I think we should make sure unnamed expressions can only be referenced by index, or with * and not by a generated name.
   
   +1, I did [some research](https://docs.google.com/document/d/1uviWavwEGD3qxwMk2AGkOgp6ENrvKGiMWQhHNbqPwhg/edit#heading=h.f80y77vmglud) on this as well, postgresql even name most of the unnamed expression `?column?` in their output. Users should either use index or manually create aliases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280#issuecomment-835267036


   That makes sense, thank you for the clarification
   
   On Fri, May 7, 2021 at 5:53 PM Daniël Heres ***@***.***>
   wrote:
   
   > I don't really understand the concerns about column expressions and what
   > might be bad here. The alternate is probably to keep a "name" field on all
   > Exprs which was the approach in apache/arrow#9600
   > <https://github.com/apache/arrow/pull/9600>
   >
   > Happy to reconsider! My main concern is that introducing aliases for
   > unnamed expressions will cause other "strange" issues like using the name
   > of the expression, like:
   > select "1+1" from (select 1+1) t
   >
   > Which should fail but would return 2?
   >
   > I think we should make sure unnamed expressions can only be referenced by
   > index, or with * and not by a generated name.
   >
   > —
   > You are receiving this because your review was requested.
   > Reply to this email directly, view it on GitHub
   > <https://github.com/apache/arrow-datafusion/pull/280#issuecomment-834805975>,
   > or unsubscribe
   > <https://github.com/notifications/unsubscribe-auth/AADXZMJ5SN5I2FDE3TBHR2TTMRON7ANCNFSM44H4TNUA>
   > .
   >
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280#issuecomment-834752101


   I will do some more research into what's needed for column names, I think we have to avoid reusing alias for this and see if we should support unnamed columns better.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Dandandan commented on pull request #280: Improve column naming by aliasing with expression name

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #280:
URL: https://github.com/apache/arrow-datafusion/pull/280#issuecomment-834805975


   > I don't really understand the concerns about column expressions and what might be bad here. The alternate is probably to keep a "name" field on all `Exprs` which was the approach in [apache/arrow#9600](https://github.com/apache/arrow/pull/9600)
   
   Happy to reconsider! My main concern is that introducing aliases for unnamed expressions will cause other "strange" issues like using the name of the expression, like:
   `select "1+1" from (select 1+1) t`
   
   Which should fail but would return 2?
   
   I think we should make sure unnamed expressions can only be referenced by index, or with `*` and not by a generated name.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org