Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/27 19:56:16 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue, #6780: Blog post with DataFusion Jun - Sep 2023

alamb opened a new issue, #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780

   ### Is your feature request related to a problem or challenge?
   
   
   We have had good luck writing up quarterly updates for DataFusion, most recently:
   1. https://github.com/apache/arrow-datafusion/issues/5812
   2. https://arrow.apache.org/blog/2023/06/24/datafusion-25.0.0/
   
   
   
   ### Describe the solution you'd like
   
   It would be great to write another update covering what has happened in DataFusion over the last few months.
   
   Things I expect will be good to highlight (🤞 ):
   * Improved struct/array support (@izveigor ❤️ )
   * Better group by performance with many distinct groups
   * Better insert performance
   
   Others?
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on issue #6780: Blog post with DataFusion Jun - Sep 2023

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1721096823

   Also spilling group by: https://github.com/apache/arrow-datafusion/pull/7400




[GitHub] [arrow-datafusion] Dandandan commented on issue #6780: Blog post with DataFusion Jun - Sep 2023

Posted by "Dandandan (via GitHub)" <gi...@apache.org>.
Dandandan commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1637974213

   Improved join performance may be another thing to highlight. Maybe we could show a benchmark with improvements (TPC-H, ClickBench, ...) from version 25 -> 28.
   
   - https://github.com/apache/arrow-datafusion/pull/6724
   - https://github.com/apache/arrow-datafusion/pull/6679




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1873311054

   This is going to have to be more like a 2023 retrospective 🤔 




[GitHub] [arrow-datafusion] alamb commented on issue #6780: Blog post with DataFusion Jun - Sep 2023

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1723499231

   Another topic: the new library user guide: https://arrow.apache.org/datafusion/library-user-guide/index.html




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1794603293

   Realistically, I am very tied up with https://github.com/apache/arrow-datafusion/issues/6782 and so won't have time to work on a blog post until after that is submitted (end of Nov). If someone else has time to work on this, it would be very much appreciated.




[GitHub] [arrow-datafusion] alamb commented on issue #6780: Blog post with DataFusion Jun - Sep 2023

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1637760203

   Ideas for major items to include in this post:
   
   1. User defined window functions: https://github.com/apache/arrow-datafusion/issues/6781
   2. Faster aggregate performance: https://github.com/apache/arrow-datafusion/issues/4973
   3. Support for ARRAY / Lists: https://github.com/apache/arrow-datafusion/issues/6863 etc (thanks @izveigor and @jayzhan211)




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1880066796

   Here is a PR with a draft (still needs more work): https://github.com/apache/arrow-site/pull/457




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #6780: Blog post with DataFusion Jun - Sep 2023
URL: https://github.com/apache/arrow-datafusion/issues/6780




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1995165251

   Let's capture other items to highlight here: https://github.com/apache/arrow-datafusion/issues/9602




[GitHub] [arrow-datafusion] alamb commented on issue #6780: Blog post with DataFusion Jun - Sep 2023

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1682273794

   There has been major work on `INSERT` and `COPY` as well, thanks to @devinjdangelo: https://github.com/apache/arrow-datafusion/issues/6569




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1762797752

   FYI this is very much on my list, but I need to focus on the SIGMOD paper for a while. If someone else has the time and inclination to start a PR, I would be most appreciative.




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1900317780

   The blog post is now published! https://arrow.apache.org/blog/2024/01/19/datafusion-34.0.0/




Re: [I] Blog post with DataFusion Jun - Sep 2023 [arrow-datafusion]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #6780:
URL: https://github.com/apache/arrow-datafusion/issues/6780#issuecomment-1876952151

   I am starting to draft this now

