Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/31 23:04:55 UTC

[GitHub] [arrow] westonpace commented on pull request #14491: ARROW-17923: [C++] Consider dictionary arrays for special fragment fields

westonpace commented on PR #14491:
URL: https://github.com/apache/arrow/pull/14491#issuecomment-1297795472

   Yes, it gets broadcast to an array before it is sent out.  Your point is a valid one, though: a dictionary scalar is probably a smell of some kind.  I had not been thinking of it this way initially.  Perhaps a more general fix would be that, whenever we broadcast a scalar of a binary data type, we always broadcast it into a dictionary array.
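
As a rough illustration of the difference between the two broadcast strategies, here is a minimal sketch in plain Python. It does not use Arrow's API; `PlainArray`, `DictionaryArray`, `broadcast_plain`, and `broadcast_dictionary` are hypothetical names invented for this example, and the file name is made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlainArray:
    """Toy stand-in for a string array: one stored value per row."""
    values: List[str]


@dataclass
class DictionaryArray:
    """Toy stand-in for a dictionary array: unique values plus per-row indices."""
    dictionary: List[str]
    indices: List[int]

    def to_plain(self) -> PlainArray:
        # Decode by looking each index up in the dictionary.
        return PlainArray([self.dictionary[i] for i in self.indices])


def broadcast_plain(value: str, length: int) -> PlainArray:
    # Repeats the value once per row.
    return PlainArray([value] * length)


def broadcast_dictionary(value: str, length: int) -> DictionaryArray:
    # Stores the value once; every row points at dictionary slot 0.
    return DictionaryArray(dictionary=[value], indices=[0] * length)


if __name__ == "__main__":
    encoded = broadcast_dictionary("part-0.parquet", 4)
    print(encoded.dictionary, encoded.indices)
    print(encoded.to_plain() == broadcast_plain("part-0.parquet", 4))
```

Both forms decode to the same logical column; the dictionary form just keeps a single copy of the string regardless of the batch length.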
   
   I think it might be worth an attempt to play around with this idea a bit.  I suspect we might run into problems with columns that may or may not be dictionary.  For example, if a column happens to be a partition column, we can represent it as a scalar.  However, we don't necessarily know if a column is a partition column or not when we are constructing the plan, and we might bind kernels thinking the type is a normal type and then suddenly get a dictionary array.
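
The binding concern can be sketched the same way. This is a toy model in plain Python, not Arrow's kernel dispatch; the `KERNELS` registry, `bind`, and the batch layout are all hypothetical and exist only to show a kernel chosen at plan time meeting a differently typed batch at run time.

```python
from typing import Callable, Dict, List, Tuple


def equal_string(column: dict, literal: str) -> List[bool]:
    # Kernel for a plain string column: compare each value directly.
    return [v == literal for v in column["values"]]


def equal_dictionary(column: dict, literal: str) -> List[bool]:
    # Kernel for a dictionary column: decode each index, then compare.
    return [column["dictionary"][i] == literal for i in column["indices"]]


# Hypothetical registry: one kernel per (function name, input type name).
KERNELS: Dict[Tuple[str, str], Callable[[dict, str], List[bool]]] = {
    ("equal", "string"): equal_string,
    ("equal", "dictionary"): equal_dictionary,
}


def bind(function: str, planned_type: str) -> Callable[[dict, str], List[bool]]:
    """Pick a kernel once, at plan time, from the type the plan expects."""
    kernel = KERNELS[(function, planned_type)]

    def run(column: dict, literal: str) -> List[bool]:
        # The bound kernel only understands the type it was bound for.
        if column["type"] != planned_type:
            raise TypeError(
                f"bound for {planned_type!r} but received {column['type']!r}"
            )
        return kernel(column, literal)

    return run


if __name__ == "__main__":
    bound = bind("equal", "string")
    print(bound({"type": "string", "values": ["a", "b"]}, "a"))
    try:
        # A partition column that was broadcast as a dictionary array.
        bound({"type": "dictionary", "dictionary": ["a"], "indices": [0, 0]}, "a")
    except TypeError as err:
        print("mismatch:", err)
```

In this model the mismatch only shows up when the batch arrives, which is the situation described above for columns that may or may not turn out to be partition columns.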
   
   So __filename is a little bit special in that it is the only field that we can easily know will always be a scalar.
   

