You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "kazuyukitanimura (via GitHub)" <gi...@apache.org> on 2023/03/23 19:21:32 UTC

[GitHub] [arrow-datafusion] kazuyukitanimura opened a new pull request, #5709: Executing LocalLimitExec with no column should not return an Err

kazuyukitanimura opened a new pull request, #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709

   # Which issue does this PR close?
   Closes https://github.com/apache/arrow-datafusion/issues/5701
   
   # Rationale for this change
   Bug fix
   
   # What changes are included in this PR?
   Use `options` to explicitly set `row_count`
   
   # Are these changes tested?
   Yes
   
   # Are there any user-facing changes?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709#discussion_r1146782560


##########
datafusion/core/src/physical_plan/limit.rs:
##########
@@ -462,7 +462,8 @@ impl LimitStream {
                 .iter()
                 .map(|col| col.slice(0, col.len().min(batch_rows)))
                 .collect();
-            Some(RecordBatch::try_new(batch.schema(), limited_columns).unwrap())
+            let options = RecordBatchOptions::new().with_row_count(Option::from(batch_rows));

Review Comment:
   I agree it is nice



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] kazuyukitanimura commented on pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "kazuyukitanimura (via GitHub)" <gi...@apache.org>.
kazuyukitanimura commented on PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709#issuecomment-1481840654

   cc @viirya @sunchao 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on a diff in pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709#discussion_r1146723703


##########
datafusion/core/src/physical_plan/limit.rs:
##########
@@ -462,7 +462,8 @@ impl LimitStream {
                 .iter()
                 .map(|col| col.slice(0, col.len().min(batch_rows)))
                 .collect();
-            Some(RecordBatch::try_new(batch.schema(), limited_columns).unwrap())
+            let options = RecordBatchOptions::new().with_row_count(Option::from(batch_rows));

Review Comment:
   Thanks @kazuyukitanimura 
   extending call `RecordBatch` to be called with options gives more flexibility



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] kazuyukitanimura commented on pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "kazuyukitanimura (via GitHub)" <gi...@apache.org>.
kazuyukitanimura commented on PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709#issuecomment-1483243724

   Can anyone help with merging this unless there are more feedbacks?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] viirya commented on pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "viirya (via GitHub)" <gi...@apache.org>.
viirya commented on PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709#issuecomment-1481995478

   To have end to end reproducer, you need to be able to project empty relation, not sure if DataFusion allows such thing in SQL/DataFrame API.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] kazuyukitanimura commented on pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "kazuyukitanimura (via GitHub)" <gi...@apache.org>.
kazuyukitanimura commented on PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709#issuecomment-1483244643

   Thank you @alamb @sunchao @viirya 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] kazuyukitanimura commented on pull request #5709: Executing LocalLimitExec with no column should not return an Err

Posted by "kazuyukitanimura (via GitHub)" <gi...@apache.org>.
kazuyukitanimura commented on PR #5709:
URL: https://github.com/apache/arrow-datafusion/pull/5709#issuecomment-1481863483

   > Thank you for the contribution @kazuyukitanimura (and the assist @comphead )
   > 
   > Would it be possible to make an end to end reproducer (either with SQL or with the DataFrame API) to show the problem? It would be nice to ensure there aren't other issues with creating zero column batches.
   
   Thanks @alamb that will have to be a separate work from this PR. I have some examples, but I need to think through how to remove some proprietary code from that. Additionally I suspect there will be other execs that have the same issue, so I would like make the reproducer generic. Would you mind approving this PR first given this is pretty much the same as https://github.com/apache/arrow-datafusion/pull/4912
   I will file a ticket for the end. to end reproducer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org