Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2023/08/08 08:33:00 UTC

[jira] [Resolved] (SPARK-44657) Incorrect limit handling and config parsing in Arrow collect

     [ https://issues.apache.org/jira/browse/SPARK-44657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-44657.
----------------------------------
    Fix Version/s: 3.5.0
                   4.0.0
                   3.4.2
       Resolution: Fixed

Issue resolved by pull request 42321
[https://github.com/apache/spark/pull/42321]

> Incorrect limit handling and config parsing in Arrow collect
> ------------------------------------------------------------
>
>                 Key: SPARK-44657
>                 URL: https://issues.apache.org/jira/browse/SPARK-44657
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect
>    Affects Versions: 3.4.2, 3.4.0, 3.4.1, 3.5.0
>            Reporter: Venkata Sai Akhil Gudesa
>            Assignee: Venkata Sai Akhil Gudesa
>            Priority: Major
>             Fix For: 3.5.0, 4.0.0, 3.4.2
>
>
> In the arrow writer [code|https://github.com/apache/spark/blob/6161bf44f40f8146ea4c115c788fd4eaeb128769/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala#L154-L163], the condition doesn't match the documented behavior regarding "{_}maxBatchSize and maxRecordsPerBatch, respect whatever smaller{_}": because of the _||_ operator, it actually respects whichever conf is "larger" (i.e. less restrictive).
>  
> Further, when the {_}CONNECT_GRPC_ARROW_MAX_BATCH_SIZE{_} conf is read, its value is not converted from MiB to bytes ([example|https://github.com/apache/spark/blob/3e5203c64c06cc8a8560dfa0fb6f52e74589b583/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/execution/SparkConnectPlanExecution.scala#L103]).
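
The shape of both bugs can be sketched as follows. This is an illustrative Scala sketch only, not the actual ArrowConverters code; the names (keepAppendingBuggy, keepAppendingFixed, mibToBytes) are hypothetical.

```scala
// Buggy shape: with ||, appending continues while EITHER limit is
// still unmet, so the batch only stops growing once BOTH limits are
// exceeded -- the larger (less restrictive) limit wins.
def keepAppendingBuggy(batchBytes: Long, maxBytes: Long,
                       batchRows: Long, maxRows: Long): Boolean =
  batchBytes < maxBytes || batchRows < maxRows

// Fixed shape: with &&, appending stops as soon as EITHER limit is
// hit, so the smaller (more restrictive) limit is respected.
def keepAppendingFixed(batchBytes: Long, maxBytes: Long,
                       batchRows: Long, maxRows: Long): Boolean =
  batchBytes < maxBytes && batchRows < maxRows

// The second bug: a conf value expressed in MiB must be scaled to
// bytes before being compared against the batch's byte size.
def mibToBytes(mib: Long): Long = mib * 1024L * 1024L
```

For example, with a 4 MiB size limit and a batch already at 5 MiB but only 10 of 100 allowed rows, the buggy condition keeps appending rows (the row limit is still unmet) while the fixed one stops at the size limit.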



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org