Posted to commits@hudi.apache.org by "Matt Pouttu (Jira)" <ji...@apache.org> on 2021/05/27 14:46:00 UTC

[jira] [Commented] (HUDI-1873) collect() call causing issues with very large upserts

    [ https://issues.apache.org/jira/browse/HUDI-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352539#comment-17352539 ] 

Matt Pouttu commented on HUDI-1873:
-----------------------------------

The pull request for this fix was approved and merged to master.

> collect() call causing issues with very large upserts
> -----------------------------------------------------
>
>                 Key: HUDI-1873
>                 URL: https://issues.apache.org/jira/browse/HUDI-1873
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Spark Integration
>    Affects Versions: 0.7.0, 0.8.0
>         Environment: EMR 5.28 Spark 11
>            Reporter: Matt Pouttu
>            Priority: Major
>              Labels: newbie, pull-request-available, sev:high
>             Fix For: 0.9.0
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> A collect() call causes resource issues with very large upserts, and it is only used for reporting error messages that are already in the Spark task logs. I replaced it with an .isEmpty() call and amended the error message to direct the user to the task logs.
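
The change described above can be illustrated with a small plain-Python analogy. This is only a sketch and is not Hudi or Spark code: WriteStatus, make_statuses, collect_errors and errored_is_empty are hypothetical names invented for the illustration. It contrasts materializing every errored record (the collect()-style approach) with stopping at the first one (the .isEmpty()-style approach).

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class WriteStatus:
    """Hypothetical stand-in for a per-file write result (not Hudi's class)."""
    file_id: str
    has_errors: bool


def make_statuses(n: int, bad_every: int) -> Iterator[WriteStatus]:
    """Lazily yield n statuses; every bad_every-th one (except index 0) has errors."""
    for i in range(n):
        yield WriteStatus(f"file-{i}", has_errors=(i > 0 and i % bad_every == 0))


def collect_errors(items: Iterable[WriteStatus]) -> List[WriteStatus]:
    """collect()-style: builds a list of every errored status in one place."""
    return [s for s in items if s.has_errors]


def errored_is_empty(items: Iterable[WriteStatus]) -> bool:
    """isEmpty()-style: stops at the first errored status and keeps nothing."""
    return next((s for s in items if s.has_errors), None) is None


if __name__ == "__main__":
    if not errored_is_empty(make_statuses(1_000_000, 1000)):
        # Assumed wording; the actual message in the merged fix may differ.
        print("Upsert had errors; check the task logs for details.")
```

The first function holds every failing record in memory at once, which is what becomes expensive when an upsert is very large. The second only answers whether any failure exists, which is all that is needed when the details are already in the task logs.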


