Posted to commits@hudi.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/08/24 07:20:00 UTC

[jira] [Commented] (HUDI-2351) Fix `Task not serializable` due to new APIs in FSUtils for marker mechanism

    [ https://issues.apache.org/jira/browse/HUDI-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403594#comment-17403594 ] 

ASF GitHub Bot commented on HUDI-2351:
--------------------------------------

yihua opened a new pull request #3529:
URL: https://github.com/apache/hudi/pull/3529


   ## What is the purpose of the pull request
   
   This PR extracts the file system (FS) and IO utility methods shared by marker-related operations into `FSUtils` and `FileIOUtils`.
   
   ## Brief change log
   
   - Adds new methods in `FSUtils` and `FileIOUtils`
     - `parallelizeSubPathProcess()`: a general method that processes sub paths in parallel using `HoodieEngineContext`, with custom predicates and custom processing logic for each sub path.
     - `deleteDir()`: deletes a directory in parallel
     - `readAsUTFStringLines()`: reads file content into lines
   - Uses the above methods in marker mechanisms wherever possible.
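
The idea behind the new helpers can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SubPathSketch`, `processSubPaths()` and `readLines()` are hypothetical names, `java.nio.file` stands in for the Hadoop `FileSystem` API, and a parallel stream stands in for `HoodieEngineContext`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names come from Hudi.
final class SubPathSketch {

  // Applies `process` to every direct child of `dir` accepted by `filter`.
  // The children are processed with a parallel stream (an assumption standing
  // in for HoodieEngineContext) and the results are keyed by child file name.
  public static <T> Map<String, T> processSubPaths(
      Path dir, Predicate<Path> filter, Function<Path, T> process) throws IOException {
    List<Path> subPaths;
    try (Stream<Path> children = Files.list(dir)) {
      subPaths = children.filter(filter).collect(Collectors.toList());
    }
    return subPaths.parallelStream()
        .collect(Collectors.toMap(p -> p.getFileName().toString(), process));
  }

  // Reads a file into a list of lines, decoding the bytes as UTF-8.
  public static List<String> readLines(Path file) {
    try {
      return Files.readAllLines(file, StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A caller would pass the predicate and the per-path logic as arguments, for example `SubPathSketch.processSubPaths(dir, p -> p.toString().endsWith(".marker"), SubPathSketch::readLines)`, where the `.marker` suffix is only an example filter.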
   
   ## Verify this pull request
   
   - Unit tests around `WriteMarkers` succeed
   - Manually ran Spark bulk insert jobs with the `direct` and `timeline-server-based` marker types; both succeeded locally.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



> Fix `Task not serializable` due to new APIs in FSUtils for marker mechanism
> ---------------------------------------------------------------------------
>
>                 Key: HUDI-2351
>                 URL: https://issues.apache.org/jira/browse/HUDI-2351
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Ethan Guo
>            Priority: Major
>
> * Fix `Task not serializable` caused by the new APIs in FSUtils for recursive, level-by-level listing (`java.io.NotSerializableException: org.apache.hudi.common.fs.FSUtils$$Lambda$4224/1845791682`)
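
For context on the exception above: Spark serializes task closures with Java serialization, and a Java lambda is serializable only when its target functional interface extends `Serializable`. The sketch below reproduces that behavior in isolation. It is an illustration under assumptions, not the fix applied in Hudi: `LambdaSerializationSketch`, `SerializablePredicate` and `canSerialize()` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from Hudi.
final class LambdaSerializationSketch {

  // Same shape as java.util.function.Predicate, but also Serializable, so
  // lambdas assigned to this type can be written with Java serialization.
  interface SerializablePredicate<T> extends Predicate<T>, Serializable {}

  // Returns true if Java serialization accepts the object and false if it
  // is rejected with NotSerializableException.
  public static boolean canSerialize(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return true;
    } catch (NotSerializableException e) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Target type is a plain Predicate: the lambda class is not Serializable.
    Predicate<String> plain = s -> s.endsWith(".marker");
    // Target type extends Serializable: the compiler emits a serializable lambda.
    SerializablePredicate<String> serializable = s -> s.endsWith(".marker");

    System.out.println("plain: " + canSerialize(plain));
    System.out.println("serializable: " + canSerialize(serializable));
  }
}
```

Two common ways to avoid the error are to declare such parameters with a serializable functional interface, as in the sketch, or to keep the lambda out of the closure that is shipped to executors.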



--
This message was sent by Atlassian Jira
(v8.3.4#803005)