Posted to issues@flink.apache.org by "wei (Jira)" <ji...@apache.org> on 2020/08/13 04:53:00 UTC

[jira] [Updated] (FLINK-18915) FIXED_PATH(dummy Hadoop Path) with WriterImpl cause ORC OOM

     [ https://issues.apache.org/jira/browse/FLINK-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wei updated FLINK-18915:
------------------------
    Summary: FIXED_PATH(dummy Hadoop Path) with WriterImpl cause ORC OOM  (was: FIXED_PATH(dummy Hadoop Path) with WriterImpl cause ORC )

> FIXED_PATH(dummy Hadoop Path) with WriterImpl cause ORC OOM
> -----------------------------------------------------------
>
>                 Key: FLINK-18915
>                 URL: https://issues.apache.org/jira/browse/FLINK-18915
>             Project: Flink
>          Issue Type: Bug
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>    Affects Versions: 1.11.0, 1.11.1
>            Reporter: wei
>            Priority: Major
>
> # OrcBulkWriterFactory
> {code:java}
> @Override
> public BulkWriter<T> create(FSDataOutputStream out) throws IOException {
>    OrcFile.WriterOptions opts = getWriterOptions();
>    opts.physicalWriter(new PhysicalWriterImpl(out, opts));
>    return new OrcBulkWriter<>(vectorizer, new WriterImpl(null, FIXED_PATH, opts));
> }{code}
>  
> # MemoryManagerImpl
> {code:java}
> // Registers a writer with the memory manager, keyed by its path.
> public void addWriter(Path path, long requestedAllocation,
>                       Callback callback) throws IOException {
>   checkOwner();
>   WriterInfo oldVal = writerList.get(path);
>   // this should always be null, but we handle the case where the memory
>   // manager wasn't told that a writer wasn't still in use and the task
>   // starts writing to the same path.
>   if (oldVal == null) {
>     oldVal = new WriterInfo(requestedAllocation, callback);
>     writerList.put(path, oldVal);
>     totalAllocation += requestedAllocation;
>   } else {
>     // handle a new writer that is writing to the same path
>     totalAllocation += requestedAllocation - oldVal.allocation;
>     oldVal.allocation = requestedAllocation;
>     oldVal.callback = callback;
>   }
>   updateScale(true);
> }
> {code}
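> A sink task may create multiple BulkWriters. Because OrcBulkWriterFactory passes the same FIXED_PATH to every WriterImpl, each new writer registers with MemoryManagerImpl under the same key and overwrites the previous writer's callback in writerList. The previous writer's WriterImpl#checkMemory is then never called, so the memory manager can no longer ask it to flush, which can lead to the OOM.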
>  
>  
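The overwrite described above can be reproduced in isolation. The following is a minimal sketch, not ORC or Flink code: ToyMemoryManager, notified, and the path strings are hypothetical names introduced only for illustration, and the map is a LinkedHashMap purely to make the output order deterministic. It assumes only what the quoted addWriter shows, namely that callbacks are stored in a map keyed by path.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FixedPathCallbackDemo {

    // Hypothetical stand-in for the memory manager's callback interface.
    interface Callback {
        void checkMemory(double scale);
    }

    // Toy memory manager: like the quoted addWriter, it keeps one callback per path.
    static class ToyMemoryManager {
        private final Map<String, Callback> writerList = new LinkedHashMap<>();

        void addWriter(String path, Callback callback) {
            // Same path => the previous entry's callback is replaced.
            writerList.put(path, callback);
        }

        void notifyWriters(double scale) {
            for (Callback callback : writerList.values()) {
                callback.checkMemory(scale);
            }
        }
    }

    // Registers one writer per given path and returns the writers that were notified.
    static List<String> notified(String... paths) {
        ToyMemoryManager manager = new ToyMemoryManager();
        List<String> called = new ArrayList<>();
        for (int i = 0; i < paths.length; i++) {
            String name = "writer-" + i;
            manager.addWriter(paths[i], scale -> called.add(name));
        }
        manager.notifyWriters(0.5);
        return called;
    }

    public static void main(String[] args) {
        // Two writers sharing one path: only the most recent one is notified.
        System.out.println(notified("fixed", "fixed"));   // prints [writer-1]
        // Two writers with distinct paths: both are notified.
        System.out.println(notified("part-0", "part-1")); // prints [writer-0, writer-1]
    }
}
```

In the shared-path case writer-0 is still open and buffering data, but the manager no longer holds a reference to its callback, which matches the behavior reported for FIXED_PATH.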



--
This message was sent by Atlassian Jira
(v8.3.4#803005)