Posted to issues@flink.apache.org by "zhijiang (JIRA)" <ji...@apache.org> on 2019/06/04 02:59:00 UTC
[jira] [Comment Edited] (FLINK-12070) Make blocking result partitions consumable multiple times

    [ https://issues.apache.org/jira/browse/FLINK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855252#comment-16855252 ] 

zhijiang edited comment on FLINK-12070 at 6/4/19 2:58 AM:
----------------------------------------------------------

I agree with Stephan's and Stefan's opinions.

We could add the batch blocking case to the micro benchmarks, so that the performance impact of every commit merged for flink-1.9 can be verified.

I also think there is no need to write to a mapped region as we do now. We cannot actually guarantee that the mmap regions will be consumed by the downstream side immediately after the producer finishes, because that depends on the scheduler's decision and on whether it has enough resources to schedule the consumers.

It is necessary to maintain different ways of reading files. Based on my previous experience with Lucene indexes, Lucene also provides three ways of reading index files:
 * Files.newByteChannel, as the simple way.
 * Java NIO FileChannel.
 * Mmap, for large files on 64-bit systems with more free physical memory available for mmap, as Stefan mentioned.

Then we could compare the behavior of the different ways and also give users more choices.
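
To make the three read paths concrete, here is a minimal sketch using only the JDK. It is not Flink or Lucene code: the class name FileReadPaths and its method names are hypothetical, and it assumes the file fits into a single byte array (and, for mmap, into a single mapping of less than 2 GB).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Flink or Lucene. */
final class FileReadPaths {

    /** 1. Simple way: sequential reads through Files.newByteChannel. */
    static byte[] readWithByteChannel(Path file) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(Math.toIntExact(channel.size()));
            while (buffer.hasRemaining() && channel.read(buffer) >= 0) {
                // keep reading until the buffer is full or end of file is reached
            }
            return Arrays.copyOf(buffer.array(), buffer.position());
        }
    }

    /** 2. NIO FileChannel: positional reads that do not move the channel position. */
    static byte[] readWithFileChannel(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(Math.toIntExact(channel.size()));
            long position = 0;
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break; // end of file
                }
                position += n;
            }
            return Arrays.copyOf(buffer.array(), buffer.position());
        }
    }

    /** 3. Mmap: map the file read-only and copy the bytes out of the mapped region. */
    static byte[] readWithMmap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] bytes = new byte[mapped.remaining()];
            mapped.get(bytes);
            return bytes;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("partition", ".data");
        file.toFile().deleteOnExit();
        Files.write(file, "blocking result partition".getBytes(StandardCharsets.UTF_8));

        // All three paths should return the same bytes.
        System.out.println(new String(readWithByteChannel(file), StandardCharsets.UTF_8));
        System.out.println(new String(readWithFileChannel(file), StandardCharsets.UTF_8));
        System.out.println(new String(readWithMmap(file), StandardCharsets.UTF_8));
    }
}
```

With the three paths behind the same signature like this, a micro benchmark could call each one on the same file and compare them directly.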



> Make blocking result partitions consumable multiple times
> ---------------------------------------------------------
>
>                 Key: FLINK-12070
>                 URL: https://issues.apache.org/jira/browse/FLINK-12070
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.9.0
>            Reporter: Till Rohrmann
>            Assignee: Stephan Ewen
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.9.0
>
>         Attachments: image-2019-04-18-17-38-24-949.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to avoid writing produced results multiple times for multiple consumers, and in order to speed up batch recoveries, we should make blocking result partitions consumable multiple times. At the moment a blocking result partition is released once the consumers have processed all data. Instead, the result partition should be released once the next blocking result has been produced and all consumers of the blocking result partition have terminated. Moreover, blocking results should not hold on to slot resources like network buffers or memory, as is currently the case with {{SpillableSubpartitions}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)