Posted to issues@ignite.apache.org by "Alexey Goncharuk (JIRA)" <ji...@apache.org> on 2018/08/15 13:55:00 UTC
[jira] [Updated] (IGNITE-9275) Introduce mechanism to fetch partition file via a p2p protocol

     [ https://issues.apache.org/jira/browse/IGNITE-9275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Goncharuk updated IGNITE-9275:
-------------------------------------
    Description: 
As a first step toward estimating how much faster file-based rebalancing could be, I suggest implementing a simple partition fetch procedure via a communication SPI extension:
1) Node A sends a partition fetch request to node B 
2) Node B starts a checkpoint and creates a local copy of the partition. Note that other checkpoints may run concurrently while the partition is being copied; this must be handled properly
3) Node B establishes a new TCP connection on the TCP communication port (handshake and verification are assumed)
4) Node B calls transferFile (or a native analogue; investigation needed) to send the partition file in the most efficient way
5) Node A writes the file to a specified location on the local file system
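
Steps 3-5 can be sketched in plain Java NIO as below. This is only an illustration under assumptions, not Ignite code: it assumes that "transferFile" refers to something like FileChannel.transferTo, the class and method names (PartitionFetchSketch, sendFile, receiveFile, roundTrip) are hypothetical, and the fetch request, checkpointing, handshake and verification are all left out. The receiver simply reads until the sender closes the connection.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch; none of these names come from Ignite.
public class PartitionFetchSketch {
    /** Sender side (node B): streams the partition copy into an open connection. */
    static long sendFile(Path file, SocketChannel out) throws IOException {
        try (FileChannel src = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = src.size();
            long pos = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (pos < size)
                pos += src.transferTo(pos, size - pos, out);
            return pos;
        }
    }

    /** Receiver side (node A): writes everything read from the connection to a local file. */
    static long receiveFile(SocketChannel in, Path target) throws IOException {
        try (FileChannel dst = FileChannel.open(target, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
            long total = 0;
            // read() returns -1 once the sender has closed its end of the connection.
            while (in.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining())
                    total += dst.write(buf);
                buf.clear();
            }
            return total;
        }
    }

    /** Loopback demo: a thread plays node B (connects and sends), the caller plays node A. */
    static boolean roundTrip(byte[] payload) throws Exception {
        Path src = Files.createTempFile("part-src", ".bin");
        Path dst = Files.createTempFile("part-dst", ".bin");
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            Files.write(src, payload);
            Thread sender = new Thread(() -> {
                try (SocketChannel ch = SocketChannel.open(server.getLocalAddress())) {
                    sendFile(src, ch);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            sender.start();
            try (SocketChannel ch = server.accept()) {
                receiveFile(ch, dst);
            }
            sender.join();
            return Arrays.equals(payload, Files.readAllBytes(dst));
        } finally {
            Files.deleteIfExists(src);
            Files.deleteIfExists(dst);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[3 * 1024 * 1024 + 17];
        new Random(42).nextBytes(payload);
        System.out.println("copy matches: " + roundTrip(payload));
    }
}
```

On platforms that support it, transferTo can hand the copy to the operating system instead of moving the bytes through a user-space buffer, which is the reason it is a natural candidate for step 4. Whether that actually happens for a socket target depends on the JVM and OS, so it would need to be measured.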

Once this mechanism is implemented, we need to hack the rebalance code to use the partition fetch logic instead of regular rebalancing, in order to measure:
1) How much faster (or slower) the new approach performs
2) How it affects the concurrent transactions in the grid

> Introduce mechanism to fetch partition file via a p2p protocol
> --------------------------------------------------------------
>
>                 Key: IGNITE-9275
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9275
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Alexey Goncharuk
>            Priority: Major


