Posted to issues@ignite.apache.org by "Ivan Veselovsky (JIRA)" <ji...@apache.org> on 2016/12/13 11:20:58 UTC

[jira] [Comment Edited] (IGNITE-3877) Clarify if IgfsFile -> FileStatus conversion should treat groupBlockSize as blockSize

    [ https://issues.apache.org/jira/browse/IGNITE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15744890#comment-15744890 ] 

Ivan Veselovsky edited comment on IGNITE-3877 at 12/13/16 11:20 AM:
--------------------------------------------------------------------

The problem is that the notions of GroupBlockSize and BlockSize are being mixed up.
As the comment on class {code}org.apache.ignite.igfs.IgfsGroupDataBlocksKeyMapper{code} states, when {code}org.apache.hadoop.fs.FileSystem{code} sits on top of IGFS, the Hadoop Fs block size equals the underlying IGFS *group* block size (which, in turn, is the block size multiplied by groupSize).
When we have IGFS over a Hadoop FileSystem, we also use the group block size as the block size of the created secondary Fs files (see {code}org.apache.ignite.internal.processors.igfs.IgfsSecondaryFileSystemCreateContext#create{code}).
So when reading files we should follow the same logic: IGFS *group* block size == Hadoop block size, and the IGFS block size is just the configuration value ({code}org.apache.ignite.configuration.FileSystemConfiguration#getBlockSize{code}).
This makes the logic consistent and fixes the assertion issue described above.
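
To make the arithmetic concrete, here is a minimal standalone sketch of the relation described above. It does not use Ignite or Hadoop classes: the class name BlockSizeSketch, its methods, and the sample values in main are hypothetical and chosen only for illustration.

```java
// Hypothetical stand-in, not Ignite code: it models only the two sizes discussed above.
final class BlockSizeSketch {
    /** IGFS group block size: the block size multiplied by the group size. */
    static long groupBlockSize(int blockSize, int groupSize) {
        return (long) blockSize * groupSize; // widen first to avoid int overflow
    }

    /** Block size reported to Hadoop under the rule "group block size == Hadoop block size". */
    static long hadoopBlockSize(int blockSize, int groupSize) {
        return groupBlockSize(blockSize, groupSize);
    }

    /** Number of Hadoop-level blocks that a file of the given length occupies. */
    static long hadoopBlockCount(long fileLength, int blockSize, int groupSize) {
        long hadoopBlock = hadoopBlockSize(blockSize, groupSize);
        return (fileLength + hadoopBlock - 1) / hadoopBlock; // ceiling division
    }

    public static void main(String[] args) {
        int blockSize = 64 * 1024;              // assumed IGFS block size, illustration only
        int groupSize = 512;                    // assumed group size, illustration only
        long fileLength = 100L * 1024 * 1024;   // assumed file length of 100 MiB

        System.out.println("IGFS block size:   " + blockSize);
        System.out.println("Hadoop block size: " + hadoopBlockSize(blockSize, groupSize));
        System.out.println("Hadoop blocks:     " + hadoopBlockCount(fileLength, blockSize, groupSize));
    }
}
```

With these assumed values the block size seen by Hadoop is 512 times larger than the IGFS block size, so anything derived from the Hadoop block size (such as split calculation) changes a lot depending on which of the two values the conversion reports.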





> Clarify if IgfsFile -> FileStatus conversion should treat groupBlockSize as blockSize
> -------------------------------------------------------------------------------------
>
>                 Key: IGNITE-3877
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3877
>             Project: Ignite
>          Issue Type: Bug
>          Components: IGFS
>    Affects Versions: 1.6
>            Reporter: Ivan Veselovsky
>            Assignee: Vladimir Ozerov
>             Fix For: 2.0
>
>
> While repairing the Metrics tests, test org.apache.ignite.igfs.Hadoop1DualAbstractTest#testMetricsBlock revealed the following problem:
> The org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem#convert(org.apache.ignite.igfs.IgfsFile) method treats groupBlockSize as blockSize for the Hadoop FileStatus. groupBlockSize can be several times larger than blockSize, so the blockSize in the status differs from that in the original IgfsFile.
> Changing file.groupBlockSize() to file.blockSize() fixes the problem in the metrics tests, but creates problems in the Hadoop tests that depend on split calculation, since split calculation is based on block sizes.
> Need to:
> 1) clarify whether the treatment of groupBlockSize was intentional;
> 2) fix either the metrics tests or the Hadoop tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)