Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2006/02/17 23:43:51 UTC

[jira] Created: (HADOOP-40) bufferSize argument is ignored in FileSystem.create(File, boolean, int)

bufferSize argument is ignored in FileSystem.create(File, boolean, int)
-----------------------------------------------------------------------

         Key: HADOOP-40
         URL: http://issues.apache.org/jira/browse/HADOOP-40
     Project: Hadoop
        Type: Bug
  Components: fs  
    Reporter: Konstantin Shvachko
    Priority: Minor


org.apache.hadoop.fs.FileSystem.create(File f, boolean overwrite, int bufferSize)

ignores its bufferSize parameter.
It passes the internal configuration, which includes a buffer size, further down, but never the parameter value.
This works fine within the file system, since everything that calls create extracts the buffer size from the same config.
MapReduce, however, is probably affected; see

org.apache.hadoop.io.SequenceFile.Sorter.MergeQueue.MergeQueue(int size, String outName, boolean done)

The attached patch would fix it.
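
The shape of the problem described above can be sketched roughly as follows. This is a minimal illustration, not the actual Hadoop source: the classes SketchConf and SketchFileSystem, the key "sketch.buffer.size", and the lastBufferSizeUsed field are all hypothetical names invented for the example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object (not Hadoop's own class).
class SketchConf {
    private final Map<String, Integer> values = new HashMap<>();

    void setInt(String key, int value) {
        values.put(key, value);
    }

    int getInt(String key, int defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}

// Hypothetical file system whose create method has the reported defect.
class SketchFileSystem {
    static final String BUFFER_KEY = "sketch.buffer.size"; // assumed key name

    private final SketchConf conf;
    int lastBufferSizeUsed; // recorded only so the effect can be observed

    SketchFileSystem(SketchConf conf) {
        this.conf = conf;
    }

    // The bufferSize argument is accepted but never consulted:
    // the size always comes from the configuration.
    OutputStream create(String name, boolean overwrite, int bufferSize) {
        int size = conf.getInt(BUFFER_KEY, 4096);
        lastBufferSizeUsed = size;
        // An in-memory sink stands in for the real file.
        return new BufferedOutputStream(new ByteArrayOutputStream(), size);
    }
}
```

In this sketch, a caller that asks for a 1 MB buffer still gets whatever the configuration holds, which is the behaviour a caller with its own buffer size requirement would run into.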

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Updated: (HADOOP-40) bufferSize argument is ignored in FileSystem.create(File, boolean, int)

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-40?page=all ]

Konstantin Shvachko updated HADOOP-40:
--------------------------------------

    Attachment: BufferSize.patch




[jira] Updated: (HADOOP-40) bufferSize argument is ignored in FileSystem.create(File, boolean, int)

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-40?page=all ]

Doug Cutting updated HADOOP-40:
-------------------------------

    Attachment:     (was: bufferSize.patch)




[jira] Commented: (HADOOP-40) bufferSize argument is ignored in FileSystem.create(File, boolean, int)

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-40?page=comments#action_12367738 ] 

Konstantin Shvachko commented on HADOOP-40:
-------------------------------------------

Yes. This is the right way of doing it.
It makes particular sense because the main file and the checksum file
do not necessarily need to share the same buffer size.
When the file buffer is large, the checksum buffer doesn't need to be large at all.




[jira] Updated: (HADOOP-40) bufferSize argument is ignored in FileSystem.create(File, boolean, int)

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-40?page=all ]

Doug Cutting updated HADOOP-40:
-------------------------------

    Attachment: bufferSize.patch

I don't think we should modify the configuration in this case, since that would affect code that uses the same configuration and runs later.  SequenceFile uses very large buffers when sorting and merging, in order to minimize disk seeks, and we don't want everything to start using such large buffers.

So why not just pass the missing parameter down?  I've attached a patch that does this.  Does this look good to you?
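
The difference between the two approaches discussed here can be sketched as below. This is only an illustration under assumed names: SharedSettings, BufferChoice, the key "sketch.buffer.size", and the 4096 default are hypothetical and are not taken from either attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared configuration (not Hadoop's own class).
class SharedSettings {
    private final Map<String, Integer> values = new HashMap<>();

    void setInt(String key, int value) {
        values.put(key, value);
    }

    int getInt(String key, int defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}

class BufferChoice {
    static final String KEY = "sketch.buffer.size"; // assumed key name
    static final int DEFAULT_SIZE = 4096;           // assumed default

    // Approach A: write the requested size into the shared configuration.
    // The change outlives this call and is seen by every later reader.
    static int sizeByMutatingConfig(SharedSettings conf, int bufferSize) {
        conf.setInt(KEY, bufferSize);
        return conf.getInt(KEY, DEFAULT_SIZE);
    }

    // Approach B: pass the requested size straight down.
    // The shared configuration is left untouched.
    static int sizeByPassingArgument(SharedSettings conf, int bufferSize) {
        return bufferSize;
    }

    // A later, unrelated caller that just wants the configured size.
    static int sizeForLaterCaller(SharedSettings conf) {
        return conf.getInt(KEY, DEFAULT_SIZE);
    }
}
```

With approach A, a 1 MB request from a sort or merge leaves later callers reading 1 MB from the configuration as well; with approach B, later callers still see the default.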




[jira] Resolved: (HADOOP-40) bufferSize argument is ignored in FileSystem.create(File, boolean, int)

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-40?page=all ]
     
Doug Cutting resolved HADOOP-40:
--------------------------------

    Fix Version: 0.1
     Resolution: Fixed

I just committed my patch for this.




[jira] Updated: (HADOOP-40) bufferSize argument is ignored in FileSystem.create(File, boolean, int)

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-40?page=all ]

Doug Cutting updated HADOOP-40:
-------------------------------

    Attachment: bufferSize.patch

