Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2008/05/07 07:45:55 UTC

[jira] Created: (HADOOP-3356) SequenceFile.MergeQueue.merge inadvertently creates merge-outputs in the wrong FileSystem, at times in the InMemoryFileSystem

SequenceFile.MergeQueue.merge inadvertently creates merge-outputs in the wrong FileSystem, at times in the InMemoryFileSystem
-----------------------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-3356
                 URL: https://issues.apache.org/jira/browse/HADOOP-3356
             Project: Hadoop Core
          Issue Type: Bug
          Components: io
    Affects Versions: 0.16.3
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
             Fix For: 0.18.0


The offending code is:

{code:title=SequenceFile.java}
Path outputFile = lDirAlloc.getLocalPathForWrite(
    tmpFilename.toString(), approxOutputSize, conf);
LOG.debug("writing intermediate results to " + outputFile);
Writer writer = cloneFileAttributes(
    fs.makeQualified(segmentsToMerge.get(0).segmentPathName),
    fs.makeQualified(outputFile), null);
{code}

*fs* is the InMemoryFileSystem when the merge is set up by ReduceTask.ReduceCopier, so outputFile, a local path allocated by lDirAlloc, is qualified against the wrong FileSystem during intermediate merges.
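
To make the mismatch concrete, here is a minimal, self-contained sketch. It does not use the Hadoop classes: SketchFs, its string-based makeQualified, and the scheme names "ramfs" and "file" are hypothetical stand-ins chosen for illustration, and the "local" variant is only one possible shape of a correction, not the actual patch.

```java
final class MergeOutputSketch {

    /** Hypothetical stand-in for a FileSystem; only its scheme matters here. */
    static final class SketchFs {
        final String scheme;

        SketchFs(String scheme) {
            this.scheme = scheme;
        }

        /** Simplified analogue of makeQualified: stamps this scheme onto a path. */
        String makeQualified(String path) {
            return scheme + "://" + path;
        }
    }

    /** Shape of the reported problem: the local path is qualified by the merger's fs. */
    static String qualifyWithMergerFs(SketchFs mergerFs, String localPath) {
        return mergerFs.makeQualified(localPath);
    }

    /** One possible correction (an assumption): always qualify local paths locally. */
    static String qualifyWithLocalFs(SketchFs localFs, String localPath) {
        return localFs.makeQualified(localPath);
    }

    public static void main(String[] args) {
        SketchFs inMemory = new SketchFs("ramfs"); // illustrative scheme name
        SketchFs local = new SketchFs("file");
        String outputFile = "/tmp/intermediate.0"; // path chosen by a local allocator

        System.out.println(qualifyWithMergerFs(inMemory, outputFile));
        System.out.println(qualifyWithLocalFs(local, outputFile));
    }
}
```

In this sketch the first call yields a path under the in-memory scheme even though the path itself was picked on local disk, which is the mismatch described above.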
 


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HADOOP-3356) SequenceFile.MergeQueue.merge inadvertently creates merge-outputs in the wrong FileSystem, at times in the InMemoryFileSystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594858#action_12594858 ] 

devaraj edited comment on HADOOP-3356 at 5/7/08 4:39 AM:
-------------------------------------------------------------

This part of the code should never be hit under normal circumstances for intermediate merges (during shuffle). We should only do single-level merges for the intermediate merges. I chatted with Arun offline and he agreed on this.
Note that this part works as expected when it is supposed to be executed: for multi-level merges, which happen only at the end of the shuffle (when the fs is the localfs).
We should probably fix this for completeness' sake, but it is definitely not a critical/major issue.
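
As a rough illustration of the single-level versus multi-level distinction, the sketch below counts how many intermediate outputs a merge would write. It is a simplified model and an assumption on my part, not the SequenceFile code: it assumes each pass merges exactly `factor` segments into one new segment until at most `factor` remain, and the class and method names are hypothetical.

```java
final class MergeLevelSketch {

    /** A merge is single-level when one pass can take every segment at once. */
    static boolean isSingleLevel(int numSegments, int factor) {
        checkFactor(factor);
        return numSegments <= factor;
    }

    /**
     * Number of intermediate outputs under the simplified model: each pass
     * replaces `factor` segments with one merged segment, and the final pass
     * (which feeds the consumer directly) is not counted.
     */
    static int intermediateOutputs(int numSegments, int factor) {
        checkFactor(factor);
        int remaining = numSegments;
        int outputs = 0;
        while (remaining > factor) {
            remaining = remaining - factor + 1;
            outputs++;
        }
        return outputs;
    }

    private static void checkFactor(int factor) {
        if (factor < 2) {
            throw new IllegalArgumentException("factor must be at least 2");
        }
    }

    public static void main(String[] args) {
        System.out.println(isSingleLevel(10, 10));
        System.out.println(intermediateOutputs(25, 10));
    }
}
```

Under this model, only a merge with more segments than the factor ever writes an intermediate output, which is the case where the code path quoted in the issue is reached.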

      was (Author: devaraj):
    This part of the code must never be hit under normal circumstances for intermediate merges (during shuffle). I chatted with Arun offline and he agreed on this. 
  



[jira] Updated: (HADOOP-3356) SequenceFile.MergeQueue.merge inadvertently creates merge-outputs in the wrong FileSystem, at times in the InMemoryFileSystem

Posted by "Mukund Madhugiri (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukund Madhugiri updated HADOOP-3356:
-------------------------------------

    Fix Version/s:     (was: 0.18.0)




[jira] Updated: (HADOOP-3356) SequenceFile.MergeQueue.merge inadvertently creates merge-outputs in the wrong FileSystem, at times in the InMemoryFileSystem

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3356:
----------------------------------

    Priority: Critical  (was: Major)




[jira] Updated: (HADOOP-3356) SequenceFile.MergeQueue.merge inadvertently creates merge-outputs in the wrong FileSystem, at times in the InMemoryFileSystem

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3356:
--------------------------------

    Priority: Minor  (was: Critical)

This part of the code must never be hit under normal circumstances for intermediate merges (during shuffle). I chatted with Arun offline and he agreed on this. 

