Posted to common-dev@hadoop.apache.org by "Mahadev konar (JIRA)" <ji...@apache.org> on 2007/01/10 20:38:27 UTC

[jira] Created: (HADOOP-878) reducer NONE does not work with multiple maps

reducer NONE does not work with multiple maps
---------------------------------------------

                 Key: HADOOP-878
                 URL: https://issues.apache.org/jira/browse/HADOOP-878
             Project: Hadoop
          Issue Type: Bug
          Components: contrib/streaming
            Reporter: Mahadev konar
         Assigned To: Sanjay Dahiya
            Priority: Minor


If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output with -reducer NONE.



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] Updated: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-878:
---------------------------------

    Attachment: HADOOP-878_20070220_1.patch

PipeMapRed:getSideEffectFileName puts ':' in the 'fileName', as <fileName:offset:length> (via FileInput.toString()); these multiple colons cause a URI exception while constructing the path.

I've fixed that to put '-' instead of ':' and it seems to work... I'd appreciate any review/feedback since I'm not too conversant with streaming...
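
As a minimal sketch of the idea behind this change (not the patch itself), the Java snippet below replaces every ':' in a split-style name with '-' and checks that the result can be used as a scheme-less relative URI path. The class name SideEffectNameSketch, the helpers toSideEffectName and isUsableAsRelativePath, and the sample name input.txt:269+270 (borrowed from the exception reported in this thread) are illustrative assumptions; the real logic lives in PipeMapRed and may differ.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SideEffectNameSketch {

    // Hypothetical helper, not the actual PipeMapRed code:
    // swap every ':' for '-' so no part of the name can be read as a URI scheme.
    static String toSideEffectName(String splitDescription) {
        return splitDescription.replace(':', '-');
    }

    // Returns true only if java.net.URI accepts the name as a plain relative
    // path: no scheme is detected and the path round-trips unchanged.
    static boolean isUsableAsRelativePath(String name) {
        try {
            URI uri = new URI(null, null, name, null, null);
            return uri.getScheme() == null && name.equals(uri.getPath());
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String raw = "input.txt:269+270";      // sample name with a colon
        String safe = toSideEffectName(raw);   // same name with '-' instead
        System.out.println(raw + " usable: " + isUsableAsRelativePath(raw));
        System.out.println(safe + " usable: " + isUsableAsRelativePath(safe));
    }
}
```

The check uses only java.net.URI, so it approximates rather than reproduces how Hadoop builds a Path from the name.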

Mahadev, could you check this patch and let me know if it works? Thanks!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-878:
---------------------------------

    Assignee: Mahadev konar  (was: Sanjay Dahiya)

Mahadev, can you please verify if this problem is resolved? Thanks!



[jira] Commented: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472935 ] 

Mahadev konar commented on HADOOP-878:
--------------------------------------

The -reducer NONE problem is still not fixed in the current trunk. Jobs fail consistently with -reducer NONE.



[jira] Assigned: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy reassigned HADOOP-878:
------------------------------------

    Assignee: Arun C Murthy  (was: Mahadev konar)



[jira] Updated: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-878:
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.12.0
           Status: Resolved  (was: Patch Available)

I just committed this.  Thanks, Arun!



[jira] Commented: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475443 ] 

Arun C Murthy commented on HADOOP-878:
--------------------------------------

It looks like it was TestCheckPoint that failed, which this patch obviously doesn't affect... could we ignore the -1?



[jira] Commented: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475287 ] 

Hadoop QA commented on HADOOP-878:
----------------------------------

-1, because 2 attempts failed to build and test the latest attachment (http://issues.apache.org/jira/secure/attachment/12351850/HADOOP-878_20070222_2.patch) against trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/510644. Please note that this message is automatically generated and may represent a problem with the automation system and not the patch. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch



[jira] Updated: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-878:
---------------------------------

    Attachment: HADOOP-878_20070222_2.patch

Better tested post HADOOP-1029.



[jira] Commented: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472946 ] 

Mahadev konar commented on HADOOP-878:
--------------------------------------

java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: input_file:269+270
	at org.apache.hadoop.fs.Path.initialize(Path.java:128)
	at org.apache.hadoop.fs.Path.<init>(Path.java:115)
	at org.apache.hadoop.fs.Path.<init>(Path.java:44)
	at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:268)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:50)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:70)
	at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:50)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:70)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:178)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1396)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: input.txt:269+270
	at java.net.URI.checkPath(URI.java:1787)
	at java.net.URI.<init>(URI.java:735)
	at org.apache.hadoop.fs.Path.initialize(Path.java:125)
	... 10 more



This is the exception I get when running with -reducer NONE.
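
The message in this trace can be reproduced with java.net.URI alone, which may help explain why the colon matters. The sketch below assumes, as a simplification of whatever Path actually does, that the text before the first ':' is treated as a URI scheme and the rest as the path; the class RelativePathInAbsoluteUriDemo and its methods are hypothetical names used only for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RelativePathInAbsoluteUriDemo {

    // Builds a URI from a scheme and a path using the five-argument constructor.
    // Returns the exception message on failure, or null if the URI is accepted.
    static String describeFailure(String scheme, String path) {
        try {
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    // Assumption: text before the first ':' is read as the scheme and the
    // remainder as the path. With no ':' the whole name is a scheme-less path.
    static String describeFailureForName(String name) {
        int colon = name.indexOf(':');
        String scheme = colon < 0 ? null : name.substring(0, colon);
        String path = name.substring(colon + 1);
        return describeFailure(scheme, path);
    }

    public static void main(String[] args) {
        // A scheme ("input.txt") plus a path that does not start with '/' is rejected.
        System.out.println(describeFailureForName("input.txt:269+270"));
        // prints "Relative path in absolute URI: input.txt:269+270"

        // Without a colon there is no scheme, so the relative path is accepted.
        System.out.println(describeFailureForName("input.txt-269+270"));
    }
}
```

java.net.URI rejects any URI that has a scheme together with a non-empty path that does not begin with '/', which is the condition the exception text describes.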




[jira] Updated: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-878:
---------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (HADOOP-878) reducer NONE does not work with multiple maps

Posted by "Sanjay Dahiya (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471251 ] 

Sanjay Dahiya commented on HADOOP-878:
--------------------------------------

This seems to be working for multiple maps after the recent patches HADOOP-788 and HADOOP-929. Can you please reconfirm whether you are still facing this issue?
