Posted to common-dev@hadoop.apache.org by "Mahadev konar (JIRA)" <ji...@apache.org> on 2007/01/10 20:38:27 UTC
[jira] Created: (HADOOP-878) reducer NONE does not work with
multiple maps
reducer NONE does not work with multiple maps
---------------------------------------------
Key: HADOOP-878
URL: https://issues.apache.org/jira/browse/HADOOP-878
Project: Hadoop
Issue Type: Bug
Components: contrib/streaming
Reporter: Mahadev konar
Assigned To: Sanjay Dahiya
Priority: Minor
If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-878:
---------------------------------
Attachment: HADOOP-878_20070220_1.patch
The culprit is PipeMapRed:getSideEffectFileName, which puts ':' in the 'fileName' as <fileName:offset:length> (via FileInput.toString()); these multiple colons cause a URI exception while constructing the path.
I've changed it to use '-' instead of ':' and it seems to work... I'd appreciate any review/feedback since I'm not too conversant with streaming...
Mahadev, could you check this patch and let me know if it works? Thanks!
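For illustration only, here is a minimal Java sketch of the kind of substitution described above. It is not the content of the attached patch: the class and method names are hypothetical, and the sample name input_file:269+270 is taken from the stack trace posted elsewhere in this thread. It only shows that, once the ':' is replaced, java.net.URI accepts the name as a plain relative path.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; NOT the code from HADOOP-878_20070220_1.patch.
class SideEffectNameSketch {
    // Replace every ':' with '-' so the name can no longer be read as "scheme:rest".
    static String sanitize(String splitName) {
        return splitName.replace(':', '-');
    }

    public static void main(String[] args) throws URISyntaxException {
        String raw = "input_file:269+270";   // sample value from the reported trace
        String safe = sanitize(raw);
        // With no scheme component, java.net.URI accepts a relative path.
        URI uri = new URI(null, null, safe, null, null);
        System.out.println(raw + " -> " + uri.getPath());
    }
}
```

Whether '-' can collide with characters already present in real input file names is not discussed in this thread, so treat the choice of replacement character as an assumption.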
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Mahadev konar
> Priority: Minor
> Attachments: HADOOP-878_20070220_1.patch
>
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley updated HADOOP-878:
---------------------------------
Assignee: Mahadev konar (was: Sanjay Dahiya)
Mahadev, can you please verify if this problem is resolved? Thanks!
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Mahadev konar
> Priority: Minor
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Commented: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472935 ]
Mahadev konar commented on HADOOP-878:
--------------------------------------
The -reducer NONE problem is still not fixed in the current trunk: jobs fail consistently with -reducer NONE.
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Mahadev konar
> Priority: Minor
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Assigned: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy reassigned HADOOP-878:
------------------------------------
Assignee: Arun C Murthy (was: Mahadev konar)
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Arun C Murthy
> Priority: Minor
> Attachments: HADOOP-878_20070220_1.patch
>
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Updated: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated HADOOP-878:
--------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
Status: Resolved (was: Patch Available)
I just committed this. Thanks, Arun!
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Arun C Murthy
> Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HADOOP-878_20070220_1.patch, HADOOP-878_20070222_2.patch
>
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Commented: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475443 ]
Arun C Murthy commented on HADOOP-878:
--------------------------------------
Looks like it was TestCheckPoint which failed, which obviously this patch doesn't affect... could we ignore the -1?
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Arun C Murthy
> Priority: Minor
> Attachments: HADOOP-878_20070220_1.patch, HADOOP-878_20070222_2.patch
>
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Commented: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475287 ]
Hadoop QA commented on HADOOP-878:
----------------------------------
-1, because 2 attempts failed to build and test the latest attachment (http://issues.apache.org/jira/secure/attachment/12351850/HADOOP-878_20070222_2.patch) against trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/510644. Please note that this message is automatically generated and may represent a problem with the automation system and not the patch. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Arun C Murthy
> Priority: Minor
> Attachments: HADOOP-878_20070220_1.patch, HADOOP-878_20070222_2.patch
>
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Updated: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-878:
---------------------------------
Attachment: HADOOP-878_20070222_2.patch
Better tested post HADOOP-1029.
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Arun C Murthy
> Priority: Minor
> Attachments: HADOOP-878_20070220_1.patch, HADOOP-878_20070222_2.patch
>
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Commented: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472946 ]
Mahadev konar commented on HADOOP-878:
--------------------------------------
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: input_file:269+270
at org.apache.hadoop.fs.Path.initialize(Path.java:128)
at org.apache.hadoop.fs.Path.<init>(Path.java:115)
at org.apache.hadoop.fs.Path.<init>(Path.java:44)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:268)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:50)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:70)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:50)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:70)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:178)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1396)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: input.txt:269+270
at java.net.URI.checkPath(URI.java:1787)
at java.net.URI.<init>(URI.java:735)
at org.apache.hadoop.fs.Path.initialize(Path.java:125)
... 10 more
This is the exception I get when running with -reducer NONE.
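The message in the trace can be reproduced with the JDK alone. The sketch below assumes, as the trace suggests but this thread does not show, that Path splits the string at the first ':' and passes the left part to java.net.URI as the scheme and the rest as the path. With a scheme present, java.net.URI requires a non-empty path to begin with '/', so "269+270" is rejected. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; uses only java.net.URI, not Hadoop's Path.
class RelativePathInAbsoluteUriDemo {
    // Returns the exception message, or null if the URI was constructed.
    static String tryBuild(String scheme, String path) {
        try {
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Scheme present, path not starting with '/': rejected.
        System.out.println(tryBuild("input_file", "269+270"));
        // No scheme: the same kind of name is accepted as a relative path.
        System.out.println(tryBuild(null, "input_file-269+270"));
    }
}
```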
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Mahadev konar
> Priority: Minor
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Updated: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-878:
---------------------------------
Status: Patch Available (was: Open)
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Arun C Murthy
> Priority: Minor
> Attachments: HADOOP-878_20070220_1.patch, HADOOP-878_20070222_2.patch
>
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.
[jira] Commented: (HADOOP-878) reducer NONE does not work with
multiple maps
Posted by "Sanjay Dahiya (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471251 ]
Sanjay Dahiya commented on HADOOP-878:
--------------------------------------
This seems to be working for multiple maps after the recent patches HADOOP-788 and HADOOP-929. Can you please reconfirm whether you are still facing this issue?
> reducer NONE does not work with multiple maps
> ---------------------------------------------
>
> Key: HADOOP-878
> URL: https://issues.apache.org/jira/browse/HADOOP-878
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Mahadev konar
> Assigned To: Sanjay Dahiya
> Priority: Minor
>
> If you run more than one map with -reducer NONE, output data seems to get lost. The number of lines output with the identity reducer is greater than the number of lines output using -reducer NONE.