Posted to common-dev@hadoop.apache.org by "Paul Saab (JIRA)" <ji...@apache.org> on 2007/12/13 20:50:43 UTC

[jira] Created: (HADOOP-2419) HADOOP-1965 breaks nutch

HADOOP-1965 breaks nutch
------------------------

                 Key: HADOOP-2419
                 URL: https://issues.apache.org/jira/browse/HADOOP-2419
             Project: Hadoop
          Issue Type: Bug
            Reporter: Paul Saab


When running nutch on trunk, nutch is unable to complete a fetch and the following exceptions are raised:

java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at org.apache.nutch.protocol.Content.readFields(Content.java:158)
        at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:38)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spill(MapTask.java:536)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:474)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$100(MapTask.java:248)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$1.run(MapTask.java:413)

Exception in thread "SortSpillThread" java.lang.NegativeArraySizeException
     at org.apache.hadoop.io.Text.readString(Text.java:388)
     at org.apache.nutch.metadata.Metadata.readFields(Metadata.java:243)
     at org.apache.nutch.protocol.Content.readFields(Content.java:151)
     at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:38)
     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spill(MapTask.java:536)
     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:474)
     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$100(MapTask.java:248)
     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$1.run(MapTask.java:413)

After reverting HADOOP-1965 nutch works just fine.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Dennis Kubes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Kubes updated HADOOP-2419:
---------------------------------

    Attachment: jobtasks.jsp.html

Here is an HTML page (jobtasks.jsp.html) from a fetching job that failed. Interesting results.



[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554326 ] 

Amar Kamat commented on HADOOP-2419:
------------------------------------

Submitted a patch with added test for the *thread-safe* property.



[jira] Assigned: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reassigned HADOOP-2419:
-----------------------------------

    Assignee: Amar Kamat



[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552564 ] 

Nigel Daley commented on HADOOP-2419:
-------------------------------------

Amar, does HADOOP-2419.patch cover the same case as MapRunnableTest.java attached to this Jira?  If not, it would be good to incorporate the case covered by MapRunnableTest.java.



[jira] Updated: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Paul Saab (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Saab updated HADOOP-2419:
------------------------------

          Component/s: mapred
    Affects Version/s: 0.16.0



[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552325 ] 

Amar Kamat commented on HADOOP-2419:
------------------------------------

Please check the new patch [https://issues.apache.org/jira/secure/attachment/12371775/HADOOP-2419.patch]. Earlier, calls to {{MapTask.collect()}} were not thread-safe; with this patch the call to {{collect()}} is thread-safe. Let us know if this patch works for you.
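
For illustration only, and not the code in HADOOP-2419.patch: one common way to make a collect call safe for concurrent callers is to serialize access with a lock. The sketch below assumes a hypothetical Collector interface and a BufferCollector class standing in for the map output buffer; neither is Hadoop's actual API.

```java
import java.util.ArrayList;
import java.util.List;

public class SynchronizedCollectSketch {
    /** Hypothetical stand-in for a map output collector; not Hadoop's API. */
    interface Collector<K, V> {
        void collect(K key, V value);
    }

    /** Unsynchronized buffer: the two lists must always stay in step. */
    static class BufferCollector implements Collector<String, String> {
        final List<String> keys = new ArrayList<>();
        final List<String> values = new ArrayList<>();

        @Override
        public void collect(String key, String value) {
            keys.add(key);
            values.add(value);
        }
    }

    /** Wrapper that lets only one thread at a time into the delegate. */
    static class SynchronizedCollector<K, V> implements Collector<K, V> {
        private final Collector<K, V> delegate;

        SynchronizedCollector(Collector<K, V> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized void collect(K key, V value) {
            delegate.collect(key, value);
        }
    }

    /** Starts several writer threads and returns the number of buffered records. */
    public static int run(int threads, int perThread) throws InterruptedException {
        BufferCollector buffer = new BufferCollector();
        Collector<String, String> safe = new SynchronizedCollector<>(buffer);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    safe.collect("k" + id + "-" + i, "v" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // join() makes the workers' writes visible to this thread
        }
        // -1 signals that keys and values drifted apart (should not happen here)
        return buffer.keys.size() == buffer.values.size() ? buffer.keys.size() : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(8, 1000));
    }
}
```

With the wrapper in place every record emitted by the writer threads reaches the buffer. Calling the unsynchronized buffer directly from several threads may lose records or throw, which is the general kind of failure a non-thread-safe collect invites.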



[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552621 ] 

Amar Kamat commented on HADOOP-2419:
------------------------------------

The guess is that {{MapRunnableTest.java}} assumes that {{MapTask.collect()}} is thread-safe, which the earlier patch did not provide. The change therefore makes the call thread-safe.



[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552064 ] 

Hudson commented on HADOOP-2419:
--------------------------------

Integrated in Hadoop-Nightly #333 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/333/])



[jira] Issue Comment Edited: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554326 ] 

amar_kamat edited comment on HADOOP-2419 at 12/24/07 10:49 PM:
---------------------------------------------------------------

Submitted a patch with added test for the *thread-safe* property. See HADOOP-1965

      was (Author: amar_kamat):
    Submitted a patch with added test for the *thread-safe* property.
  


[jira] Updated: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Paul Saab (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Saab updated HADOOP-2419:
------------------------------

    Attachment: MapRunnableTest.java

This is a sample MapReduce job that mimics what nutch's Fetcher2 does with threads and queuing of URLs to be fetched. It takes two arguments on the command line: an input path and an output path. All it does is read its input via TextInputFormat, queue each record for worker threads, and have those threads write the records back out as they came in. The sample run I just did came back with the following exceptions. Reverting HADOOP-1965 allows the job to finish without error.

java.lang.ArrayIndexOutOfBoundsException
	at java.lang.System.arraycopy(Native Method)
	at org.apache.hadoop.mapred.MergeSorter.sort(MergeSorter.java:45)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:446)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:690)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2016)

java.lang.ArrayIndexOutOfBoundsException: 6
	at org.apache.hadoop.mapred.BasicTypeSorterBase.compare(BasicTypeSorterBase.java:133)
	at org.apache.hadoop.mapred.MergeSorter.compare(MergeSorter.java:59)
	at org.apache.hadoop.mapred.MergeSorter.compare(MergeSorter.java:35)
	at org.apache.hadoop.util.MergeSort.mergeSort(MergeSort.java:46)
	at org.apache.hadoop.util.MergeSort.mergeSort(MergeSort.java:56)
	at org.apache.hadoop.mapred.MergeSorter.sort(MergeSorter.java:46)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:446)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:690)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2016)
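
The shape of the job described above (one reader queuing records, several worker threads writing them back out) can be sketched with the JDK alone. This is not the attached MapRunnableTest.java; the class name QueuedEchoSketch, the echo method, and the in-memory output list are assumptions made for illustration, with the list standing in for the output collector that the workers share.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueuedEchoSketch {
    // Sentinel telling a worker to stop; compared by identity, never by value.
    private static final String POISON = new String("<end-of-input>");

    /** Queues every input record, lets workers write them out, returns them sorted. */
    public static List<String> echo(List<String> input, int workerCount)
            throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Shared sink written by all workers, so it must be thread-safe.
        List<String> output = Collections.synchronizedList(new ArrayList<>());

        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        String record = queue.take();
                        if (record == POISON) {
                            return;
                        }
                        output.add(record); // concurrent "collect" on the shared sink
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }

        for (String line : input) {
            queue.put(line); // the reader side: one record per input line
        }
        for (int i = 0; i < workerCount; i++) {
            queue.put(POISON); // one stop signal per worker
        }
        for (Thread w : workers) {
            w.join();
        }

        // Arrival order depends on scheduling, so sort for a stable result.
        List<String> sorted = new ArrayList<>(output);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(echo(Arrays.asList("c", "a", "b"), 4));
    }
}
```

Because several workers write to the same sink at once, the sink has to tolerate concurrent calls; here that is handled by Collections.synchronizedList.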




[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552178 ] 

Amar Kamat commented on HADOOP-2419:
------------------------------------

Please refer to HADOOP-1965 for further details and discussion.



[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558422#action_12558422 ] 

Amar Kamat commented on HADOOP-2419:
------------------------------------

HADOOP-1965 has been committed. Please test against trunk and let us know.

> HADOOP-1965 breaks nutch
> ------------------------
>
>                 Key: HADOOP-2419
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2419
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Paul Saab
>            Assignee: Amar Kamat
>         Attachments: jobtasks.jsp.html, MapRunnableTest.java
>
>
> When running nutch on trunk, nutch is unable to complete a fetch and the following exceptions are raised:
> java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at org.apache.nutch.protocol.Content.readFields(Content.java:158)
>         at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:38)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spill(MapTask.java:536)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:474)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$100(MapTask.java:248)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$1.run(MapTask.java:413)
> Exception in thread "SortSpillThread" java.lang.NegativeArraySizeException
>      at org.apache.hadoop.io.Text.readString(Text.java:388)
>      at org.apache.nutch.metadata.Metadata.readFields(Metadata.java:243)
>      at org.apache.nutch.protocol.Content.readFields(Content.java:151)
>      at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:38)
>      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spill(MapTask.java:536)
>      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:474)
>      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$100(MapTask.java:248)
>      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$1.run(MapTask.java:413)
> After reverting HADOOP-1965 nutch works just fine.



[jira] Resolved: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat resolved HADOOP-2419.
--------------------------------

    Resolution: Fixed

Resolving this issue on the assumption that concurrent access to {{MapTask.collect()}} was the cause; that is now fixed in HADOOP-1965.
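
For readers wondering how concurrent access could lead to the two exceptions in the trace, here is a minimal sketch under stated assumptions. It is not Hadoop or Nutch code: {{TornRecordDemo}}, {{readField}} and the byte layouts are hypothetical, and the thread does not confirm that this exact interleaving occurred. It only shows that a length-prefixed record whose prefix and payload are separated by another writer's bytes can fail on read with either NegativeArraySizeException or EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical demo; none of these names come from Hadoop or Nutch. */
public class TornRecordDemo {
    static final byte[] KEY = {'b', 'b'};
    static final byte[] VALUE = new byte[8];
    static { Arrays.fill(VALUE, (byte) 0xFF); }

    /** Reads one length-prefixed field, the way a simple readFields() might. */
    static byte[] readField(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] field = new byte[length]; // fails if the "length" is really payload bytes < 0
        in.readFully(field);             // fails if the prefix claims more than remains
        return field;
    }

    /** Returns "ok", or the simple name of the exception raised while reading two fields. */
    static String readTwoFields(byte[] data) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            readField(in);
            readField(in);
            return "ok";
        } catch (IOException | RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    /** Both fields written without interruption: prefix, payload, prefix, payload. */
    static byte[] intact() {
        return ByteBuffer.allocate(18)
                .putInt(KEY.length).put(KEY)
                .putInt(VALUE.length).put(VALUE).array();
    }

    /** Writer A's prefix, then writer B's whole field, then A's payload. */
    static byte[] interleaved() {
        return ByteBuffer.allocate(18)
                .putInt(VALUE.length)
                .putInt(KEY.length).put(KEY)
                .put(VALUE).array();
    }

    /** Writer A's prefix and writer B's field, but A's payload never landed. */
    static byte[] missingPayload() {
        return ByteBuffer.allocate(10)
                .putInt(VALUE.length)
                .putInt(KEY.length).put(KEY).array();
    }

    public static void main(String[] args) {
        System.out.println("intact:         " + readTwoFields(intact()));
        System.out.println("interleaved:    " + readTwoFields(interleaved()));
        System.out.println("missingPayload: " + readTwoFields(missingPayload()));
    }
}
```

In the interleaved layout the reader consumes the other writer's bytes as payload and then interprets four 0xFF payload bytes as a length of -1; in the missing-payload layout the prefix promises 8 bytes but only 6 remain.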




[jira] Issue Comment Edited: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12552325 ] 

amar_kamat edited comment on HADOOP-2419 at 12/17/07 9:59 AM:
--------------------------------------------------------------

Please check the new patch [https://issues.apache.org/jira/secure/attachment/12371797/HADOOP-2419.patch]. Earlier, calls to {{MapTask.collect()}} were not thread-safe; the patch makes {{collect()}} thread-safe. Let us know whether it works for you.

      was (Author: amar_kamat):
    Please check the new patch [https://issues.apache.org/jira/secure/attachment/12371775/HADOOP-2419.patch]. Earlier, calls to {{MapTask.collect()}} were not thread-safe; the patch makes {{collect()}} thread-safe. Let us know whether it works for you.
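
As a rough illustration of what making a collect call thread-safe can look like, here is a minimal sketch. It is not the contents of HADOOP-2419.patch, which is not reproduced in this thread: {{SafeCollector}}, its fields and the record layout are hypothetical, and plain {{synchronized}} methods are just one simple way to serialize writers.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a map output buffer; not Hadoop's MapTask. */
public class SafeCollector {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int records = 0;

    /**
     * Appends one record as length-prefixed key and value.
     * The method-level lock keeps each record contiguous in the buffer
     * even when several threads call collect() at once.
     */
    public synchronized void collect(byte[] key, byte[] value) {
        writeField(key);
        writeField(value);
        records++;
    }

    /** Writes a 4-byte big-endian length followed by the bytes themselves. */
    private void writeField(byte[] field) {
        byte[] length = ByteBuffer.allocate(4).putInt(field.length).array();
        buffer.write(length, 0, length.length);
        buffer.write(field, 0, field.length);
    }

    public synchronized int recordCount() {
        return records;
    }

    public synchronized int byteCount() {
        return buffer.size();
    }

    public static void main(String[] args) throws InterruptedException {
        SafeCollector collector = new SafeCollector();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    collector.collect(new byte[3], new byte[5]);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        // Every record is 4 + 3 + 4 + 5 = 16 bytes.
        System.out.println(collector.recordCount() + " records, "
                + collector.byteCount() + " bytes");
    }
}
```

With the lock in place, four threads writing 1000 records each leave 4000 whole records (64000 bytes) in the buffer. Without {{synchronized}}, the count and the byte layout would depend on how the threads happened to interleave.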
  



[jira] Commented: (HADOOP-2419) HADOOP-1965 breaks nutch

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551916 ] 

Devaraj Das commented on HADOOP-2419:
-------------------------------------

I just reverted HADOOP-1965.

