Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/11/26 13:09:36 UTC

[jira] [Created] (NUTCH-1675) NutchField to support long

Markus Jelsma created NUTCH-1675:
------------------------------------

             Summary: NutchField to support long
                 Key: NUTCH-1675
                 URL: https://issues.apache.org/jira/browse/NUTCH-1675
             Project: Nutch
          Issue Type: Bug
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma


NutchField has no support for Long in readFields. Usually this is not a problem, because in reducers a NutchField is only written to the output and never read back. But when NutchField is used in mappers, the reducer has to deserialize it and cannot read a Long; it fails with the exception below.

{code}
java.lang.RuntimeException: problem advancing post rec#0
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1217)
        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:250)
        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:246)
        at org.apache.nutch.fetcher.Fetcher$FetcherReducer.reduce(Fetcher.java:1440)
        at org.apache.nutch.fetcher.Fetcher$FetcherReducer.reduce(Fetcher.java:1401)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:522)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:197)
        at org.apache.hadoop.io.Text.readString(Text.java:402)
        at org.apache.nutch.indexer.NutchField.readFields(NutchField.java:89)
        at org.apache.nutch.indexer.NutchDocument.readFields(NutchDocument.java:112)
        at org.apache.nutch.indexer.NutchIndexAction.readFields(NutchIndexAction.java:81)
        at org.apache.nutch.util.GenericWritableConfigurable.readFields(GenericWritableConfigurable.java:54)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
        at org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1276)
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1214)
        ... 7 more
{code}
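
The asymmetry described above can be illustrated with a minimal sketch. It assumes a value is serialized as a type tag followed by its payload, so the read side needs a branch for every type the write side can emit. TaggedValueCodec and its tag constants are hypothetical names, and the sketch uses plain java.io instead of Hadoop's Writable API; it is not the actual NutchField code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a field value serializer (not the NutchField code).
final class TaggedValueCodec {
    // Assumed type tags; the real on-disk format may differ.
    private static final byte TYPE_STRING = 0;
    private static final byte TYPE_LONG = 1;

    // Write a one-byte type tag followed by the value itself.
    static void write(DataOutput out, Object value) throws IOException {
        if (value instanceof Long) {
            out.writeByte(TYPE_LONG);
            out.writeLong((Long) value);
        } else if (value instanceof String) {
            out.writeByte(TYPE_STRING);
            out.writeUTF((String) value);
        } else {
            throw new IOException("unsupported value: " + value);
        }
    }

    // Read the tag and dispatch on it. Without the TYPE_LONG branch,
    // a Long written by write() could not be read back.
    static Object read(DataInput in) throws IOException {
        byte type = in.readByte();
        switch (type) {
            case TYPE_LONG:
                return in.readLong();
            case TYPE_STRING:
                return in.readUTF();
            default:
                throw new IOException("unknown type tag: " + type);
        }
    }

    // Serialize to a byte array and read it back, as happens between map and reduce.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            write(new DataOutputStream(buffer), value);
            return read(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling TaggedValueCodec.roundTrip(Long.valueOf(42L)) exercises the same write-then-read path that a value takes when it is emitted by a mapper and consumed by a reducer.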



--
This message was sent by Atlassian JIRA
(v6.1#6144)