Posted to user@nutch.apache.org by Stanislav Orlenko <or...@gmail.com> on 2012/12/19 12:35:50 UTC

IllegalArgumentException

Hello
Has anyone faced this problem?

java.lang.IllegalArgumentException: offset (0) + length (4) exceed the capacity of the array: 2
        at org.apache.nutch.util.Bytes.explainWrongLengthOrOffset(Bytes.java:559)
        at org.apache.nutch.util.Bytes.toInt(Bytes.java:740)
        at org.apache.nutch.util.Bytes.toFloat(Bytes.java:611)
        at org.apache.nutch.util.Bytes.toFloat(Bytes.java:598)
        at org.apache.nutch.scoring.opic.OPICScoringFilter.distributeScoreToOutlinks(OPICScoringFilter.java:128)
        at org.apache.nutch.scoring.ScoringFilters.distributeScoreToOutlinks(ScoringFilters.java:117)
        at org.apache.nutch.crawl.DbUpdateMapper.map(DbUpdateMapper.java:70)
        at org.apache.nutch.crawl.DbUpdateMapper.map(DbUpdateMapper.java:37)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

Nutch version is 2.1.

Thanks

Re: IllegalArgumentException

Posted by Stanislav Orlenko <or...@gmail.com>.
Yes, I can reproduce it with the command bin/nutch updatedb. Sorry, I am
new to Nutch. Could you please advise how I can get information about
outlinks.size()?

Thanks



On Wed, Dec 19, 2012 at 2:58 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Hi,
>
> In short, no.
>
> I see that just before we distribute the score to outlinks in line 70 of
> DbUpdateMapper.java [0] there is a TODO which reads
>
> // TODO: Outlink filtering (i.e. "only keep the first n outlinks")
>
> I wonder if this could be why the if condition is satisfied in the toInt()
> method (line 739) of Bytes.java [1]
>
> Can you reproduce this and explain a bit more about the outlinks.size() for
> the URL?
>
> Thanks
>
> Lewis
>
> [0]
>
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdateMapper.java?view=markup
> [1]
>
> http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/util/Bytes.java?view=markup
>

Re: IllegalArgumentException

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi,

In short, no.

I see that just before we distribute the score to outlinks at line 70 of
DbUpdateMapper.java [0], there is a TODO which reads

// TODO: Outlink filtering (i.e. "only keep the first n outlinks")

I wonder if this could be why the if condition is satisfied in the toInt()
method (line 739) of Bytes.java [1].
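
For illustration, the kind of bounds check that produces this message can be
sketched as follows. This is a minimal standalone sketch, not the Nutch
source: the class ScoreBytesSketch and its methods are hypothetical, and it
assumes the score is stored as a 4-byte big-endian float. The exception text
suggests that a 2-byte value was passed where 4 bytes were expected.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a byte-decoding utility; not the Nutch Bytes class.
public class ScoreBytesSketch {

    static final int SIZEOF_INT = 4;

    // Decodes a big-endian int, rejecting arrays that are too short.
    static int toInt(byte[] bytes, int offset, int length) {
        if (offset + length > bytes.length) {
            throw new IllegalArgumentException(
                "offset (" + offset + ") + length (" + length
                + ") exceed the capacity of the array: " + bytes.length);
        }
        int n = 0;
        for (int i = offset; i < offset + length; i++) {
            n = (n << 8) | (bytes[i] & 0xFF);
        }
        return n;
    }

    // A float is decoded by reading 4 bytes as an int and reinterpreting the bits.
    static float toFloat(byte[] bytes) {
        return Float.intBitsToFloat(toInt(bytes, 0, SIZEOF_INT));
    }

    public static void main(String[] args) {
        // A well-formed 4-byte score decodes normally.
        byte[] good = ByteBuffer.allocate(4).putFloat(1.0f).array();
        System.out.println(toFloat(good));

        // A 2-byte value (assumed here to mimic the failing record) is rejected.
        byte[] tooShort = new byte[2];
        try {
            toFloat(tooShort);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, the thing to look for is an outlink-related value
in the stored page that is only 2 bytes long.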

Can you reproduce this and explain a bit more about the outlinks.size() for
the URL?

Thanks

Lewis

[0]
http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/DbUpdateMapper.java?view=markup
[1]
http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/util/Bytes.java?view=markup




-- 
*Lewis*