Posted to dev@nutch.apache.org by Gal Nitzan <gn...@usa.net> on 2005/12/30 00:05:34 UTC
Bug in DeleteDuplicates.java ?
this function throws IOException. Why?
    public long getPos() throws IOException {
        return (doc*INDEX_LENGTH)/maxDoc;
    }
It should be throwing ArithmeticException
What happens when maxDoc is zero?
Gal
Re: Bug in DeleteDuplicates.java ?
Posted by Doug Cutting <cu...@nutch.org>.
Andrzej Bialecki wrote:
> Gal Nitzan wrote:
>
>> this function throws IOException. Why?
>>
>> public long getPos() throws IOException {
>> return (doc*INDEX_LENGTH)/maxDoc;
>> }
>>
>> It should be throwing ArithmeticException
>>
>>
>
> The IOException is required by the API of RecordReader.
>
>> What happens when maxDoc is zero?
>>
>>
>
> Ka-boom! ;-) You're right, this should be wrapped in an IOException and
> rethrown.
No, it should really just be fixed to not cause an ArithmeticException.
This is called to report progress. In this case the input "file" for
the map is a Lucene index whose documents we iterate through. To
simplify the construction of input splits (without opening each index) a
constant "length" is used for each "file". So we have to scale the
document numbers to give progress in this range.
The problem is that progress may be reported even when there are no
documents in the index. So the call is valid and no exception should be
thrown.
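
A minimal sketch of what such a fix might look like, assuming the scaling described above. The class name ProgressSketch, the static method shape, and the value chosen for INDEX_LENGTH are illustrative assumptions, not the actual DeleteDuplicates code; returning 0 for an empty index is also just one possible choice.

```java
final class ProgressSketch {
    // Assumed constant "length" given to every index "file" (value is hypothetical).
    static final long INDEX_LENGTH = Integer.MAX_VALUE;

    // Scale the current document number into the range [0, INDEX_LENGTH].
    // doc and maxDoc are passed in here only to keep the sketch self-contained.
    static long getPos(long doc, long maxDoc) {
        if (maxDoc <= 0) {
            // Empty index: there is nothing to iterate, so report position 0
            // instead of dividing by zero.
            return 0L;
        }
        return (doc * INDEX_LENGTH) / maxDoc;
    }

    public static void main(String[] args) {
        System.out.println(getPos(0, 0));   // empty index, no exception
        System.out.println(getPos(5, 10));  // halfway through a 10-document index
    }
}
```

With the guard in place, progress can be reported on an empty index without any exception reaching the caller.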
Doug
Re: Bug in DeleteDuplicates.java ?
Posted by Andrzej Bialecki <ab...@getopt.org>.
Gal Nitzan wrote:
>this function throws IOException. Why?
>
> public long getPos() throws IOException {
> return (doc*INDEX_LENGTH)/maxDoc;
> }
>
>It should be throwing ArithmeticException
>
>
>
The IOException is required by the API of RecordReader.
>What happens when maxDoc is zero?
>
>
Ka-boom! ;-) You're right, this should be wrapped in an IOException and
rethrown.
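
For illustration only, wrapping and rethrowing could be sketched roughly as below. The class name WrapSketch, the static method shape, and the value of INDEX_LENGTH are assumptions made to keep the example self-contained; they are not taken from DeleteDuplicates.java. Note that Doug's reply in this thread argues for avoiding the exception altogether rather than wrapping it.

```java
import java.io.IOException;

final class WrapSketch {
    // Assumed constant "length" given to every index "file" (value is hypothetical).
    static final long INDEX_LENGTH = Integer.MAX_VALUE;

    // Same scaling as in the quoted method, but a division by zero is
    // converted into the IOException that the method signature allows.
    static long getPos(long doc, long maxDoc) throws IOException {
        try {
            return (doc * INDEX_LENGTH) / maxDoc;
        } catch (ArithmeticException e) {
            // Keep the original exception as the cause for debugging.
            throw new IOException("cannot compute position, maxDoc=" + maxDoc, e);
        }
    }
}
```

The caller then sees only the checked exception type that the method already declares, with the ArithmeticException preserved as its cause.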
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com