Posted to dev@nutch.apache.org by Doug Cutting <cu...@nutch.org> on 2006/01/02 19:57:56 UTC

Re: Bug in DeleteDuplicates.java ?

Andrzej Bialecki wrote:
> Gal Nitzan wrote:
> 
>> this function throws IOException. Why?
>>
>>         public long getPos() throws IOException {
>>            return (doc*INDEX_LENGTH)/maxDoc;
>>          }
>>
>> It should be throwing ArithmeticException
>>  
>>
> 
> The IOException is required by the API of RecordReader.
> 
>> What happens when maxDoc is zero?
>>  
>>
> 
> Ka-boom! ;-) You're right, this should be wrapped in an IOException and 
> rethrown.

No, it should really just be fixed so that it does not cause an
ArithmeticException. getPos() is called to report progress. In this
case the input "file" for the map is a Lucene index whose documents we
iterate through. To simplify the construction of input splits (without
opening each index), a constant "length" is used for each "file". So we
have to scale the document numbers to give progress in this range.

The problem is that progress may be reported even when there are no 
documents in the index.  So the call is valid and no exception should be 
thrown.
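
A minimal sketch of the kind of guard described above, not the actual
patch. The class name ProgressSketch, the value chosen for INDEX_LENGTH,
and the choice to report position 0 for an empty index are all
assumptions made for illustration; only the scaling expression comes
from the quoted code.

```java
// Hypothetical stand-alone sketch; not the real DeleteDuplicates code.
final class ProgressSketch {

    // Assumed constant "length" assigned to every index "file".
    static final long INDEX_LENGTH = Integer.MAX_VALUE;

    // Scales the current document number into [0, INDEX_LENGTH].
    // An empty index (maxDoc == 0) is a valid input, so it is handled
    // explicitly instead of letting the division throw.
    static long getPos(int doc, int maxDoc) {
        if (maxDoc == 0) {
            return 0L; // assumption: report the start position when there are no documents
        }
        // INDEX_LENGTH is a long, so the multiplication is done in
        // long arithmetic and cannot overflow for int document numbers.
        return (doc * INDEX_LENGTH) / maxDoc;
    }

    public static void main(String[] args) {
        System.out.println(getPos(0, 0));   // empty index: no exception
        System.out.println(getPos(5, 10));  // roughly half of INDEX_LENGTH
        System.out.println(getPos(10, 10)); // the full INDEX_LENGTH
    }
}
```

Returning INDEX_LENGTH for the empty case would be just as defensible,
since an index with no documents is trivially finished; the point is
only that the call returns a position rather than throwing.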

Doug