Posted to java-user@lucene.apache.org by Ivan Vasilev <iv...@sirma.bg> on 2012/11/01 11:11:53 UTC

Suggestion for extending class DocumentStoredFieldVisitor

Hi guys,

I intend to extend the DocumentStoredFieldVisitor class like this:

class DocumentStoredNonRepeatableFieldVisitor extends DocumentStoredFieldVisitor {

       @Override
       public Status needsField(FieldInfo fieldInfo) throws IOException {
         // Wanted field: load it. Otherwise skip it while wanted fields may
         // still follow, and stop once as many fields are loaded as requested.
         return fieldsToAdd == null || fieldsToAdd.contains(fieldInfo.name)
             ? Status.YES
             : doc.getFields().size() < fieldsToAdd.size() ? Status.NO : Status.STOP;
       }
}

The gain in our application comes from using Status.STOP. We currently 
have 98 fields, and in some cases we need to load just 1, 2 or 3 fields 
per matching doc. With this visitor we could skip a lot of field 
extraction when the matching fields happen to be visited first.

I have not had time to check how doc.add(new StoredField(fieldInfo.name, 
value)) behaves in the Document class: when a field arrives whose name 
already exists in the document, does it change the existing "value" for 
that field by concatenating the two values, or does it just store the 
new value somewhere, with concatenation done later? In the latter case 
I think my suggested class can be applied only when we do not index 
several fields with the same name in the same document. Otherwise the 
check (doc.getFields().size() < fieldsToAdd.size()) could stop loading 
fields for that document too early and miss a repeated field. This is 
why my class is called "DocumentStoredNonRepeatableFieldVisitor".

Since we use Lucene from source rather than as a jar file, I will change 
the visibility of DocumentStoredFieldVisitor's members "doc" and 
"fieldsToAdd". I just want to suggest that the visibility of those 
members also be changed in the original DocumentStoredFieldVisitor 
class, so that the class can be extended in applications that use the 
jar files.
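
To make the early-stop idea and the repeated-field caveat concrete, here is a minimal, self-contained sketch. It does not use Lucene: EarlyStopSketch, its "loaded" list (standing in for doc.getFields()) and the read() loop (standing in for the stored-fields reader) are hypothetical names for illustration only. It assumes the intent is to answer NO while fewer fields are loaded than were requested and STOP afterwards, and it leaves out the fieldsToAdd == null case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-in for a stored-fields visitor. NOT Lucene's class, only the same idea.
class EarlyStopSketch {
  enum Status { YES, NO, STOP }

  final Set<String> fieldsToAdd;                    // names we want loaded
  final List<String> loaded = new ArrayList<>();    // plays the role of doc.getFields()
  int visited = 0;                                  // stored fields examined so far

  EarlyStopSketch(String... wanted) {
    fieldsToAdd = new HashSet<>(Arrays.asList(wanted));
  }

  Status needsField(String name) {
    if (fieldsToAdd.contains(name)) {
      return Status.YES;
    }
    // Assumed intent: skip while wanted fields are still missing,
    // stop once as many fields are loaded as were requested.
    return loaded.size() < fieldsToAdd.size() ? Status.NO : Status.STOP;
  }

  // Simulates a reader walking one document's stored fields in order.
  EarlyStopSketch read(String... storedFieldNames) {
    for (String name : storedFieldNames) {
      visited++;
      Status s = needsField(name);
      if (s == Status.STOP) {
        break;
      }
      if (s == Status.YES) {
        loaded.add(name);
      }
    }
    return this;
  }

  public static void main(String[] args) {
    // Unique names: both wanted fields come first, so the third field triggers STOP.
    EarlyStopSketch a = new EarlyStopSketch("id", "title")
        .read("id", "title", "body", "tags", "date");
    System.out.println(a.loaded + " after visiting " + a.visited + " fields");

    // Repeated name: two "tag" values fill the quota before "title" is reached.
    EarlyStopSketch b = new EarlyStopSketch("tag", "title")
        .read("tag", "tag", "body", "title");
    System.out.println(b.loaded + " after visiting " + b.visited + " fields");
  }
}
```

In the first run the visitor stops at the third of five stored fields. In the second run the two "tag" values fill the quota, so "title" is never loaded, which is the case the class name warns about.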

Cheers,
Ivan

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Suggestion for extending class DocumentStoredFieldVisitor

Posted by Ivan Vasilev <iv...@sirma.bg>.

On 01.11.2012 at 15:09, Michael McCandless wrote:
> I think opening up DocumentStoredFieldVisitor should be fine?  Can you
> open an issue / patch?  Thanks!
>
> Mike McCandless
>
> http://blog.mikemccandless.com

OK, I will do so, maybe tomorrow.

Ivan Vasilev



Re: Suggestion for extending class DocumentStoredFieldVisitor

Posted by Michael McCandless <lu...@mikemccandless.com>.

I think opening up DocumentStoredFieldVisitor should be fine?  Can you
open an issue / patch?  Thanks!
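
Until such a change is available, a subclass built against the released jar might keep its own copy of the wanted names and read the loaded fields through the visitor's public getDocument() accessor instead of the private members. A rough sketch of that pattern follows, using stand-in classes so it runs without Lucene: BaseVisitor, NonRepeatableVisitor and JarFriendlyDemo are hypothetical, and the document is modelled as a plain list of field names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

enum Status { YES, NO, STOP }

// Hypothetical stand-in for a library visitor whose state is private.
class BaseVisitor {
  private final List<String> doc = new ArrayList<>();
  private final Set<String> fieldsToAdd;

  BaseVisitor(Set<String> fieldsToAdd) {
    this.fieldsToAdd = fieldsToAdd;
  }

  Status needsField(String name) {
    return fieldsToAdd == null || fieldsToAdd.contains(name) ? Status.YES : Status.NO;
  }

  // Called by the reader for each accepted field.
  void stringField(String name) {
    doc.add(name);
  }

  // The only view of the loaded fields that subclasses can reach.
  List<String> getDocument() {
    return Collections.unmodifiableList(doc);
  }
}

class NonRepeatableVisitor extends BaseVisitor {
  private final Set<String> wanted;   // own copy, since the parent's set is private

  NonRepeatableVisitor(Set<String> wanted) {
    super(wanted);
    this.wanted = wanted;
  }

  @Override
  Status needsField(String name) {
    Status s = super.needsField(name);
    if (s != Status.NO) {
      return s;   // wanted field, or no filter at all
    }
    // Unwanted field: stop once as many fields are loaded as were requested.
    return getDocument().size() < wanted.size() ? Status.NO : Status.STOP;
  }
}

class JarFriendlyDemo {
  // Simulates the reader loop and returns how many stored fields were examined.
  static int visit(BaseVisitor v, String... stored) {
    int visited = 0;
    for (String name : stored) {
      visited++;
      Status s = v.needsField(name);
      if (s == Status.STOP) {
        break;
      }
      if (s == Status.YES) {
        v.stringField(name);
      }
    }
    return visited;
  }

  public static void main(String[] args) {
    Set<String> wanted = new HashSet<>(Arrays.asList("id"));
    String[] stored = {"id", "title", "body", "tags"};
    System.out.println("plain visited " + visit(new BaseVisitor(wanted), stored));
    System.out.println("early visited " + visit(new NonRepeatableVisitor(wanted), stored));
  }
}
```

The subclass never touches the parent's private state, so the same shape should work without any visibility change. The cost is a second reference to the set of wanted names.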

Mike McCandless

http://blog.mikemccandless.com
