Posted to java-user@lucene.apache.org by "Hasenberger, Josef" <Jo...@zetcom.com> on 2015/01/12 22:41:34 UTC

StoredField available in Collector.setNextReader

Hello,

I have tried to retrieve values stored via the StoredField type inside a Collector, at the point where its setNextReader(AtomicReaderContext) method is called.
I used the following FieldCache method, but it does not return any values:
      FieldCache.DEFAULT.getTerms(indexReader, field, false);

Retrieving the values from the document itself during the call to Collector.collect(int) works fine.
But this is much slower than getting all terms at once with the method above.

My question:
Is there a way to get binary content with performance similar to the approach described above, i.e. retrieving the field terms when the reader is set in a Collector?


Incidentally, the approach works fine for any stored field that is also indexed, as in the following code snippet:

            final FieldType fieldType = new FieldType();
            {
                fieldType.setStored(true);
                fieldType.setIndexed(true); // need to index, otherwise no fast retrieval of terms in collector is possible
                fieldType.setIndexOptions(IndexOptions.DOCS_ONLY);
                fieldType.setTokenized(false);
                fieldType.setOmitNorms(true);
                fieldType.freeze();
            }

            Field field = new Field(fieldName, fieldValue, fieldType); // fieldValue is of type String

But this does not allow me to store binary content (i.e. values in byte[] arrays), as StoredField does.
The constructor expects field content of type String.
I have tried converting the content into base64-encoded strings, but converting those strings back into byte arrays is quite expensive for large indexes.
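
To illustrate the base64 workaround mentioned above, here is a minimal sketch of the round trip. It assumes Java 8's java.util.Base64 (the post does not say which encoder was actually used), and the class and method names are hypothetical. The decode step is the part that has to run once per retrieved value.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64FieldValue {

    // Encode binary content so it can be passed to a String-valued field.
    // (Assumption: java.util.Base64; any Base64 codec would behave similarly.)
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode the retrieved string back into the original bytes.
    // This is the conversion that has to be repeated for every value read.
    static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, 2, (byte) 0xFF, 127, -128};

        String fieldValue = encode(original); // what would go into the String field
        byte[] restored = decode(fieldValue); // what is needed back in the collector

        System.out.println("encoded length: " + fieldValue.length());
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```

Besides the decoding work, the encoded string is about one third longer than the raw bytes, so the term data kept in memory grows as well.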


Thanks for your advice.

Best regards,

Josef