
Posted to dev@lucene.apache.org by "David Smiley (@MITRE.org)" <DS...@mitre.org> on 2010/05/10 16:11:48 UTC

Indexing a Reader instead of a String to a field value

I have a DIH setup in which I obtain a java.io.Reader for a field's value. 
It's a reader because I'm getting it from a source that may store a lot of
text.  I traced the value of a field, stored for quite some time as an
Object, through Solr until it got to Solr's DocumentBuilder line ~272 which
calls toString() on it.  I recall in the past dealing with Lucene I could
use a Reader.  Is there some reason Solr insists on a String?

~ David

-----
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Indexing a Reader instead of a String to a field value

Posted by Chris Hostetter <ho...@fucit.org>.

: I have a DIH setup in which I obtain a java.io.Reader for a field's value. 
: It's a reader because I'm getting it from a source that may store a lot of
: text.  I traced the value of a field, stored for quite some time as an
: Object, through Solr until it got to Solr's DocumentBuilder line ~272 which
: calls toString() on it.  I recall in the past dealing with Lucene I could
: use a Reader.  Is there some reason Solr insists on a String?

I think it's just historical ... the APIs for sending docs to Solr all 
dealt primarily with strings, and using toString() made it easy to also 
support 'simple objects' like Integer, Float, Double, etc. (you'll note 
that there had to be special case code for dealing with Dates and 
BinaryFields).
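
A minimal sketch of the pattern described above, assuming a hypothetical
helper (toIndexableString is an invented name, not Solr's DocumentBuilder
code, and the date format is only illustrative). It shows why toString()
works for simple objects and why it does not work for a Reader, which
does not override toString() and so yields an identity string instead of
its text:

```java
import java.io.Reader;
import java.io.StringReader;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical stand-in for illustration only; not Solr's DocumentBuilder.
class ToStringConversionSketch {

    // Converts a field value to a String the way a toString()-based API would,
    // with one special case (Date) as an example of "special case code".
    static String toIndexableString(Object value) {
        if (value instanceof Date) {
            // Assumed format for the sketch: ISO-8601 style, UTC.
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
            fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
            return fmt.format((Date) value);
        }
        // Integer, Float, Double, etc. all have a useful toString().
        return value.toString();
    }

    public static void main(String[] args) {
        System.out.println(toIndexableString(42));           // 42
        System.out.println(toIndexableString(1.5f));         // 1.5
        System.out.println(toIndexableString(new Date(0L))); // 1970-01-01T00:00:00Z

        // A Reader falls through to Object.toString(): the result is the class
        // name plus a hash code, and the underlying text is never read.
        Reader reader = new StringReader("a lot of text");
        System.out.println(toIndexableString(reader));
    }
}
```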

As a quick win you could also add special case code for "Reader" (but 
there may be a hitch in making sure the Reader gets closed eventually -- I 
can't remember what the semantics are for Fields based on Readers in 
IndexWriter).  But, ideally we should just expand the 
FieldType.createField() API to support arbitrary objects as a mirror of 
the FieldType.toObject() method -- that way we can push this logic into 
the FieldTypes and don't need special handling in DocumentBuilder.
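
A rough sketch of that idea, under stated assumptions: SketchField,
createField(String, Object) and consume() below are hypothetical
stand-ins written for this example, not the Solr FieldType or Lucene
Field APIs. The point is only that an Object-accepting factory can keep
a Reader as a Reader, and that whoever consumes the field has to take
responsibility for closing it:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only; none of these types exist in Solr or Lucene.
class ReaderFieldSketch {

    // Stand-in for an indexable field backed by either a String or a Reader.
    static final class SketchField {
        final String name;
        final String stringValue; // null when the field is backed by a Reader
        final Reader readerValue; // null when the field is backed by a String

        SketchField(String name, String stringValue, Reader readerValue) {
            this.name = name;
            this.stringValue = stringValue;
            this.readerValue = readerValue;
        }
    }

    // Object-accepting factory: a Reader is passed through untouched,
    // everything else falls back to toString().
    static SketchField createField(String name, Object value) {
        if (value instanceof Reader) {
            return new SketchField(name, null, (Reader) value);
        }
        return new SketchField(name, value.toString(), null);
    }

    // Stand-in for the indexer: streams the value in chunks, returns the
    // number of chars seen, and closes the Reader when done.
    static long consume(SketchField field) {
        if (field.readerValue == null) {
            return field.stringValue.length();
        }
        long total = 0;
        char[] buffer = new char[4096];
        try (Reader reader = field.readerValue) { // closed even if read() fails
            int n;
            while ((n = reader.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        SketchField body = createField("body", new StringReader("hello world"));
        System.out.println(consume(body)); // 11

        SketchField count = createField("count", 7);
        System.out.println(count.stringValue); // 7
    }
}
```

In this sketch the full text is never materialized as one String, which
is the reason for wanting a Reader in the first place.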



-Hoss

