Posted to solr-user@lucene.apache.org by Max Lynch <ih...@gmail.com> on 2010/08/26 05:49:18 UTC

Duplicating a Solr Doc

Right now I am doing some processing on my Solr index using Lucene Java.
Basically, I loop through the index in Java and do some extra processing of
each document (processing that is too intensive to do during indexing).

However, when I try to update the document in Solr with new fields (using
SolrJ), the document either loses the fields I don't explicitly set or, if I
have Solr-specific fields such as a Solr "date" field type, I am not able to
copy the value because I can't read it from Java.
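
For context on the lost fields: adding a document whose unique key already exists replaces the whole stored document, so any field not sent again is gone. A minimal sketch of the copy-then-extend approach follows, using plain Java maps as stand-ins for the stored document and the new fields. The class name, method name, and field names are hypothetical, and only fields that were stored can be carried over this way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: plain maps play the role of the stored document
// and of the SolrInputDocument that would be sent back to Solr.
class CopyFieldsSketch {

    // Returns a new map holding every existing stored field plus the extra
    // ones; an extra field overrides a stored field with the same name.
    static Map<String, Object> withExtraFields(Map<String, Object> stored,
                                               Map<String, Object> extra) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(extra);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("id", "doc-1");          // hypothetical field names
        stored.put("title", "old title");

        Map<String, Object> extra = new LinkedHashMap<>();
        extra.put("score_custom", 42);      // result of the offline processing

        // The merged map is what would be re-submitted as the full document.
        System.out.println(withExtraFields(stored, extra));
    }
}
```

The input maps are left untouched, so the same stored document can be reused if the update has to be retried.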

Is there a way to add a field to a Solr document without having to re-create
the document?  If not, how can I read the value of a Solr date in Java?
Document.get("date_field") returns null even though the value shows up when
I access it through Solr.  If I could read this value, I could just copy the
fields from the Lucene Document to a SolrInputDocument.
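
One possible explanation for the null, which this thread does not confirm, is that the date field's stored value is binary rather than a string, so the string accessor finds nothing. The sketch below assumes the stored bytes are eight big-endian bytes holding milliseconds since the epoch; that layout is an assumption and should be checked against the field type in the schema. The bytes would have to come from the Lucene Document's binary value for that field. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class StoredDateSketch {

    // ASSUMPTION: the stored value is 8 big-endian bytes of epoch milliseconds.
    static Date decode(byte[] stored) {
        if (stored == null || stored.length != 8) {
            throw new IllegalArgumentException("expected exactly 8 bytes");
        }
        // ByteBuffer reads big-endian by default.
        return new Date(ByteBuffer.wrap(stored).getLong());
    }

    // Formats a date as an ISO 8601 UTC timestamp with second precision.
    static String toSolrDateString(Date date) {
        SimpleDateFormat fmt =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes read from the index: epoch millisecond 0.
        byte[] stored = ByteBuffer.allocate(8).putLong(0L).array();
        System.out.println(toSolrDateString(decode(stored))); // 1970-01-01T00:00:00Z
    }
}
```

If the decoded value matches what Solr displays for the same document, the resulting Date can be set on the new document directly.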

Thanks.

Re: Duplicating a Solr Doc

Posted by Lance Norskog <go...@gmail.com>.
On further investigation: DocumentBuilder.loadStoredFields() is used
in one utility function which is only called from one unit test. This
should be considered dead code. Don't use it.

SolrPluginUtils.docListToSolrDocument()
SolrPluginUtilsTest.testDocListConversion()



On Wed, Aug 25, 2010 at 9:29 PM, Max Lynch <ih...@gmail.com> wrote:
> It seems like this is a way to accomplish what I was looking for:
>        CoreContainer coreContainer = new CoreContainer();
>        File home = new File("/home/max/packages/test/apache-solr-1.4.1/example/solr");
>        File f = new File(home, "solr.xml");
>
>        coreContainer.load("/home/max/packages/test/apache-solr-1.4.1/example/solr", f);
>
>        SolrCore core = coreContainer.getCore("newsblog");
>        IndexSchema schema = core.getSchema();
>        DocumentBuilder builder = new DocumentBuilder(schema);
>
>        // get a Lucene Doc
>        // Document d = ...
>
>        SolrDocument solrDocument = new SolrDocument();
>
>        builder.loadStoredFields(solrDocument, d);
>        logger.debug("Loaded stored date: " + solrDocument.getFieldValue("date_added_solr"));
>
> However, one thing that scares me is the warning message I get from the
> CoreContainer:
>     [java] Aug 25, 2010 10:25:23 PM org.apache.solr.update.SolrIndexWriter finalize
>     [java] SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
>
> I'm not sure what exactly triggers that but it's a result of the code I
> posted above.
>



-- 
Lance Norskog
goksron@gmail.com

Re: Duplicating a Solr Doc

Posted by Max Lynch <ih...@gmail.com>.
It seems like this is a way to accomplish what I was looking for:
        CoreContainer coreContainer = new CoreContainer();
        File home = new File("/home/max/packages/test/apache-solr-1.4.1/example/solr");
        File f = new File(home, "solr.xml");

        coreContainer.load("/home/max/packages/test/apache-solr-1.4.1/example/solr", f);

        SolrCore core = coreContainer.getCore("newsblog");
        IndexSchema schema = core.getSchema();
        DocumentBuilder builder = new DocumentBuilder(schema);

        // get a Lucene Doc
        // Document d = ...

        SolrDocument solrDocument = new SolrDocument();

        builder.loadStoredFields(solrDocument, d);
        logger.debug("Loaded stored date: " + solrDocument.getFieldValue("date_added_solr"));

However, one thing that scares me is the warning message I get from the
CoreContainer:
     [java] Aug 25, 2010 10:25:23 PM org.apache.solr.update.SolrIndexWriter finalize
     [java] SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!

I'm not sure what exactly triggers that but it's a result of the code I
posted above.
