Posted to solr-user@lucene.apache.org by Pablo Ferrari <pa...@gmail.com> on 2009/11/16 20:49:47 UTC

PhP, Solr and Delta Imports

Hello,

I have a working Solr service based on full imports, connected via PHP to a
Zend Framework MVC application (I connect it directly to the Controller).
I use the SolrClient class for PHP, which is great:
http://www.php.net/manual/en/class.solrclient.php

For now, every time I want to edit a document I have to do a full import
again, or delete the document by its id and add it again with the updated
info...
Can anyone guide me a bit on how to do delta imports? If it's via PHP, even
better!

Thanks in advance,

Pablo Ferrari
Tinkerlabs.net

Re: PhP, Solr and Delta Imports

Posted by Israel Ekpo <is...@gmail.com>.

Hello Pablo,

You have a couple of options, and you do not have to re-import the data for
the entire index.

My examples below use 'doc_id' as the uniqueKey field in your schema and
assume that it is an integer type.

1. You can remove the document from the index by query or by id (assuming
you have its id or uniqueKey field) if you want to just take it out of the
active index.

$client = new SolrClient($options);

$client->deleteById(400); // I recommend this one

OR

$client->deleteByQuery('doc_id:400'); // This should work too.

2. If all you want to do is replace/update an existing document and keep it
active in the index, you can build a SolrInputDocument object and submit
just that document using the SolrClient. Solr replaces the existing document
that has the same uniqueKey, so include every field of the document, not
only the ones that changed.

$client = new SolrClient($options);

$doc = new SolrInputDocument();

$doc->addField('doc_id', 334455);
$doc->addField('other_field', 'Other Field Value');
$doc->addField('another_field', 'Another Field Value');

$updateResponse = $client->addDocument($doc);

If your changes are coming from the db, it would be helpful to have a
timestamp column that changes each time the record is modified.

Then you can keep track of when the last indexing run was done and, the next
time, retrieve only 'active' documents that have been modified or created
since that run. For each of those records you can send a SolrInputDocument
to the Solr index using the SolrClient object as shown above.
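
As a rough sketch of that timestamp approach (not a definitive
implementation): the 'documents' table, its columns (title, body, is_active,
modified_at) and the state file are all hypothetical names, and it assumes
PHP and the database agree on time zone and timestamp format.

```php
<?php
// Sketch only: table/column names and the state file are hypothetical.

/** Read the time of the previous indexing run, or the Unix epoch on the first run. */
function readLastIndexedAt(string $stateFile): string
{
    if (!is_file($stateFile)) {
        return '1970-01-01 00:00:00';
    }
    return trim((string) file_get_contents($stateFile));
}

/** Fetch only active rows modified or created after the previous run. */
function fetchChangedRows(PDO $db, string $since): array
{
    $stmt = $db->prepare(
        'SELECT doc_id, title, body FROM documents
          WHERE is_active = 1 AND modified_at > :since'
    );
    $stmt->execute([':since' => $since]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

/** Index the changed rows, commit once, then remember when this run started. */
function runDeltaImport(PDO $db, SolrClient $client, string $stateFile): int
{
    $since = readLastIndexedAt($stateFile);
    // Taken before the query so edits made during this run are picked up next time.
    $startedAt = date('Y-m-d H:i:s');

    $rows = fetchChangedRows($db, $since);
    foreach ($rows as $row) {
        $doc = new SolrInputDocument();
        foreach ($row as $field => $value) {
            if ($value !== null) {
                $doc->addField($field, (string) $value);
            }
        }
        // A document with an existing doc_id replaces the one already indexed.
        $client->addDocument($doc);
    }

    if (count($rows) > 0) {
        $client->commit();
    }
    file_put_contents($stateFile, $startedAt);
    return count($rows);
}
```

You would call runDeltaImport($db, $client, '/some/path/last_index.txt')
from a cron job or after an edit in your Controller. Records that became
inactive are not handled here; they would still need a delete as in option 1.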

Do not forget to save the changes to the index with a call to
SolrClient::commit().

If you are updating a lot of records, I would recommend waiting till the end
to do the commit (and the optimize call, if needed).
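
A minimal sketch of the "commit at the end" idea, assuming $docs is an array
of SolrInputDocument objects you have already built. The function name and
the batch size of 500 are illustrative choices, not tuned recommendations.

```php
<?php
/**
 * Send documents in chunks and commit once at the end.
 * Returns the number of batches that were sent.
 */
function indexInBatches(
    SolrClient $client,
    array $docs,
    int $batchSize = 500,
    bool $optimize = false
): int {
    $batches = array_chunk($docs, $batchSize);
    foreach ($batches as $batch) {
        // One request per batch instead of one request per document.
        $client->addDocuments($batch);
    }

    if (count($docs) > 0) {
        // A single commit makes all of the batches visible to searches.
        $client->commit();
        if ($optimize) {
            $client->optimize();
        }
    }
    return count($batches);
}
```

Sending the documents in chunks keeps each request to a manageable size,
while the single commit avoids reopening the index after every document.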

More examples are available here:

http://us2.php.net/manual/en/solr.examples.php
