Posted to solr-user@lucene.apache.org by swapna_here <sw...@gmail.com> on 2009/09/30 10:17:05 UTC

delay while adding document to solr index

hi all,

I have indexed 100,000 documents (around 5,000 documents are indexed daily,
one at a time, to Solr).
At the same time, around 2,000 indexed documents (added 30 days earlier) are
deleted daily using deleteByQuery in SolrJ.
Previously each document was indexed within 5 ms,
but recently I am seeing a delay (sometimes 2 to 10 minutes) while adding a
document to the index.
My index folder has also grown to 625 MB, which is very large.
Previously it was around 230 MB.

My questions are:

1) Is Solr not permanently deleting the older documents (added 30 days back)
from the index, even after committing?

2) Why has the index size increased?

3) What is the reason for the delay (2 to 10 minutes) while adding documents
one at a time to the index?

Help is appreciated

Thanks in advance..

-- 
View this message in context: http://www.nabble.com/delay-while-adding-document-to-solr-index-tp25676777p25676777.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: delay while adding document to solr index

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Thu, Oct 8, 2009 at 1:58 AM, swapna_here <sw...@gmail.com> wrote:
> i don't understand why my solr index increasing daily
> when i am adding and deleting the same number of documents daily

A delete is just a bit flip, and does not reclaim disk space immediately.
Deleted documents are squeezed out when segment merges happen
(including an optimize which merges all segments).
If you have large segments that documents are deleted from, those
segments may not be involved in a merge and hence the deleted docs can
hang around for quite some time.
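
A toy model of the behaviour described above may make it easier to see. It is
only a sketch in plain Java, not Lucene's actual data structures: the names
SegmentSketch, Segment, diskUnits and merge are made up for illustration, and
disk usage is assumed to be one unit per stored document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Toy model only; every name here is hypothetical and unrelated to Lucene's real classes.
class SegmentSketch {

    // A segment stores a fixed number of documents plus one "deleted" bit per document.
    static final class Segment {
        final int docCount;
        final BitSet deleted = new BitSet();

        Segment(int docCount) {
            this.docCount = docCount;
        }

        // Deleting only flips a bit; the stored document stays where it is.
        void delete(int docId) {
            if (docId < 0 || docId >= docCount) {
                throw new IndexOutOfBoundsException("docId " + docId);
            }
            deleted.set(docId);
        }

        int liveDocs() {
            return docCount - deleted.cardinality();
        }
    }

    // Assumption: disk usage is one unit per stored document, live or deleted.
    static int diskUnits(List<Segment> segments) {
        int total = 0;
        for (Segment s : segments) {
            total += s.docCount;
        }
        return total;
    }

    // A merge copies only the live documents into a fresh segment.
    static Segment merge(List<Segment> inputs) {
        int live = 0;
        for (Segment s : inputs) {
            live += s.liveDocs();
        }
        return new Segment(live);
    }

    public static void main(String[] args) {
        Segment big = new Segment(1000);
        Segment small1 = new Segment(10);
        Segment small2 = new Segment(10);
        for (int id = 0; id < 400; id++) {
            big.delete(id);
        }

        List<Segment> index = new ArrayList<>(Arrays.asList(big, small1, small2));
        System.out.println("after deletes: " + diskUnits(index));

        // A merge that only touches the small segments leaves the big one as it was.
        Segment mergedSmall = merge(Arrays.asList(small1, small2));
        index = new ArrayList<>(Arrays.asList(big, mergedSmall));
        System.out.println("after small merge: " + diskUnits(index));

        // Merging everything (what an optimize does) drops the deleted documents.
        index = new ArrayList<>(Arrays.asList(merge(index)));
        System.out.println("after full merge: " + diskUnits(index));
    }
}
```

In this model the size is 1020 units after the deletes, still 1020 after
merging only the two small segments, and 620 once everything is merged.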

-Yonik
http://www.lucidimagination.com





Re: delay while adding document to solr index

Posted by swapna_here <sw...@gmail.com>.
Thanks for your reply, and sorry for the delay.

As you said, I have removed the commit after adding each single document and
set autocommit to
      <maxDocs>200</maxDocs>
      <maxTime>10000</maxTime>

After setting this, when I ran optimize() manually the size decreased to
350 MB (100,000 docs) from 638 MB (100,000 docs).

I think this happened because I ran optimize for the first time on index
data that was set up 4 months back.

This worked great, but after one week the index size had reached 504 MB
(100,000 docs) again.

I don't understand why my Solr index is growing daily
when I am adding and deleting the same number of documents daily.

I run org.apache.solr.client.solrj.SolrServer.optimize() manually four times
a day.

Is this not the right way to run optimize? If it is not, what is the correct
procedure to run optimize?

thanks in advance :)


Re: delay while adding document to solr index

Posted by Jérôme Etévé <je...@gmail.com>.
Hi,

- Try to let Solr do the commits for you by setting up the autocommit
feature, and stop committing after every single document. This
should greatly improve the delays you're experiencing.

- If you do not optimize, it's normal that your index size only grows.
Optimize regularly, at a time when your load is minimal.
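
To give a feel for why batching commits helps, here is a small sketch in plain
Java. It is a simplified model and not how Solr implements autocommit: the
class CommitPolicySketch and its methods are hypothetical, and the limits of
200 documents and 10000 ms are simply the values mentioned elsewhere in this
thread. It only counts how many commits would be triggered.

```java
// Simplified, hypothetical model of a "commit after N docs or T milliseconds" policy.
class CommitPolicySketch {
    private final int maxDocs;
    private final long maxTimeMs;
    private int pendingDocs = 0;
    private long firstPendingAtMs = -1;
    private int commits = 0;

    CommitPolicySketch(int maxDocs, long maxTimeMs) {
        if (maxDocs <= 0 || maxTimeMs <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxDocs = maxDocs;
        this.maxTimeMs = maxTimeMs;
    }

    // Record one added document at the given clock time; returns true if a commit fired.
    boolean onAdd(long nowMs) {
        if (pendingDocs == 0) {
            firstPendingAtMs = nowMs;
        }
        pendingDocs++;
        return maybeCommit(nowMs);
    }

    // Commit when enough documents are pending or the oldest pending one is too old.
    boolean maybeCommit(long nowMs) {
        boolean due = pendingDocs >= maxDocs
                || (pendingDocs > 0 && nowMs - firstPendingAtMs >= maxTimeMs);
        if (due) {
            commits++;
            pendingDocs = 0;
            firstPendingAtMs = -1;
        }
        return due;
    }

    int commits() {
        return commits;
    }

    public static void main(String[] args) {
        CommitPolicySketch policy = new CommitPolicySketch(200, 10_000);
        // Assumption for the demo: 5000 documents arrive one millisecond apart.
        for (int i = 0; i < 5000; i++) {
            policy.onAdd(i);
        }
        System.out.println("commits with batching: " + policy.commits());
        System.out.println("commits with one per document: 5000");
    }
}
```

Under these assumptions the 5000 daily adds trigger 25 commits instead of
5000, which is the kind of reduction the autocommit setting is meant to give.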

Jerome.


-- 
Jerome Eteve.
http://www.eteve.net
jerome@eteve.net

Re: delay while adding document to solr index

Posted by swapna_here <sw...@gmail.com>.
Thanks again for your immediate response.

Yes, I am running a commit after each document is indexed.

What I don't understand is why my index size has increased to 625 MB (for the
100,000 documents) when it was previously 250 MB.
Is this because I have never optimized my index, or because I am adding
documents individually?

I need a solution for this urgently.
Thanks a lot


Re: delay while adding document to solr index

Posted by Pravin Paratey <pr...@gmail.com>.
Swapna

While the disk space does increase during the process of optimization,
it should almost always return to the original size or slightly less.

This is a silly question. But off the top of my head, I can't think of
any other reason why the index size would increase - Are you running a
<commit/> after adding documents?

If you are, you might want to compare the size of each document being
currently indexed with the ones you indexed a few months back.

To optimize the index, simply post <optimize/> to Solr. Or read
[http://wiki.apache.org/solr/SolrOperationsTools]
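
As a rough sketch of what "posting <optimize/>" can look like from Java
without SolrJ, the snippet below builds the HTTP request with the JDK's
java.net.http API (Java 11 or later). The URL http://localhost:8983/solr/update
is an assumption about a default local install, and OptimizeRequestSketch and
updateMessage are made-up names; adjust both to your setup.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) an XML update message for Solr.
class OptimizeRequestSketch {

    static HttpRequest updateMessage(String updateUrl, String xmlBody) {
        return HttpRequest.newBuilder(URI.create(updateUrl))
                .header("Content-Type", "text/xml; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(xmlBody))
                .build();
    }

    public static void main(String[] args) {
        // Assumed update URL; change host, port and path to match your installation.
        HttpRequest request = updateMessage("http://localhost:8983/solr/update", "<optimize/>");
        System.out.println(request.method() + " " + request.uri());

        // Sending it needs a running Solr, so it is left commented out here:
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```

The same helper could carry <commit/> instead of <optimize/> as the body.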

Pravin


Re: delay while adding document to solr index

Posted by swapna_here <sw...@gmail.com>.
Thanks for your reply.
I have not optimized at all.
My understanding is that optimize improves query performance but takes
more disk space;
beyond that I have no idea how to use it.

Previously, for 100,000 documents, the size occupied was around 250 MB.

But after 2 months it is 625 MB.

Why did this happen?
Is it because I have not optimized the index?
Can anybody tell me when and how to optimize the index (with configuration
details)?


Re: delay while adding document to solr index

Posted by Pravin Paratey <pr...@gmail.com>.
Also, what is your merge factor set to?

Pravin


Re: delay while adding document to solr index

Posted by Pravin Paratey <pr...@gmail.com>.
Swapna,

Your answers are inline.

2009/9/30 swapna_here <sw...@gmail.com>:
>
> hi all,
>
> I have indexed 100000 documents (daily around 5000 documents will be indexed
> one at a time to solr)
> at the same time daily few(around 2000) indexed documents (added 30 days
> back) will be deleted using DeleteByQuery of SolrJ
> Previously each document used to be indexed within 5ms..
> but recently i am facing a delay (sometimes 2min to 10 min) while adding
> document to index.
> And my index (folder) size is also increased to 625MB which is very large
> Previously it was around 230MB
>
> My Questions are:
>
> 1) is solr not deleting the older documents(added 30 days back) permanently
> from index even after committing

Have you run optimize?

> 2)Why the index size is increased

If 5000 docs are added daily and only 2000 deleted, the index size
would increase because of the remaining 3000 documents.

> 3)reason for delay (2min to 10 mins) while adding the document one at a time
> to index

I don't know why this would happen. Is your disk nearly full? Which OS
are you running on? What is the configuration of Solr?

> Help is appreciated
>
> Thanks in advance..
>

Hope this helps
Pravin