Posted to solr-user@lucene.apache.org by John Milton <jo...@gmail.com> on 2019/01/01 15:59:23 UTC

Improve indexing speed?

Hi to all,

My documents contain 65 fields each, and all of the fields need to be indexed.
However, indexing 100 documents takes 10 seconds.
I am using Solr 7.5 (2 cloud instances) with 50 shards.
It is running on Windows with 32 GB of RAM and a 15 GB Java heap.
How can I improve the indexing speed?
Note:
Each field contains at most 20 characters. The field type is text_general,
which is case-insensitive.

Thanks,
John Milton

Re: Improve indexing speed?

Posted by Erick Erickson <er...@gmail.com>.
What have you tried? The first thing I'd try is using just 1 or 2
shards. My first guess is that you're doing a lot of GC because you
have 50 shards in a single JVM (1 replica/shard?).
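
A minimal sketch of how a small test collection for that experiment could be requested through the Collections API. The base URL and the collection name "speedtest" are placeholders, not details from this thread, and the code only builds the URL without sending it.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards=2, replication_factor=1):
    """Build a Collections API CREATE URL; nothing is sent to Solr here."""
    params = {
        "action": "CREATE",
        "name": name,                          # hypothetical collection name
        "numShards": num_shards,               # 1 or 2 instead of 50
        "replicationFactor": replication_factor,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Placeholder host and port; adjust to the actual Solr node.
url = create_collection_url("http://localhost:8983/solr", "speedtest")
print(url)
```

Indexing the same 100 documents into such a collection and comparing the timing against the 50-shard collection would show whether the shard count is the bottleneck.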

I regularly get several thousand Wikipedia docs/second on my MacBook
Pro, so your numbers are way out of the norm.

Best,
Erick


Re: Improve indexing speed?

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/1/2019 8:59 AM, John Milton wrote:
> My documents contain 65 fields each, and all of the fields need to be indexed.
> However, indexing 100 documents takes 10 seconds.
> I am using Solr 7.5 (2 cloud instances) with 50 shards.

The best way to achieve fast indexing in Solr is to index multiple items 
in parallel.  That is, make your indexing system multi-threaded or 
multi-process.
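
A rough sketch of the multi-threaded approach, assuming the documents are already in memory as a list of dicts. The names `chunked`, `index_in_parallel`, and `send_batch` are made up for illustration; `send_batch` stands for whatever function actually posts one batch to Solr and returns how many documents it sent.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(docs, size):
    """Yield consecutive slices of docs, each with at most `size` items."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def index_in_parallel(docs, send_batch, batch_size=100, workers=4):
    """Send batches concurrently and return the total number of docs sent.

    send_batch is a caller-supplied callable (hypothetical here) that takes
    a list of documents and returns the number it indexed.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_batch, chunked(docs, batch_size)))
```

The batch size and worker count are arbitrary starting points; the right values depend on the hardware and are best found by measuring.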

As Erick also asked ... why do you have so many shards?  The only good 
reason I can imagine for so many shards is a need to handle billions of 
documents.

Thanks,
Shawn


Re: Improve indexing speed?

Posted by Hendrik Haddorp <he...@gmx.net>.
How are you indexing the documents? Are you using SolrJ or the plain
REST API?
Are you sending the documents one by one or all in one request?
Performance is far better if you send the 100 documents in one request.
If you send them individually, are you doing any commits between them?
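
To illustrate the single-request case, here is a minimal sketch over the plain HTTP API using only the Python standard library. The host, the collection name "speedtest", and the use of `commitWithin` in place of explicit commits are assumptions for the example, not details from the original post.

```python
import json
from urllib.request import Request, urlopen


def build_batch_update(solr_url, collection, docs, commit_within_ms=10000):
    """Build one POST that adds all docs at once, with no per-document commit."""
    url = (f"{solr_url.rstrip('/')}/{collection}/update"
           f"?commitWithin={commit_within_ms}")
    body = json.dumps(docs).encode("utf-8")    # JSON array of documents
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def send(request):
    """Send the request and return the HTTP status code."""
    with urlopen(request) as response:
        return response.status


# Placeholder documents; the real ones would carry all 65 fields.
docs = [{"id": str(i)} for i in range(100)]
request = build_batch_update("http://localhost:8983/solr", "speedtest", docs)
# send(request)  # uncomment once a Solr node is reachable at that URL
```

With `commitWithin`, Solr schedules the commit itself, so the client does not need to commit after each document or batch.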

regards,
Hendrik
