Posted to solr-user@lucene.apache.org by Midas A <te...@gmail.com> on 2015/06/04 13:15:56 UTC

indexing issue

I have an indexing issue. While indexing, iowait is high on the Solr server,
and so is the load.

Re: indexing issue

Posted by Midas A <te...@gmail.com>.
Thanks for replying. Below is our commit frequency:

<autoCommit>
  <maxTime>60000</maxTime> <!-- currently 1 min, old value is 15000 -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>600000</maxTime> <!-- currently 10 min, old value is 500 -->
</autoSoftCommit>
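
To make the two intervals easier to compare, here is a minimal Python sketch that parses the settings quoted in this message and prints each maxTime in seconds. The wrapping <updateHandler> root element and the function name are assumptions made only so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# The commit settings quoted in this message, wrapped in a single root
# element (an assumption made only so the fragment parses on its own).
SNIPPET = """
<updateHandler>
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>600000</maxTime>
  </autoSoftCommit>
</updateHandler>
"""


def commit_intervals_seconds(xml_text):
    """Return {element name: maxTime in seconds} for each commit block."""
    root = ET.fromstring(xml_text)
    intervals = {}
    for name in ("autoCommit", "autoSoftCommit"):
        max_time = root.findtext(f"{name}/maxTime")
        if max_time is not None:
            # maxTime is expressed in milliseconds.
            intervals[name] = int(max_time) / 1000.0
    return intervals


if __name__ == "__main__":
    for name, seconds in commit_intervals_seconds(SNIPPET).items():
        print(f"{name}: every {seconds:.0f} s")
```

With these values a hard commit (without opening a searcher) happens at most once a minute, and a soft commit at most once every ten minutes.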


On Thu, Jun 4, 2015 at 4:49 PM, Toke Eskildsen <te...@statsbiblioteket.dk>
wrote:

> On Thu, 2015-06-04 at 16:45 +0530, Midas A wrote:
> > I have some indexing issue . While indexing IOwait is high in solr server
> > and load also.
>
> Might be because you commit too frequently. How often do you do that?
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
>

Re: indexing issue

Posted by Midas A <te...@gmail.com>.
Hi Alessandro,



On Thu, Jun 4, 2015 at 5:19 PM, Alessandro Benedetti <
benedetti.alex85@gmail.com> wrote:

> Honestly your auto-commit configuration seems not alarming at all!
> Can you give me more details regarding :
>
> Load expected : currently it is 7- 15 should be below 1
>
> What does this mean ? Without a unit of measure i find hard to understand
> plain numbers :)

[Abhishek]: That figure is the Solr server's load average.

> I was expecting the number of documents per unit of time you index, and an
> average size of these docs.

[Abhishek]: The average document size is 250 KB.

<autoCommit>
  <maxTime>60000</maxTime> <!-- currently 1 min, old value is 15000 -->
  <openSearcher>false</openSearcher>
</autoCommit>

We have not specified a maxDocs limit.

> Which kind of DIH processor ? Where is your data coming from ? A database ?

[Abhishek]: We are using a MySQL database and Solr's built-in DIH (Data
Import Handler).

> Let's try to improve the understanding of the situation and then evaluate
> an approach.
>
> Cheers
>

Re: indexing issue

Posted by Alessandro Benedetti <be...@gmail.com>.
Honestly, your auto-commit configuration does not seem alarming at all!
Can you give me more details regarding:

> Load expected : currently it is 7- 15 should be below 1

What does this mean? Without a unit of measure I find it hard to understand
plain numbers :)
I was expecting the number of documents you index per unit of time, and the
average size of these docs.
Which kind of DIH processor? Where is your data coming from? A database?

Let's try to improve our understanding of the situation and then evaluate
an approach.

Cheers

Re: indexing issue

Posted by Midas A <te...@gmail.com>.
Thanks Alessandro,

Please find the info inline.

Which version of Solr are you using: 4.2.1
Architecture: master-slave
Load expected: currently it is 7-15; it should be below 1
Indexing approach: using DIH
When does your problem happen: we run a delta import every 10 minutes and a
full index once a day. Sometimes the load goes up to 7-15.
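
For readers unfamiliar with DIH, the delta and full imports mentioned above are normally triggered with an HTTP request to the handler. The sketch below only builds such URLs; the host, port, core name and the /dataimport handler path are assumptions (the path depends on how the handler is registered in solrconfig.xml), and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def dih_url(base_url, core, command, **params):
    """Build a DataImportHandler request URL for the given command."""
    query = urlencode({"command": command, **params})
    # "/dataimport" is the conventional handler path; adjust it if the
    # handler is registered under a different name.
    return f"{base_url}/{core}/dataimport?{query}"


if __name__ == "__main__":
    # Host, port and core name are hypothetical.
    base = "http://localhost:8983/solr"
    print(dih_url(base, "mycore", "delta-import"))               # every 10 minutes
    print(dih_url(base, "mycore", "full-import", clean="true"))  # once a day
    print(dih_url(base, "mycore", "status"))                     # check progress
```

Polling the status command while an import runs is one way to see whether the load spikes line up with the delta imports or with the daily full import.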


On Thu, Jun 4, 2015 at 4:52 PM, Alessandro Benedetti <
benedetti.alex85@gmail.com> wrote:

> I think this mail is really poor in term of details.
> Which version of Solr are you using ?
> Architecture ?
> Load expected ?
> Indexing approach ?
> When does your problem happens ?
>
> More detail we give, easier will be to provide help.
>
> Cheers
>
> 2015-06-04 12:19 GMT+01:00 Toke Eskildsen <te...@statsbiblioteket.dk>:
>
> > On Thu, 2015-06-04 at 16:45 +0530, Midas A wrote:
> > > I have some indexing issue . While indexing IOwait is high in solr
> server
> > > and load also.
> >
> > Might be because you commit too frequently. How often do you do that?
> >
> > - Toke Eskildsen, State and University Library, Denmark
> >
> >
> >
>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>

Re: indexing issue

Posted by Alessandro Benedetti <be...@gmail.com>.
I think this mail is really poor in terms of details.
Which version of Solr are you using?
Architecture?
Load expected?
Indexing approach?
When does your problem happen?

The more detail you give, the easier it will be to provide help.

Cheers

2015-06-04 12:19 GMT+01:00 Toke Eskildsen <te...@statsbiblioteket.dk>:

> On Thu, 2015-06-04 at 16:45 +0530, Midas A wrote:
> > I have some indexing issue . While indexing IOwait is high in solr server
> > and load also.
>
> Might be because you commit too frequently. How often do you do that?
>
> - Toke Eskildsen, State and University Library, Denmark
>
>
>



Re: indexing issue

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Thu, 2015-06-04 at 16:45 +0530, Midas A wrote:
> I have some indexing issue . While indexing IOwait is high in solr server
> and load also.

Might be because you commit too frequently. How often do you do that?

- Toke Eskildsen, State and University Library, Denmark



Re: indexing issue

Posted by Midas A <te...@gmail.com>.
We are indexing around 50,000 docs per 10 minutes.
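
As a rough sanity check, that rate can be converted to per-second figures. The sketch below assumes the 250 KB average document size mentioned elsewhere in this thread and treats 1 MB as 1000 KB; the function name is illustrative only, and the result describes source data volume, not necessarily what Solr writes to disk.

```python
def indexing_rate(docs, window_seconds, avg_doc_kb):
    """Return (documents per second, megabytes of source data per second)."""
    docs_per_second = docs / window_seconds
    mb_per_second = docs_per_second * avg_doc_kb / 1000.0  # 1 MB = 1000 KB
    return docs_per_second, mb_per_second


if __name__ == "__main__":
    # 50,000 docs per 10 minutes, 250 KB average document size (assumed).
    dps, mbps = indexing_rate(docs=50_000, window_seconds=10 * 60, avg_doc_kb=250)
    print(f"{dps:.1f} docs/s, about {mbps:.1f} MB/s of source data")
```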

On Thu, Jun 4, 2015 at 11:02 PM, Midas A <te...@gmail.com> wrote:

> Shwan,
>
> Please find the log . give me some sense what is happening
>
> On Thu, Jun 4, 2015 at 10:56 PM, Shawn Heisey <ap...@elyograg.org> wrote:
>
>> On 6/4/2015 11:12 AM, Midas A wrote:
>> > sorry Shawn ,
>> >
>> > a) Total docs solr is handling is 3 million .
>> > b) index size is only 5 GB
>>
>> If your total index size is only 5GB, then there should be no need for a
>> 30GB heap.  For that much index, I'd start with 4GB, and implement GC
>> tuning.
>>
>> A high iowait doesn't make any sense for that situation, but it WOULD
>> make sense with 600 GB of total index.
>>
>> Thanks,
>> Shawn
>>
>>
>

Re: indexing issue

Posted by Midas A <te...@gmail.com>.
Shawn,

Please find the log. Could you give me some sense of what is happening?

On Thu, Jun 4, 2015 at 10:56 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 6/4/2015 11:12 AM, Midas A wrote:
> > sorry Shawn ,
> >
> > a) Total docs solr is handling is 3 million .
> > b) index size is only 5 GB
>
> If your total index size is only 5GB, then there should be no need for a
> 30GB heap.  For that much index, I'd start with 4GB, and implement GC
> tuning.
>
> A high iowait doesn't make any sense for that situation, but it WOULD
> make sense with 600 GB of total index.
>
> Thanks,
> Shawn
>
>

Re: indexing issue

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/4/2015 11:12 AM, Midas A wrote:
> sorry Shawn ,
>
> a) Total docs solr is handling is 3 million .
> b) index size is only 5 GB

If your total index size is only 5GB, then there should be no need for a
30GB heap.  For that much index, I'd start with 4GB, and implement GC
tuning.

A high iowait doesn't make any sense for that situation, but it WOULD
make sense with 600 GB of total index.

Thanks,
Shawn


Re: indexing issue

Posted by Midas A <te...@gmail.com>.
Sorry Shawn,

a) The total number of docs Solr is handling is 3 million.
b) The index size is only 5 GB.



On Thu, Jun 4, 2015 at 9:35 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 6/4/2015 7:38 AM, Midas A wrote:
> > On Thu, Jun 4, 2015 at 6:48 PM, Shawn Heisey <ap...@elyograg.org>
> wrote:
> >
> >> On 6/4/2015 5:15 AM, Midas A wrote:
> >>> I have some indexing issue . While indexing IOwait is high in solr
> server
> >>> and load also.
> >> My first suspect here is that you don't have enough RAM for your index
> >> size.
> >>
> >> * How many total docs is Solr handling (all cores)?
> >>
> >      --30,00000 dos
> >
> >> * What is the total size on disk of all your cores?
> >>
> >      --  600 GB
> >
> >> * How much RAM does the machine have?
> >>
> >      --48 GB
> >
> >> * What is the java max heap?
> >> --30 GB(jvm)
>
> Is that 3 million docs or 30 million docs?  The actual numbers are 3
> million, but you put a single comma in the number after the 30, so I am
> not sure which you meant.  Either way, those documents must be quite
> large, to make a 600GB index.  30 million docs in my index would only be
> about 30GB.
>
> With 48 GB of RAM, 30 GB allocated to Solr, and a 600GB index, you don't
> have anywhere even close to enough RAM to cache your index effectively.
> There's only 18GB of RAM left over for the OS disk cache.  That's only 3
> percent of the index data that can fit in the OS disk cache.  I would
> imagine that you're going to need to be able to fit somewhere between 25
> and 50 percent of the index into RAM, which would mean that you're going
> to want around 256GB of RAM for that index. 128GB *might* be enough.
> Alternatively, you could work on making your index smaller -- but be
> aware that to improve performance with low memory, you need to reduce
> the *indexed* part, the *stored* part makes little difference.
>
> Another potential problem with a 30GB heap is related to garbage
> collection tuning.  If you haven't tuned your GC at all, then
> performance will be terrible on a heap that large, especially when you
> are indexing.  The wiki page I linked on my previous reply contains a
> link to my personal page, which covers GC tuning:
>
> https://wiki.apache.org/solr/ShawnHeisey
>
> Thanks,
> Shawn
>
>

Re: indexing issue

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/4/2015 7:38 AM, Midas A wrote:
> On Thu, Jun 4, 2015 at 6:48 PM, Shawn Heisey <ap...@elyograg.org> wrote:
>
>> On 6/4/2015 5:15 AM, Midas A wrote:
>>> I have some indexing issue . While indexing IOwait is high in solr server
>>> and load also.
>> My first suspect here is that you don't have enough RAM for your index
>> size.
>>
>> * How many total docs is Solr handling (all cores)?
>>
>      --30,00000 dos
>
>> * What is the total size on disk of all your cores?
>>
>      --  600 GB
>
>> * How much RAM does the machine have?
>>
>      --48 GB
>
>> * What is the java max heap?
>> --30 GB(jvm)

Is that 3 million docs or 30 million docs?  The digits as written add up to 3
million, but you put a single comma in the number after the 30, so I am
not sure which you meant.  Either way, those documents must be quite
large, to make a 600GB index.  30 million docs in my index would only be
about 30GB.

With 48 GB of RAM, 30 GB allocated to Solr, and a 600GB index, you don't
have anywhere even close to enough RAM to cache your index effectively. 
There's only 18GB of RAM left over for the OS disk cache.  That's only 3
percent of the index data that can fit in the OS disk cache.  I would
imagine that you're going to need to be able to fit somewhere between 25
and 50 percent of the index into RAM, which would mean that you're going
to want around 256GB of RAM for that index. 128GB *might* be enough. 
Alternatively, you could work on making your index smaller -- but be
aware that to improve performance with low memory, you need to reduce
the *indexed* part; the *stored* part makes little difference.
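
The arithmetic in the paragraph above can be reproduced with a short Python sketch. It assumes that all RAM not given to the Java heap is available to the OS disk cache, which ignores memory used by other processes, and the function names are illustrative. The 25 to 50 percent range is the rough estimate from this message, not a hard rule.

```python
def disk_cache_fraction(ram_gb, heap_gb, index_gb):
    """Return (GB left for the OS disk cache, fraction of the index it can hold)."""
    cache_gb = ram_gb - heap_gb
    return cache_gb, cache_gb / index_gb


def ram_for_fraction(index_gb, heap_gb, fraction):
    """Return the total RAM needed to cache `fraction` of the index plus the heap."""
    return index_gb * fraction + heap_gb


if __name__ == "__main__":
    cache_gb, fraction = disk_cache_fraction(ram_gb=48, heap_gb=30, index_gb=600)
    print(f"{cache_gb} GB of disk cache covers {fraction:.0%} of the index")
    for target in (0.25, 0.50):
        needed = ram_for_fraction(index_gb=600, heap_gb=30, fraction=target)
        print(f"caching {target:.0%} of the index needs about {needed:.0f} GB of RAM")
```

Running the same function with the 5 GB index size given elsewhere in this thread shows that the whole index would fit in the disk cache several times over.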

Another potential problem with a 30GB heap is related to garbage
collection tuning.  If you haven't tuned your GC at all, then
performance will be terrible on a heap that large, especially when you
are indexing.  The wiki page I linked on my previous reply contains a
link to my personal page, which covers GC tuning:

https://wiki.apache.org/solr/ShawnHeisey

Thanks,
Shawn


Re: indexing issue

Posted by Midas A <te...@gmail.com>.
Hi Shawn,

Please find my comments inline.

On Thu, Jun 4, 2015 at 6:48 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 6/4/2015 5:15 AM, Midas A wrote:
> > I have some indexing issue . While indexing IOwait is high in solr server
> > and load also.
>
> My first suspect here is that you don't have enough RAM for your index
> size.
>
> * How many total docs is Solr handling (all cores)?

     --30,00000 docs

> * What is the total size on disk of all your cores?

     --  600 GB

> * How much RAM does the machine have?

     --48 GB

> * What is the java max heap?

     --30 GB (JVM)

> Here is some additional information on memory requirements for Solr:
>
> https://wiki.apache.org/solr/SolrPerformanceProblems#RAM
>
> When Alessandro asked about the load on Solr, the hope was to find out
> your *rate* of indexing and querying, not the load average from the
> operating system.  Indexing requires a fair amount of heap memory and
> CPU resources.  If your heap is too small, then Java might have to work
> extremely hard to free up memory for normal operation.
>
> Thanks,
> Shawn
>
>

Re: indexing issue

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/4/2015 5:15 AM, Midas A wrote:
> I have some indexing issue . While indexing IOwait is high in solr server
> and load also.

My first suspect here is that you don't have enough RAM for your index size.

* How many total docs is Solr handling (all cores)?
* What is the total size on disk of all your cores?
* How much RAM does the machine have?
* What is the java max heap?

Here is some additional information on memory requirements for Solr:

https://wiki.apache.org/solr/SolrPerformanceProblems#RAM

When Alessandro asked about the load on Solr, the hope was to find out
your *rate* of indexing and querying, not the load average from the
operating system.  Indexing requires a fair amount of heap memory and
CPU resources.  If your heap is too small, then Java might have to work
extremely hard to free up memory for normal operation.

Thanks,
Shawn