Posted to java-user@lucene.apache.org by Bingtao Yin <yb...@gmail.com> on 2017/12/03 02:42:25 UTC

Re: Optimize FTS memory footprint

Thanks, Mike. I'm facing a similar problem.
I'm running an Elasticsearch 2.0 cluster and find that the FST of the _uid
field takes a lot of memory. The _uid field is generated by Elasticsearch,
is not analyzed, and has high cardinality.
Is there any way to reduce the memory cost of the _uid field? Thanks.


2017-11-29 5:47 GMT+08:00 elirev <el...@gmail.com>:

> Thanks, Mike.
> I did not find any clear way to tell whether it is the FST, norms, or
> something else (unless I missed something). The fact that the FST is an
> in-memory prefix index leads me to think it is using most of the heap.
> Our mapping is ordinary, with around 200 columns; one of the columns is a
> nested object with a limited number of objects (up to 4 instances). We
> use monthly indices (keeping 6 months open). In the last month I saw a
> dramatic extra allocation in segment memory (around 30%, where a
> regular month is around 5%). The only change I see is that the nested
> object now includes 8 instances on average, which increases the number of
> hidden documents in the index (to more than twice as many).
> When we optimized the index, the allocated memory was reduced (we
> saw it only after a rolling restart of the nodes).
>
> If you don't mind, I have a few questions:
> 1) Do you know of a way to figure out which component is taking all
> this memory?
> 2) Do you see a relation between the increase in nested objects and
> the extra memory allocation we have?
> 3) Is FST memory usage affected by optimizing the problematic index,
> and why do we see it only after restarting the ES service?
>
> Thanks, Mike
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Optimize FTS memory footprint

Posted by Michael McCandless <lu...@mikemccandless.com>.
Try upgrading Elasticsearch: the 6.0 release came out just a few weeks ago,
and its (and Lucene's) memory usage has decreased over time.

The _uid field in particular will always be costly, unfortunately.  Since
it's a primary key, every term will be unique, and the term index has to
work hard to store all the prefixes for those keys.
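
A toy illustration of that cost, as a minimal sketch: the class below is a
plain character trie, not Lucene's actual term index, and the class name,
field values, and document count are all made up for the example. It only
shows that a field whose terms are all unique keeps adding prefixes as
documents arrive, while a low-cardinality field stops growing once its few
distinct terms are stored.

```java
import java.util.HashMap;
import java.util.Map;

// Toy character trie; a stand-in for illustration only, not Lucene's term index.
final class PrefixCountSketch {
    private final Map<Character, PrefixCountSketch> children = new HashMap<>();

    // Adds a term and returns how many new nodes (distinct prefixes) it created.
    int add(String term) {
        PrefixCountSketch node = this;
        int created = 0;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            PrefixCountSketch next = node.children.get(c);
            if (next == null) {
                next = new PrefixCountSketch();
                node.children.put(c, next);
                created++;
            }
            node = next;
        }
        return created;
    }

    // Counts all nodes below this one, i.e. the distinct non-empty prefixes stored.
    int size() {
        int n = 0;
        for (PrefixCountSketch child : children.values()) {
            n += 1 + child.size();
        }
        return n;
    }

    public static void main(String[] args) {
        PrefixCountSketch status = new PrefixCountSketch();
        PrefixCountSketch ids = new PrefixCountSketch();
        for (int doc = 0; doc < 1000; doc++) {
            status.add(doc % 2 == 0 ? "open" : "closed"); // two distinct terms
            ids.add(String.format("doc#%04d", doc));      // one unique term per document
        }
        System.out.println("low-cardinality field prefixes: " + status.size());
        System.out.println("primary-key field prefixes:     " + ids.size());
    }
}
```

Re-adding a term that is already present creates no new nodes, which is why
the low-cardinality field stays small no matter how many documents use it.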

Mike McCandless

http://blog.mikemccandless.com

Re: Optimize FTS memory footprint

Posted by Bingtao Yin <yb...@gmail.com>.
Hi elirev,

The field "index" of the class
"org.apache.lucene.codecs.blocktree.FieldReader" holds the FST of each
field; its type is FST<BytesRef>. I closed an index and picked a shard, then
wrote some code to read the shard directly and used reflection to get the
actual FST object of the _uid field. Its ramBytesUsed() method returns the
memory cost of the FST.
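
The reflection step described above might look roughly like the sketch
below. To keep it self-contained and runnable without Lucene on the
classpath, FakeFieldReader and FakeFst are hypothetical stand-ins that only
mimic the two things named in this message: a field called "index" (assumed
non-public, since reflection was needed) and a ramBytesUsed() method on the
object it holds. Opening the shard and obtaining the real FieldReader for
_uid is not shown.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

final class FstSizeSketch {

    // Hypothetical stand-in for the FST; only the size accessor is modeled.
    static final class FakeFst {
        private final long bytes;
        FakeFst(long bytes) { this.bytes = bytes; }
        public long ramBytesUsed() { return bytes; }
    }

    // Hypothetical stand-in for FieldReader with a non-public "index" field.
    static final class FakeFieldReader {
        private final FakeFst index;
        FakeFieldReader(FakeFst index) { this.index = index; }
    }

    // Reads the field named "index" from the reader via reflection and
    // returns the result of calling ramBytesUsed() on the object it holds.
    static long indexRamBytesUsed(Object reader) {
        try {
            Field field = reader.getClass().getDeclaredField("index");
            field.setAccessible(true); // the field is not public
            Object fst = field.get(reader);
            if (fst == null) {
                return 0L; // no index object present for this field
            }
            Method method = fst.getClass().getMethod("ramBytesUsed");
            method.setAccessible(true);
            return ((Number) method.invoke(fst)).longValue();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not read index size", e);
        }
    }

    public static void main(String[] args) {
        FakeFieldReader uid = new FakeFieldReader(new FakeFst(4096L));
        System.out.println("_uid index bytes: " + indexRamBytesUsed(uid));
    }
}
```

The helper takes a plain Object so that, with Lucene on the classpath, it
could presumably be handed the real reader instance instead of the stand-in.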

2017-12-12 1:05 GMT+08:00 elirev <el...@gmail.com>:

> Hi Yin,
> How do you determine the size being allocated for your _uid?

Re: Optimize FTS memory footprint

Posted by elirev <el...@gmail.com>.
Hi Yin,
How do you determine the size being allocated for your _uid?


