Posted to java-user@lucene.apache.org by Bingtao Yin <yb...@gmail.com> on 2017/12/03 02:42:25 UTC
Re: Optimize FTS memory footprint
Thanks, Mike. I'm facing a similar problem.
I'm running an Elasticsearch 2.0 cluster and find that the FST of the _uid
field takes a lot of memory. The _uid field is generated by Elasticsearch, is
not analyzed, and has high cardinality.
Is there any way to reduce the memory cost of the _uid field? Thanks.
2017-11-29 5:47 GMT+08:00 elirev <el...@gmail.com>:
> Thanks Mike.
> I did not find any clear way to tell whether it is the FST, the norms, or
> something else (unless I missed something). The fact that the FST is an
> in-memory prefix index leads me to think it is using most of the heap.
> Our mapping is ordinary, with around 200 columns. One of the columns is a
> nested object with a limited number of objects (up to 4 instances). We
> use monthly indices (keeping 6 months open). In the last month I see a
> dramatic extra allocation in segment memory (around 30%, where in a
> regular month it is around 5%). The only change I see is that the nested
> object now includes 8 instances on average, which increases the number of
> hidden documents we now have in the index (more than double).
> When we optimized the index, the allocated memory was reduced (we
> saw it only after a rolling restart of the nodes).
>
> If you don't mind, I have a few questions:
> 1) Do you know of a way to figure out which component is taking all
> this memory?
> 2) Do you see a relation between the increase in nested objects and
> the extra memory allocation we have?
> 3) Is FST memory usage affected by optimizing the problematic index,
> and why do we see it only after restarting the ES service?
>
> Thanks Mike
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Users-
> f532864.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Optimize FTS memory footprint
Posted by Michael McCandless <lu...@mikemccandless.com>.
Try upgrading Elasticsearch: 6.0 was released just a few weeks ago, and
its (and Lucene's) memory usage has decreased over time.
The _uid field in particular will always be costly, unfortunately. Since
it's a primary key, every term will be unique, and the term index has to
work hard to store all the prefixes for those keys.
Mike McCandless
http://blog.mikemccandless.com
Re: Optimize FTS memory footprint
Posted by Bingtao Yin <yb...@gmail.com>.
Hi elirev,
The field "index" of class "org.apache.lucene.codecs.blocktree.FieldReader"
is the FST of each field; its type is FST<BytesRef>. I closed an index,
picked a shard, and wrote some code to read the shard directly, then used
reflection to get the actual FST object of the _uid field. Its ramBytesUsed()
method returns the memory cost of the FST.
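
The reflection step described above can be sketched roughly as follows. This
is only an illustration, not the code actually used: StandInFieldReader and
StandInFst are hypothetical stand-ins so the example runs without Lucene on
the classpath. The only details taken from the message are the field name
"index" and the method name ramBytesUsed(). In real use, the object passed to
fstRamBytes would presumably be the Terms instance for _uid obtained from an
opened shard, which is an assumption here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class FstMemoryProbe {

    /** Hypothetical stand-in for Lucene's FST; it only mimics ramBytesUsed(). */
    static final class StandInFst {
        private final long bytes;
        StandInFst(long bytes) { this.bytes = bytes; }
        public long ramBytesUsed() { return bytes; }
    }

    /** Hypothetical stand-in for FieldReader: a non-public field named "index". */
    static final class StandInFieldReader {
        private final StandInFst index;
        StandInFieldReader(StandInFst index) { this.index = index; }
    }

    /**
     * Reads the non-public "index" field of the given object via reflection
     * and returns whatever its ramBytesUsed() method reports.
     * Returns 0 when the field is null (no term index loaded).
     */
    static long fstRamBytes(Object fieldReader) {
        try {
            Field indexField = fieldReader.getClass().getDeclaredField("index");
            indexField.setAccessible(true); // the field is not public
            Object fst = indexField.get(fieldReader);
            if (fst == null) {
                return 0L;
            }
            Method ramBytesUsed = fst.getClass().getMethod("ramBytesUsed");
            ramBytesUsed.setAccessible(true);
            return ((Number) ramBytesUsed.invoke(fst)).longValue();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not read term index size", e);
        }
    }

    public static void main(String[] args) {
        // With Lucene available, the argument would be the real per-field reader.
        Object reader = new StandInFieldReader(new StandInFst(4096L));
        System.out.println("term index bytes: " + fstRamBytes(reader));
    }
}
```

Summing this value over every segment of a shard would give a per-shard
figure for the field.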
2017-12-12 1:05 GMT+08:00 elirev <el...@gmail.com>:
> Hi Yin,
> How do you determine the size being allocated for your _uid?
Re: Optimize FTS memory footprint
Posted by elirev <el...@gmail.com>.
Hi Yin,
How do you determine the size being allocated for your _uid?