Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/08/28 05:42:09 UTC

[GitHub] [lucene] JavaCoderCff opened a new pull request #271: LUCENE-9969:TaxoArrays, a member variable of the DirectoryTaxonomyReader class, i…

JavaCoderCff opened a new pull request #271:
URL: https://github.com/apache/lucene/pull/271


   https://issues.apache.org/jira/browse/LUCENE-9969


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on pull request #271: LUCENE-9969:TaxoArrays, a member variable of the DirectoryTaxonomyReader class, i…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on pull request #271:
URL: https://github.com/apache/lucene/pull/271#issuecomment-907633158


   Do you know how large your Taxonomy index is (how many unique `FacetLabel`s)?
   
   In your application, are all three arrays being allocated (`parents`, `siblings` and `children`)? That triples the memory cost, but if you do not really need full hierarchical facets, you should only need to allocate one of those arrays. Or you could explore `SortedSetDocValues` faceting.
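   
   To make the "triples the memory cost" point concrete, here is a rough back-of-envelope sketch. It assumes each of the three arrays is an `int[]` with one 4-byte entry per taxonomy ordinal and ignores array headers and any other overhead; the class and method names are hypothetical and are not part of Lucene.

```java
// Hypothetical helper, not Lucene code: estimates heap used by the taxonomy arrays
// under the assumption of one 4-byte int per ordinal per allocated array.
final class TaxoArrayCost {
  private TaxoArrayCost() {}

  /** Approximate bytes for {@code arrays} int[] arrays of length {@code ordinals}. */
  static long estimateBytes(long ordinals, int arrays) {
    return ordinals * arrays * Integer.BYTES;
  }

  public static void main(String[] args) {
    long ordinals = 10_000_000L; // assumed number of unique FacetLabels
    System.out.println("one array:    " + estimateBytes(ordinals, 1) + " bytes");
    System.out.println("three arrays: " + estimateBytes(ordinals, 3) + " bytes");
  }
}
```

   Under these assumptions the cost grows linearly with the number of unique labels, so allocating only one array instead of three cuts the footprint to a third.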


[GitHub] [lucene] JavaCoderCff commented on pull request #271: LUCENE-9969:TaxoArrays, a member variable of the DirectoryTaxonomyReader class, i…

Posted by GitBox <gi...@apache.org>.
JavaCoderCff commented on pull request #271:
URL: https://github.com/apache/lucene/pull/271#issuecomment-907637143


   All three are allocated: parents, siblings and children. I'm not sure why.
   
   I agree that `SoftReference` is not the optimal solution, but it works as a stopgap: at least our app will no longer crash frequently due to OOM.
   
   I really hope we can find a better solution. Thank you very much!


[GitHub] [lucene] dweiss commented on pull request #271: LUCENE-9969:TaxoArrays, a member variable of the DirectoryTaxonomyReader class, i…

Posted by GitBox <gi...@apache.org>.
dweiss commented on pull request #271:
URL: https://github.com/apache/lucene/pull/271#issuecomment-907578312


   Just a note from the side (I'm not familiar with this code): I don't believe switching to a soft reference here solves anything. It just sweeps the problem under the rug, at the cost of potentially recreating that value over and over in adverse heap conditions. If you're low on memory, it's sometimes better to hit an OOM, because then you know your heap is insufficient. A soft reference here and there will let everything trudge forward but may lead to dire runtime performance (a vicious cycle in which the GC keeps freeing references and the code keeps recreating them).
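   
   The recreate-on-demand cycle described above can be illustrated with a minimal sketch. This is not the code from the pull request: `SoftCachedArray`, its loader and the `simulateGcClear()` hook are hypothetical names, and the hook calls `Reference.clear()` to stand in for what the GC may do under memory pressure, since real clearing is not deterministic.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Hypothetical illustration: a value held through a SoftReference is rebuilt
// every time the reference has been cleared, and we count those rebuilds.
final class SoftCachedArray {
  private final Supplier<int[]> loader;
  private SoftReference<int[]> ref = new SoftReference<>(null);
  private int rebuilds = 0;

  SoftCachedArray(Supplier<int[]> loader) {
    this.loader = loader;
  }

  /** Returns the cached array, reloading it if the reference was cleared. */
  synchronized int[] get() {
    int[] value = ref.get();
    if (value == null) {
      value = loader.get(); // potentially expensive rebuild
      ref = new SoftReference<>(value);
      rebuilds++;
    }
    return value;
  }

  /** Stand-in for the GC clearing the soft reference when the heap is tight. */
  synchronized void simulateGcClear() {
    ref.clear();
  }

  synchronized int rebuilds() {
    return rebuilds;
  }
}
```

   Every clear is followed by a full reload on the next `get()`, so a heap that stays tight turns each access into an allocation of the very array that caused the pressure.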


[GitHub] [lucene] rmuir commented on pull request #271: LUCENE-9969:TaxoArrays, a member variable of the DirectoryTaxonomyReader class, i…

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #271:
URL: https://github.com/apache/lucene/pull/271#issuecomment-907829353


   Could we look at storing this data as doc values instead of as payloads that are read into heap memory? It is just integers or lists of integers, right?


[GitHub] [lucene] mikemccand commented on pull request #271: LUCENE-9969:TaxoArrays, a member variable of the DirectoryTaxonomyReader class, i…

Posted by GitBox <gi...@apache.org>.
mikemccand commented on pull request #271:
URL: https://github.com/apache/lucene/pull/271#issuecomment-907633653


   These arrays are in general a crazy costly part of using taxonomy facets ... we should explore more efficient alternatives. E.g. if the Lucene user is only using a single-level hierarchy (field `foo` and value `bar`), maybe we could pre-allocate a slice of ordinal space for each field `foo`, which could be stored much more compactly (e.g. in an interval tree) than the full `int[] parents` array we use today.
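   
   One way to picture the slice idea is the sketch below. It is an assumption-laden illustration, not a Lucene API: `OrdinalSlices` is a hypothetical class, it uses sorted, non-overlapping ranges with a binary search rather than an actual interval tree, and it returns -1 for ordinals outside every slice.

```java
import java.util.Arrays;

// Hypothetical sketch: each field (dimension) owns one contiguous range of child
// ordinals, so the parent lookup needs one entry per field instead of one per ordinal.
final class OrdinalSlices {
  private final int[] starts;  // inclusive start of each slice, ascending
  private final int[] ends;    // exclusive end of each slice
  private final int[] parents; // ordinal of the field that owns each slice

  OrdinalSlices(int[] starts, int[] ends, int[] parents) {
    if (starts.length != ends.length || starts.length != parents.length) {
      throw new IllegalArgumentException("arrays must have equal length");
    }
    this.starts = starts.clone();
    this.ends = ends.clone();
    this.parents = parents.clone();
  }

  /** Returns the parent ordinal for {@code ordinal}, or -1 if no slice contains it. */
  int parentOf(int ordinal) {
    int idx = Arrays.binarySearch(starts, ordinal);
    if (idx < 0) {
      idx = -idx - 2; // index of the last slice starting at or before the ordinal
    }
    if (idx >= 0 && ordinal < ends[idx]) {
      return parents[idx];
    }
    return -1;
  }
}
```

   With this layout the lookup cost is logarithmic in the number of fields, and the memory cost depends on the number of fields rather than on the number of unique labels.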


[GitHub] [lucene] rmuir commented on pull request #271: LUCENE-9969:TaxoArrays, a member variable of the DirectoryTaxonomyReader class, i…

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #271:
URL: https://github.com/apache/lucene/pull/271#issuecomment-907612690


   Yeah, `SoftReference` is never the solution. We should add this class to our forbidden APIs list.

