You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2022/11/17 21:23:27 UTC
[GitHub] [solr] risdenk opened a new pull request, #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
risdenk opened a new pull request, #1183:
URL: https://github.com/apache/solr/pull/1183
https://issues.apache.org/jira/browse/SOLR-16555
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org
[GitHub] [solr] dsmiley commented on pull request #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
Posted by GitBox <gi...@apache.org>.
dsmiley commented on PR #1183:
URL: https://github.com/apache/solr/pull/1183#issuecomment-1319255367
Just to be clear, this is a highly performance critical code path. Sadly we have no benchmarks of this that I know of but at least there's peer review. Once the solution looks more work-able, I'll at-mention some more eye-balls. Maybe you might consider adding to /solr/benchmark.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org
[GitHub] [solr] risdenk commented on pull request #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1183:
URL: https://github.com/apache/solr/pull/1183#issuecomment-1319276769
From David on Jira an idea to explore:
> Create a subclass of BitDocSet called MutableBitDocSet. Its intersect() and andNot() methods can return itself, and be implemented that way. In getProcessedFilter, before those two tight loops that call those two methods, upgrade/replace "answer" with MutableBitDocSet if end (the number of combinations to do) is greater than 1. If it's 1 or 0 then there's no point in the upgrade.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org
[GitHub] [solr] risdenk commented on pull request #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1183:
URL: https://github.com/apache/solr/pull/1183#issuecomment-1320309454
This is inefficient as David pointed out and caused timeouts on a bunch of queries during testing. I was hoping to at least capture some profile from it but didn't last long enough to do that. https://github.com/apache/solr/pull/1184 is next iteration using a `MutableBitSet` as David pointed out.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org
[GitHub] [solr] risdenk closed pull request #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
Posted by GitBox <gi...@apache.org>.
risdenk closed pull request #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
URL: https://github.com/apache/solr/pull/1183
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org
[GitHub] [solr] risdenk commented on a diff in pull request #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
Posted by GitBox <gi...@apache.org>.
risdenk commented on code in PR #1183:
URL: https://github.com/apache/solr/pull/1183#discussion_r1025766090
##########
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##########
@@ -1211,17 +1211,42 @@ public ProcessedFilter getProcessedFilter(DocSet setFilter, List<Query> queries)
}
// Are all of our normal cached filters negative?
- if (end > 0 && answer == null) {
- answer = getLiveDocSet();
- }
+ if (end > 0) {
+ if (answer == null) {
+ answer = getLiveDocSet();
+ }
- // do negative queries first to shrink set size
- for (int i = 0; i < end; i++) {
- if (neg[i]) answer = answer.andNot(sets[i]);
- }
+ // min size must be maxDoc based on BitDocSet conversion at the end
+ FixedBitSet tempAnswer = FixedBitSet.ensureCapacity(answer.getFixedBitSetClone(), maxDoc());
+
+ // do negative queries first to shrink set size
+ for (int i = 0; i < end; i++) {
+ if (neg[i]) {
+ // Clear any documents that are in the negative filter query
+ DocIterator iterator = sets[i].iterator();
+ while (iterator.hasNext()) {
+ int doc = iterator.nextDoc();
+ tempAnswer.clear(doc);
+ }
+ }
+ }
+
+ for (int i = 0; i < end; i++) {
+ if (!neg[i] && i != smallestIndex) {
+ DocIterator iterator = sets[i].iterator();
Review Comment:
I need to add a comment around why not using `answer.intersection(sets[i])`
##########
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##########
@@ -1211,17 +1211,42 @@ public ProcessedFilter getProcessedFilter(DocSet setFilter, List<Query> queries)
}
// Are all of our normal cached filters negative?
- if (end > 0 && answer == null) {
- answer = getLiveDocSet();
- }
+ if (end > 0) {
+ if (answer == null) {
+ answer = getLiveDocSet();
+ }
- // do negative queries first to shrink set size
- for (int i = 0; i < end; i++) {
- if (neg[i]) answer = answer.andNot(sets[i]);
- }
+ // min size must be maxDoc based on BitDocSet conversion at the end
+ FixedBitSet tempAnswer = FixedBitSet.ensureCapacity(answer.getFixedBitSetClone(), maxDoc());
+
+ // do negative queries first to shrink set size
+ for (int i = 0; i < end; i++) {
+ if (neg[i]) {
+ // Clear any documents that are in the negative filter query
Review Comment:
I need to add a comment around why not using `answer.andNot(sets[i])`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org
[GitHub] [solr] risdenk commented on pull request #1183: SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets repeatedly
Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1183:
URL: https://github.com/apache/solr/pull/1183#issuecomment-1319266439
> Instead of bulk bitSet operations, you're iterating doc-by-doc! For small sets (SortedIntDocSet), this is fine, but for bigger ones -- no.
@dsmiley yea not saying this is the final version at all. I'm actually curious on testing this with some benchmarking stuff to see how much the memory allocations slowed this down. This approach requiring more CPU, but less memory. I would not disagree there is more performance using the bulk bitset operations.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org