Posted to oak-issues@jackrabbit.apache.org by "Stefan Egli (Jira)" <ji...@apache.org> on 2022/08/04 14:22:00 UTC
[jira] [Created] (OAK-9880) Simplify rgc query
Stefan Egli created OAK-9880:
--------------------------------
Summary: Simplify rgc query
Key: OAK-9880
URL: https://issues.apache.org/jira/browse/OAK-9880
Project: Jackrabbit Oak
Issue Type: Task
Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
We have seen a repeat of long-running rgc *remove* operations, similar to what was described in OAK-8351.
This time it happened with the query generated by [queryForDefaultNoBranch|https://github.com/apache/jackrabbit-oak/blob/99b250a05ffe490f66de67374125fabee17f6fda/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/mongo/MongoVersionGCSupport.java#L213-L242], with a query shape similar to, for example:
{noformat}
{
"_sdType" : 70,
"_sdMaxRevTime" : {
"$lt" : NumberLong(1603030303)
},
"$or" : [
{
"$or" : [
{
"_id" : /.*-1\/0/
},
{
"_id" : /[^-]*/,
"_path" : /.*-1\/0/
}
],
"_sdMaxRevTime" : {
"$lt" : NumberLong(1602020202)
}
},
{
"$or" : [
{
"_id" : /.*-2\/0/
},
{
"_id" : /[^-]*/,
"_path" : /.*-2\/0/
}
],
"_sdMaxRevTime" : {
"$lt" : NumberLong(1601010101)
}
}
]
}
{noformat}
While setting an index filter with the query plan in MongoDB is one option, we could additionally look into simplifying the above query by splitting it into multiple queries, e.g. 1 query per clusterNodeId with the {{_sdMaxRevTime}} condition simplified accordingly. The above would then translate into the following 2 queries (in the hope that MongoDB finds the optimal query plan for each):
{noformat}
{
"_sdType" : 70,
"_sdMaxRevTime" : {
"$lt" : NumberLong(1602020202)
},
"$or" : [
{
"_id" : /.*-1\/0/
},
{
"_id" : /[^-]*/,
"_path" : /.*-1\/0/
}
]
}
{noformat}
and
{noformat}
{
"_sdType" : 70,
"_sdMaxRevTime" : {
"$lt" : NumberLong(1601010101)
},
"$or" : [
{
"_id" : /.*-2\/0/
},
{
"_id" : /[^-]*/,
"_path" : /.*-2\/0/
}
]
}
{noformat}
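The split into one query per clusterNodeId could be sketched roughly as follows. This is only an illustration in Python with plain dictionaries, not the Oak implementation (which is Java, in MongoVersionGCSupport): the function name {{queries_per_cluster_node}} and its input mapping are hypothetical, and it assumes each per-clusterNodeId {{_sdMaxRevTime}} bound is no larger than the overall bound, so the outer condition can be dropped.

```python
import re

# _sdType value taken from the example queries above.
SPLIT_DOC_TYPE = 70


def queries_per_cluster_node(max_rev_time_by_cluster_id):
    """Build one query document per clusterNodeId.

    max_rev_time_by_cluster_id is a hypothetical mapping of clusterNodeId
    to the exclusive _sdMaxRevTime upper bound for that cluster node.
    """
    queries = []
    for cluster_id, max_rev_time in sorted(max_rev_time_by_cluster_id.items()):
        # Same pattern as /.*-<id>\/0/ in the mongo shell; "/" needs no
        # escaping in a Python regular expression.
        suffix = re.compile(r".*-%d/0" % cluster_id)
        queries.append({
            "_sdType": SPLIT_DOC_TYPE,
            # Single, per-node bound instead of nested $or branches.
            "_sdMaxRevTime": {"$lt": max_rev_time},
            "$or": [
                {"_id": suffix},
                {"_id": re.compile(r"[^-]*"), "_path": suffix},
            ],
        })
    return queries


# The two cluster nodes from the example above.
for query in queries_per_cluster_node({1: 1602020202, 2: 1601010101}):
    print(query)
```

Each resulting document would then be issued as its own remove, so that the query planner only has to deal with one flat {{$or}} per request.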