Posted to dev@jackrabbit.apache.org by Bart van der Schans <b....@onehippo.com> on 2012/01/20 15:22:09 UTC

Question about consistency checker and node ordering in queries

Hi all,

I was wondering about how the ConsistencyCheckerImpl fetches all the
node ids. The first set of node ids is fetched with
"pm.getAllNodeIds(null, NODESATONCE)" in the internalCheckConsistency
method. All subsequent sets are fetched with "pm.getAllNodeIds(lastId,
NODESATONCE)".

In the BundleDbPersistenceManager, the method
getAllNodeIds(NodeId, maxCount) uses the "bundleSelectAllIdsSQL" query
if the first parameter is null, and the "bundleSelectAllIdsFromSQL"
query otherwise. The "bundleSelectAllIdsSQL" query doesn't use ordering,
whereas "bundleSelectAllIdsFromSQL" does.

Now my question is the following: shouldn't the "bundleSelectAllIdsSQL"
query also use an order clause to make sure the correct batch of node
ids is fetched the first time? Or is that somehow implicitly guaranteed?
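
To make the concern concrete, here is a minimal, self-contained Java
sketch. It does not use Jackrabbit at all: the class BatchedIdScan, its
methods and the short string ids are hypothetical stand-ins, and it
assumes that lastId is simply the last id of the previous batch. The
list "storage" plays the role of the table in whatever order the
database happens to return rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BatchedIdScan {

    // First batch: stands in for the query used when the start id is null.
    // With ordered == false, rows come back in storage order (no ORDER BY).
    static List<String> firstBatch(List<String> storage, int maxCount, boolean ordered) {
        List<String> rows = new ArrayList<>(storage);
        if (ordered) {
            Collections.sort(rows);
        }
        return new ArrayList<>(rows.subList(0, Math.min(maxCount, rows.size())));
    }

    // Later batches: stands in for "WHERE id > lastId ORDER BY id".
    static List<String> nextBatch(List<String> storage, String lastId, int maxCount) {
        List<String> rows = new ArrayList<>();
        for (String id : storage) {
            if (id.compareTo(lastId) > 0) {
                rows.add(id);
            }
        }
        Collections.sort(rows);
        return new ArrayList<>(rows.subList(0, Math.min(maxCount, rows.size())));
    }

    // Walks the whole "table" in batches and returns every id that was visited.
    static List<String> scanAll(List<String> storage, int maxCount, boolean orderFirstBatch) {
        List<String> seen = new ArrayList<>();
        List<String> batch = firstBatch(storage, maxCount, orderFirstBatch);
        while (!batch.isEmpty()) {
            seen.addAll(batch);
            // Assumption: the caller continues from the last id of the batch.
            String lastId = batch.get(batch.size() - 1);
            batch = nextBatch(storage, lastId, maxCount);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Storage order deliberately differs from sorted order.
        List<String> storage = Arrays.asList("0C", "03", "0A", "01", "07");
        System.out.println(scanAll(storage, 2, false)); // [0C, 03, 07, 0A, 0C]
        System.out.println(scanAll(storage, 2, true));  // [01, 03, 07, 0A, 0C]
    }
}
```

With this sample data, the unordered first batch causes id 01 to be
skipped and id 0C to be visited twice, while the ordered variant visits
every id exactly once.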

Regards,
Bart

Re: Question about consistency checker and node ordering in queries

Posted by Julian Reschke <ju...@gmx.de>.
On 2012-01-20 17:16, Bart van der Schans wrote:
> Does anybody have an objection to adding an order clause to the
> "bundleSelectAllIdsSQL" statement? If not, I'll create an issue and
> add it.
>
> Regards,
> Bart

Sounds good to me.

Best regards, Julian


Re: Question about consistency checker and node ordering in queries

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Jan 20, 2012 at 5:16 PM, Bart van der Schans
<b....@onehippo.com> wrote:
> Does anybody have an objection to adding an order clause to the
> "bundleSelectAllIdsSQL" statement? If not, I'll create an issue and
> add it.

Sounds like the right thing to do.

BR,

Jukka Zitting

Re: Question about consistency checker and node ordering in queries

Posted by Bart van der Schans <b....@onehippo.com>.
Does anybody have an objection to adding an order clause to the
"bundleSelectAllIdsSQL" statement? If not, I'll create an issue and
add it.
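
As a rough sketch of what the change could look like (the class name,
method names and exact SQL text below are hypothetical and are not
copied from BundleDbPersistenceManager; only the NODE_ID column and the
VERSION_BUNDLE table name come from this thread):

```java
public class BundleIdQueries {

    // Hypothetical form of the "select all ids" statement with the proposed
    // order clause added, so the first batch is the lowest ids in the table.
    static String selectAllIds(String table) {
        return "select NODE_ID from " + table + " order by NODE_ID";
    }

    // Hypothetical form of the "select all ids from" statement, which
    // according to this thread already orders its result.
    static String selectAllIdsFrom(String table) {
        return "select NODE_ID from " + table + " where NODE_ID > ? order by NODE_ID";
    }

    public static void main(String[] args) {
        System.out.println(selectAllIds("VERSION_BUNDLE"));
        System.out.println(selectAllIdsFrom("VERSION_BUNDLE"));
    }
}
```

With both statements ordered by the same column, the last id of any
batch is a valid starting point for the next one.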

Regards,
Bart

Re: Question about consistency checker and node ordering in queries

Posted by Bart van der Schans <b....@onehippo.com>.
On Fri, Jan 20, 2012 at 3:39 PM, Julian Reschke <ju...@gmx.de> wrote:
> Good question. I was confused about that as well. I tried it and stepped
> through the code, and everything seemed to work as advertised, so I stopped
> worrying about it :-)

I think (at least on MySQL) we're just lucky that the natural ordering,
i.e. the order in which rows come back without an ORDER BY, is by NODE_ID:

mysql> SELECT hex(NODE_ID) FROM VERSION_BUNDLE ORDER BY NODE_ID LIMIT 0,10;
+----------------------------------+
| hex(NODE_ID)                     |
+----------------------------------+
| 000005E87E6D402BA27CDF41E7999D8A |
| 000007DC02CF48BA8FFEDA5F566A3A98 |
| 00000C4A81744934B4F0CB55C9967378 |
| 000016C9E4464C4FA7B6F4B9CF2F9903 |
| 00001D5A1C6F4339B317BB436AA54431 |
| 0000257D67AB4F1EA4DDEDB84BB13039 |
| 00002C51F22145B89F59E882BB505036 |
| 00002F2AFC2E4842863C360B68A36BDA |
| 0000354F151945CEA953D62C922A4FAC |
| 00003C0254C8471E8349C0B6F0DEFDC2 |
+----------------------------------+
10 rows in set (0.00 sec)

mysql> SELECT hex(NODE_ID) FROM VERSION_BUNDLE LIMIT 0,10;
+----------------------------------+
| hex(NODE_ID)                     |
+----------------------------------+
| 000005E87E6D402BA27CDF41E7999D8A |
| 000007DC02CF48BA8FFEDA5F566A3A98 |
| 00000C4A81744934B4F0CB55C9967378 |
| 000016C9E4464C4FA7B6F4B9CF2F9903 |
| 00001D5A1C6F4339B317BB436AA54431 |
| 0000257D67AB4F1EA4DDEDB84BB13039 |
| 00002C51F22145B89F59E882BB505036 |
| 00002F2AFC2E4842863C360B68A36BDA |
| 0000354F151945CEA953D62C922A4FAC |
| 00003C0254C8471E8349C0B6F0DEFDC2 |
+----------------------------------+
10 rows in set (0.00 sec)

But that doesn't have to be the case with other databases. Maybe
somebody can try similar queries on another database?

Regards,
Bart

Re: Question about consistency checker and node ordering in queries

Posted by Julian Reschke <ju...@gmx.de>.
On 2012-01-20 15:22, Bart van der Schans wrote:
> Hi all,
>
> I was wondering about how the ConsistencyCheckerImpl fetches all the
> node ids. The first set of node ids is fetched with
> "pm.getAllNodeIds(null, NODESATONCE)" in the internalCheckConsistency
> method. All subsequent sets are fetched with "pm.getAllNodeIds(lastId,
> NODESATONCE)".
>
> In the BundleDbPersistenceManager, the method
> getAllNodeIds(NodeId, maxCount) uses the "bundleSelectAllIdsSQL" query
> if the first parameter is null, and the "bundleSelectAllIdsFromSQL"
> query otherwise. The "bundleSelectAllIdsSQL" query doesn't use ordering,
> whereas "bundleSelectAllIdsFromSQL" does.
>
> Now my question is the following: shouldn't the "bundleSelectAllIdsSQL"
> query also use an order clause to make sure the correct batch of node
> ids is fetched the first time? Or is that somehow implicitly guaranteed?

Good question. I was confused about that as well. I tried it and stepped
through the code, and everything seemed to work as advertised, so I
stopped worrying about it :-)