Posted to dev@jackrabbit.apache.org by "Unico Hommes (Created) (JIRA)" <ji...@apache.org> on 2012/03/16 15:25:44 UTC

[jira] [Created] (JCR-3261) BundleDbPersistenceManager getAllIds gives wrong amount of results for MySQL

BundleDbPersistenceManager getAllIds gives wrong amount of results for MySQL
----------------------------------------------------------------------------

                 Key: JCR-3261
                 URL: https://issues.apache.org/jira/browse/JCR-3261
             Project: Jackrabbit Content Repository
          Issue Type: Bug
    Affects Versions: 2.4
            Reporter: Unico Hommes
             Fix For: 2.4.1


The problem arises when the method parameter maxcount is less than the total number of records in the bundle table.

First of all, I found out that MySQL orders the NodeId objects differently than Jackrabbit does. The following test illustrates this:

    public void testMySQLOrderByNodeId() throws Exception {
        NodeId nodeId1 = new NodeId("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
        NodeId nodeId2 = new NodeId("9fd0d452-b5d0-426b-8a0f-bef830ba0495");

        PreparedStatement stmt = connection.prepareStatement("SELECT NODE_ID FROM DEFAULT_BUNDLE WHERE NODE_ID = ? OR NODE_ID = ? ORDER BY NODE_ID");

        Object[] params = new Object[] { nodeId1.getRawBytes(), nodeId2.getRawBytes() };
        stmt.setObject(1, params[0]);
        stmt.setObject(2, params[1]);

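        ArrayList<NodeId> nodeIds = new ArrayList<NodeId>();
        ResultSet resultSet = stmt.executeQuery();
        // First pass: print the ids in the order MySQL returns them (ORDER BY NODE_ID).
        while (resultSet.next()) {
            NodeId nodeId = new NodeId(resultSet.getBytes(1));
            System.out.println(nodeId);
            nodeIds.add(nodeId);
        }
        // Second pass: print the same ids in Jackrabbit's order (NodeId.compareTo).
        Collections.sort(nodeIds);
        for (NodeId nodeId : nodeIds) {
            System.out.println(nodeId);
        }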
    }

Which results in the following output:

7ff9e87c-f87f-4d35-9d61-2e298e56ac37
9fd0d452-b5d0-426b-8a0f-bef830ba0495
9fd0d452-b5d0-426b-8a0f-bef830ba0495
7ff9e87c-f87f-4d35-9d61-2e298e56ac37
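
One plausible reading of the two orderings above, offered as an assumption rather than something this report confirms: Jackrabbit's NodeId may compare its two 64-bit halves as signed Java longs, while MySQL compares the 16 raw bytes as unsigned values. The sketch below uses java.util.UUID as a stand-in for NodeId; the class name and both helper methods are hypothetical and not part of Jackrabbit.

```java
import java.util.UUID;

public class NodeIdOrderingSketch {

    // Signed comparison of the two 64-bit halves.
    // ASSUMPTION: this mirrors what NodeId.compareTo does.
    static int compareSigned(UUID a, UUID b) {
        int c = Long.compare(a.getMostSignificantBits(), b.getMostSignificantBits());
        return c != 0 ? c : Long.compare(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    // Unsigned comparison of the two halves, which is equivalent to comparing
    // the 16 big-endian bytes one by one as unsigned values.
    // ASSUMPTION: this mirrors how MySQL orders a binary column.
    static int compareUnsigned(UUID a, UUID b) {
        int c = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        return c != 0 ? c : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        UUID id1 = UUID.fromString("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
        UUID id2 = UUID.fromString("9fd0d452-b5d0-426b-8a0f-bef830ba0495");

        // 0x9f... has its top bit set, so it is negative as a signed long
        // and sorts before 0x7f... under the signed comparison.
        System.out.println("signed:   " + compareSigned(id1, id2));
        System.out.println("unsigned: " + compareUnsigned(id1, id2));
    }
}
```

Under the signed comparison the id starting with 9f sorts first, and under the unsigned comparison the id starting with 7f sorts first, which matches the two halves of the output above.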


Now the problem with the getAllNodeIds method is that it fetches an extra 10 records on top of maxcount (to avoid a problem where the first key is not the one that is wanted). Afterwards it skips a number of records again, this time using NodeId.compareTo. On MySQL this comparison unexpectedly evaluates to true, because the code doesn't expect the MySQL ordering.

In my case the bundle table held about 17000 records, but fetching the ids consecutively, a thousand records at a time, returned only about 8000 records in all.

I'll attach a patch that fixes the problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-3261:
-------------------------------

    Fix Version/s:     (was: 2.6)
                   2.5
    
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>            Assignee: Bart van der Schans
>             Fix For: 2.4.1, 2.5
>
>         Attachments: bdbpm_allids.patch
>
>


[jira] [Updated] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Bart van der Schans (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bart van der Schans updated JCR-3261:
-------------------------------------

    Fix Version/s: 2.6
    
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>            Assignee: Bart van der Schans
>             Fix For: 2.4.1, 2.6
>
>         Attachments: bdbpm_allids.patch
>
>


[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Thomas Mueller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232605#comment-13232605 ] 

Thomas Mueller commented on JCR-3261:
-------------------------------------

> It would also defeat part of the purpose of the consistency checker (finding orphaned nodes).

Yes, absolutely true.
                
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>            Assignee: Bart van der Schans
>             Fix For: 2.4.1, 2.6
>
>         Attachments: bdbpm_allids.patch
>
>


[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232585#comment-13232585 ] 

Bart van der Schans commented on JCR-3261:
------------------------------------------

Patch committed in r1302401 in slightly adjusted form to trunk.
                
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch
>
>


[jira] [Updated] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Unico Hommes (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Unico Hommes updated JCR-3261:
------------------------------

    Description: 
When using MySQL:
The problem arises when the method parameter maxcount is less than the total number of records in the bundle table.

First of all, I found out that MySQL orders the NodeId objects differently than Jackrabbit does. The following test illustrates this:

    public void testMySQLOrderByNodeId() throws Exception {
        NodeId nodeId1 = new NodeId("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
        NodeId nodeId2 = new NodeId("9fd0d452-b5d0-426b-8a0f-bef830ba0495");

        PreparedStatement stmt = connection.prepareStatement("SELECT NODE_ID FROM DEFAULT_BUNDLE WHERE NODE_ID = ? OR NODE_ID = ? ORDER BY NODE_ID");

        Object[] params = new Object[] { nodeId1.getRawBytes(), nodeId2.getRawBytes() };
        stmt.setObject(1, params[0]);
        stmt.setObject(2, params[1]);

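        ArrayList<NodeId> nodeIds = new ArrayList<NodeId>();
        ResultSet resultSet = stmt.executeQuery();
        // First pass: print the ids in the order MySQL returns them (ORDER BY NODE_ID).
        while (resultSet.next()) {
            NodeId nodeId = new NodeId(resultSet.getBytes(1));
            System.out.println(nodeId);
            nodeIds.add(nodeId);
        }
        // Second pass: print the same ids in Jackrabbit's order (NodeId.compareTo).
        Collections.sort(nodeIds);
        for (NodeId nodeId : nodeIds) {
            System.out.println(nodeId);
        }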
    }

Which results in the following output:

7ff9e87c-f87f-4d35-9d61-2e298e56ac37
9fd0d452-b5d0-426b-8a0f-bef830ba0495
9fd0d452-b5d0-426b-8a0f-bef830ba0495
7ff9e87c-f87f-4d35-9d61-2e298e56ac37


Now the problem with the getAllNodeIds method is that it fetches an extra 10 records on top of maxcount (to avoid a problem where the first key is not the one that is wanted). Afterwards it skips a number of records again, this time using NodeId.compareTo. On MySQL this comparison unexpectedly evaluates to true, because the code doesn't expect the MySQL ordering.

In my case the bundle table held about 17000 records, but fetching the ids consecutively, a thousand records at a time, returned only about 8000 records in all.

With Derby DB, the method runs into an infinite loop when it is called the way ConsistencyCheckerImpl, for instance, calls it. The cause is the bundleSelectAllIdsFromSQL query that is used for SM_LONGLONG_KEYS, the storage model used with Derby. Compare that statement with the corresponding statement for SM_BINARY_KEYS: the former uses >= while the latter uses only >. The latter is correct. We want only records that are bigger than the passed-in parameter, not bigger or equal.
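
The effect of >= versus > can be sketched without a database. The example below is a minimal in-memory model, not Jackrabbit code: the class, the method names, and the assumption that the caller keeps requesting pages until an empty one comes back are all hypothetical. A sorted set of longs stands in for the bundle table, and the bookmark is the last id of the previous page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class KeysetPagingSketch {

    // Returns up to maxCount ids after the bookmark.
    // inclusive = true models "NODE_ID >= ?", inclusive = false models "NODE_ID > ?".
    static List<Long> fetchPage(NavigableSet<Long> ids, Long bookmark, int maxCount, boolean inclusive) {
        NavigableSet<Long> tail = (bookmark == null) ? ids : ids.tailSet(bookmark, inclusive);
        List<Long> page = new ArrayList<>();
        for (Long id : tail) {
            if (page.size() == maxCount) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    // Requests pages until an empty page is returned (assumed caller behaviour).
    // Gives up after maxPages so that the ">=" case fails fast instead of spinning.
    static List<Long> fetchAll(NavigableSet<Long> ids, int maxCount, boolean inclusive, int maxPages) {
        List<Long> all = new ArrayList<>();
        Long bookmark = null;
        for (int i = 0; i < maxPages; i++) {
            List<Long> page = fetchPage(ids, bookmark, maxCount, inclusive);
            if (page.isEmpty()) {
                return all;
            }
            all.addAll(page);
            bookmark = page.get(page.size() - 1);
        }
        throw new IllegalStateException("no empty page after " + maxPages + " pages");
    }

    public static void main(String[] args) {
        NavigableSet<Long> ids = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L));

        // With ">" every page starts after the bookmark, so paging terminates.
        System.out.println(fetchAll(ids, 2, false, 100));

        // With ">=" the bookmark itself is always returned again, so no page is ever empty.
        try {
            fetchAll(ids, 2, true, 100);
        } catch (IllegalStateException e) {
            System.out.println("inclusive paging never finished: " + e.getMessage());
        }
    }
}
```

In this model the inclusive variant always returns at least the bookmark record, so a loop that waits for an empty result never ends; the exclusive variant visits each id exactly once.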


        Summary: Problems with BundleDbPersistenceManager getAllNodeIds  (was: BundleDbPersistenceManager getAllIds gives wrong amount of results for MySQL)

Updated title and description because another problem with the same method was found for Derby DB. Attaching an updated patch.
                
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch
>
>


[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Thomas Mueller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232556#comment-13232556 ] 

Thomas Mueller commented on JCR-3261:
-------------------------------------

OK, I see. For data store garbage collection, you could live without the method, as the persistence manager scan is optional there. But the consistency checker needs it, and changing the consistency checker to do a regular node traversal would probably be quite a lot of work.

Not sure how to solve it; possibly use "order by desc"? I guess more tests would be required (it seems we use varbinary(16) in MySQL, which might deal with trailing zeroes differently than binary(16)).
                
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch
>
>


[jira] [Updated] (JCR-3261) BundleDbPersistenceManager getAllIds gives wrong amount of results for MySQL

Posted by "Unico Hommes (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Unico Hommes updated JCR-3261:
------------------------------

    Attachment: bdbpm_allids.patch

Patch against 2.4.x branch.

The special handling of maxcount is only necessary for the longlong storage model. For a binary storage model I can easily understand that ordering is not well defined and can vary from database to database. Therefore the patch does the special handling only for the longlong storage model.
                
> BundleDbPersistenceManager getAllIds gives wrong amount of results for MySQL
> ----------------------------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch
>
>


[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Unico Hommes (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232558#comment-13232558 ] 

Unico Hommes commented on JCR-3261:
-----------------------------------

I think the supplied patch is probably the right solution. Ordering by descending won't work because MySQL's ordering of the nodes is not exactly the reverse of Jackrabbit's.
                
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch
>
>


        

[jira] [Updated] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Unico Hommes (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Unico Hommes updated JCR-3261:
------------------------------

    Description: 
When using MySQL:
The problem arises when the method parameter maxcount is less than the total number of records in the bundle table.

First of all, I found out that MySQL orders the NodeId values differently than Jackrabbit does. The following test demonstrates this:

    public void testMySQLOrderByNodeId() throws Exception {
        NodeId nodeId1 = new NodeId("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
        NodeId nodeId2 = new NodeId("9fd0d452-b5d0-426b-8a0f-bef830ba0495");

        PreparedStatement stmt = connection.prepareStatement("SELECT NODE_ID FROM DEFAULT_BUNDLE WHERE NODE_ID = ? OR NODE_ID = ? ORDER BY NODE_ID");

        Object[] params = new Object[] { nodeId1.getRawBytes(), nodeId2.getRawBytes() };
        stmt.setObject(1, params[0]);
        stmt.setObject(2, params[1]);

        ArrayList<NodeId> nodeIds = new ArrayList<NodeId>();
        ResultSet resultSet = stmt.executeQuery();
        while(resultSet.next()) {
            NodeId nodeId = new NodeId(resultSet.getBytes(1));
            System.out.println(nodeId);
            nodeIds.add(nodeId);
        }
        Collections.sort(nodeIds);
        for (NodeId nodeId : nodeIds) {
            System.out.println(nodeId);
        }
    }

Which results in the following output:

7ff9e87c-f87f-4d35-9d61-2e298e56ac37
9fd0d452-b5d0-426b-8a0f-bef830ba0495
9fd0d452-b5d0-426b-8a0f-bef830ba0495
7ff9e87c-f87f-4d35-9d61-2e298e56ac37


Now the problem with the getAllNodeIds method is that it fetches an extra 10 records on top of maxcount (to avoid a problem where the first key is not the one that is wanted). Afterwards it skips a number of records again, this time using NodeId.compareTo. This comparison unexpectedly evaluates to true for rows returned by MySQL, because the code doesn't expect the MySQL ordering.

In my case the bundle table held about 17000 records, but fetching the ids consecutively, a thousand records at a time, returned only about 8000 records in all.
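
The ordering mismatch described above can be reproduced without a database. The following is only a sketch under two assumptions that the report does not state outright: that the Java-side comparison treats the two 64-bit halves of the id as signed longs, and that MySQL compares the binary NODE_ID column byte by byte as unsigned values. Both would be consistent with the test output shown above. The class and method names are made up for illustration and java.util.UUID stands in for NodeId.

```java
import java.util.UUID;

public class NodeIdOrderingSketch {

    // Assumed Java-side ordering: compare the two 64-bit halves as signed longs.
    static int signedCompare(UUID a, UUID b) {
        int c = Long.compare(a.getMostSignificantBits(), b.getMostSignificantBits());
        return c != 0 ? c : Long.compare(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    // Assumed database-side ordering: comparing the halves as unsigned longs gives
    // the same result as comparing the 16 raw bytes left to right as unsigned values.
    static int unsignedCompare(UUID a, UUID b) {
        int c = Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
        return c != 0 ? c : Long.compareUnsigned(a.getLeastSignificantBits(), b.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        UUID a = UUID.fromString("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
        UUID b = UUID.fromString("9fd0d452-b5d0-426b-8a0f-bef830ba0495");

        // 0x9fd0... has its top bit set, so it is negative as a signed long:
        // the signed comparison puts b first, the unsigned comparison puts a first.
        System.out.println("signed compare(a, b):   " + signedCompare(a, b));
        System.out.println("unsigned compare(a, b): " + unsignedCompare(a, b));
    }
}
```

Any pair of ids in which exactly one has the top bit of its first byte set will be ordered differently by the two comparisons, which is why a plain ORDER BY ... DESC would not line the two orderings up.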


  was:
When using MySQL:
The problem arises when the method parameter maxcount is less than the total number of records in the bundle table.

First of all, I found out that MySQL orders the NodeId values differently than Jackrabbit does. The following test demonstrates this:

    public void testMySQLOrderByNodeId() throws Exception {
        NodeId nodeId1 = new NodeId("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
        NodeId nodeId2 = new NodeId("9fd0d452-b5d0-426b-8a0f-bef830ba0495");

        PreparedStatement stmt = connection.prepareStatement("SELECT NODE_ID FROM DEFAULT_BUNDLE WHERE NODE_ID = ? OR NODE_ID = ? ORDER BY NODE_ID");

        Object[] params = new Object[] { nodeId1.getRawBytes(), nodeId2.getRawBytes() };
        stmt.setObject(1, params[0]);
        stmt.setObject(2, params[1]);

        ArrayList<NodeId> nodeIds = new ArrayList<NodeId>();
        ResultSet resultSet = stmt.executeQuery();
        while(resultSet.next()) {
            NodeId nodeId = new NodeId(resultSet.getBytes(1));
            System.out.println(nodeId);
            nodeIds.add(nodeId);
        }
        Collections.sort(nodeIds);
        for (NodeId nodeId : nodeIds) {
            System.out.println(nodeId);
        }
    }

Which results in the following output:

7ff9e87c-f87f-4d35-9d61-2e298e56ac37
9fd0d452-b5d0-426b-8a0f-bef830ba0495
9fd0d452-b5d0-426b-8a0f-bef830ba0495
7ff9e87c-f87f-4d35-9d61-2e298e56ac37


Now the problem with the getAllNodeIds method is that it fetches an extra 10 records on top of maxcount (to avoid a problem where the first key is not the one that is wanted). Afterwards it skips a number of records again, this time using NodeId.compareTo. This comparison unexpectedly evaluates to true for rows returned by MySQL, because the code doesn't expect the MySQL ordering.

In my case the bundle table held about 17000 records, but fetching the ids consecutively, a thousand records at a time, returned only about 8000 records in all.

With Derby DB there is an infinite loop when getAllNodeIds is used the way ConsistencyCheckerImpl, for instance, uses it. This is caused by the bundleSelectAllIdsFromSQL query that is issued for SM_LONGLONG_KEYS, the storage model used with Derby. Compare that statement with the corresponding statement for SM_BINARY_KEYS: the former uses >= while the latter uses only >. The latter is correct. We want only records that are greater than the passed-in parameter, not greater than or equal to it.



Problem was at my end apparently. No problem with Derby DB after all.
                


        

[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Unico Hommes (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232540#comment-13232540 ] 

Unico Hommes commented on JCR-3261:
-----------------------------------

I was using the Jackrabbit consistency checker. See http://svn.apache.org/repos/asf/jackrabbit/tags/2.4.0/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/persistence/bundle/ConsistencyCheckerImpl.java

                


        

[jira] [Assigned] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Bart van der Schans (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bart van der Schans reassigned JCR-3261:
----------------------------------------

    Assignee: Bart van der Schans
    
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>            Assignee: Bart van der Schans
>             Fix For: 2.4.1, 2.6
>
>         Attachments: bdbpm_allids.patch
>
>


        

[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232584#comment-13232584 ] 

Bart van der Schans commented on JCR-3261:
------------------------------------------

The problem seems to be that the following block should only be applied for LONGLONG keys and not for BINARY keys.

                    if (lowId != null) {
                        // skip the keys that are smaller or equal (see above, maxCount += 10)
                        // only required for SM_LONGLONG_KEYS
                        if (current.compareTo(lowId) <= 0) {
                            continue;
                        }
                    }
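
To illustrate why that block drops rows for binary keys, here is a toy model of batched id retrieval. It is not the Jackrabbit code: ids are plain longs, the "database" is an in-memory list already sorted in unsigned order, the extra 10 rows are left out, and every name (SkipBlockSketch, fetchBatch, countAll, sampleTable) is hypothetical. The only things taken from the thread are the shape of the skip block and the idea that the database and Java order the ids differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipBlockSketch {

    // Six ids in unsigned order; the last three are negative as signed longs.
    static List<Long> sampleTable() {
        return Arrays.asList(1L, 2L, 0x7ff9e87cf87f4d35L,
                0x8000000000000000L, 0x9fd0d452b5d0426bL, 0xffffffffffffffffL);
    }

    // One batch: at most maxCount rows that follow lowId in the table's (unsigned) order.
    static List<Long> fetchBatch(List<Long> table, Long lowId, int maxCount, boolean applySkip) {
        List<Long> result = new ArrayList<>();
        int fetched = 0;
        for (long id : table) {
            // Stand-in for "WHERE NODE_ID > ? ORDER BY NODE_ID" evaluated by the database.
            if (lowId != null && Long.compareUnsigned(id, lowId) <= 0) {
                continue;
            }
            if (fetched++ >= maxCount) {
                break;
            }
            // Java-side skip modelled on the block quoted above, using a signed comparison.
            if (applySkip && lowId != null && Long.compare(id, lowId) <= 0) {
                continue;
            }
            result.add(id);
        }
        return result;
    }

    // Page through the table until a batch comes back empty; return how many ids were seen.
    static int countAll(List<Long> table, int maxCount, boolean applySkip) {
        int total = 0;
        Long lowId = null;
        while (true) {
            List<Long> batch = fetchBatch(table, lowId, maxCount, applySkip);
            if (batch.isEmpty()) {
                break;
            }
            total += batch.size();
            lowId = batch.get(batch.size() - 1);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("with skip block:    " + countAll(sampleTable(), 2, true));
        System.out.println("without skip block: " + countAll(sampleTable(), 2, false));
    }
}
```

In this model, paging with the skip block enabled stops early: once lowId is a large positive value, every remaining row looks "smaller or equal" to the signed comparison and is discarded, so the caller sees an empty batch before the table is exhausted. With the skip block disabled, all rows are returned.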

                


        

[jira] [Resolved] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Bart van der Schans (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bart van der Schans resolved JCR-3261.
--------------------------------------

    Resolution: Fixed
    


        

[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Julian Reschke (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232529#comment-13232529 ] 

Julian Reschke commented on JCR-3261:
-------------------------------------

> What is your use case, that is, why do you need getAllNodeIds? I'm just wondering if we really need this method... If we could remove it then we wouldn't have to deal with such problems. 

It's needed by the datastore garbage collector and the consistency checker.
                


        

[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Bart van der Schans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232616#comment-13232616 ] 

Bart van der Schans commented on JCR-3261:
------------------------------------------

Merged in 2.4 in r1302430.
                


        

[jira] [Updated] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Unico Hommes (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Unico Hommes updated JCR-3261:
------------------------------

    Attachment: bdpm_allids2.patch
    
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch, bdpm_allids2.patch
>
>
> When using MySQL:
> The problem arises when the method parameter maxcount is less than the total amount of records in the bundle table.
> First of all I found out that mysql orders the nodeid objects different than jackrabbit does. The following test describes this idea:
>     public void testMySQLOrderByNodeId() throws Exception {
>         NodeId nodeId1 = new NodeId("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
>         NodeId nodeId2 = new NodeId("9fd0d452-b5d0-426b-8a0f-bef830ba0495");
>         PreparedStatement stmt = connection.prepareStatement("SELECT NODE_ID FROM DEFAULT_BUNDLE WHERE NODE_ID = ? OR NODE_ID = ? ORDER BY NODE_ID");
>         stmt.setObject(1, nodeId1.getRawBytes());
>         stmt.setObject(2, nodeId2.getRawBytes());
>         ArrayList<NodeId> nodeIds = new ArrayList<NodeId>();
>         ResultSet resultSet = stmt.executeQuery();
>         // First print the ids in the order MySQL returns them (ORDER BY NODE_ID).
>         while (resultSet.next()) {
>             NodeId nodeId = new NodeId(resultSet.getBytes(1));
>             System.out.println(nodeId);
>             nodeIds.add(nodeId);
>         }
>         resultSet.close();
>         stmt.close();
>         // Then print the same ids in Jackrabbit's order (NodeId.compareTo).
>         Collections.sort(nodeIds);
>         for (NodeId nodeId : nodeIds) {
>             System.out.println(nodeId);
>         }
>     }
> Which results in the following output:
> 7ff9e87c-f87f-4d35-9d61-2e298e56ac37
> 9fd0d452-b5d0-426b-8a0f-bef830ba0495
> 9fd0d452-b5d0-426b-8a0f-bef830ba0495
> 7ff9e87c-f87f-4d35-9d61-2e298e56ac37
> The problem with the getAllNodeIds method is that it fetches an extra 10 records on top of maxcount (to avoid a problem where the first key is not the one that is wanted). Afterwards it skips a number of records again, this time using NodeId.compareTo. That comparison gives an unexpected result for MySQL because the code does not expect the MySQL ordering.
> In my case the bundle table held about 17000 records, but fetching the ids a thousand records at a time returned only about 8000 records in total.
> With Derby there is an infinite loop when the method is used the way ConsistencyCheckerImpl, for instance, uses it. The cause is the bundleSelectAllIdsFromSQL query used for SM_LONGLONG_KEYS, the storage model used with Derby. Compare that statement with the corresponding statement for SM_BINARY_KEYS: the former uses >= while the latter uses only >. The latter is correct. We want only records that are greater than the passed-in parameter, not greater than or equal to it.
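
The ordering mismatch reported for MySQL can be reproduced without a database. The sketch below is only an illustration under two assumptions that the issue text does not state: that Jackrabbit's NodeId compares its 64-bit halves as signed long values, and that MySQL compares the binary NODE_ID column byte by byte, which is equivalent to an unsigned comparison. The class NodeIdOrderingSketch and its methods are hypothetical names, and java.util.UUID stands in for NodeId.

```java
import java.util.UUID;

public class NodeIdOrderingSketch {

    /** Compares the most significant 64 bits as signed longs (assumed Jackrabbit behaviour). */
    static int signedCompare(UUID a, UUID b) {
        return Long.compare(a.getMostSignificantBits(), b.getMostSignificantBits());
    }

    /** Compares the most significant 64 bits as unsigned values, like a byte-wise comparison. */
    static int unsignedCompare(UUID a, UUID b) {
        return Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
    }

    public static void main(String[] args) {
        // The two ids from the test in the issue description.
        // Their most significant halves differ, so comparing only that half is enough here.
        UUID id1 = UUID.fromString("7ff9e87c-f87f-4d35-9d61-2e298e56ac37");
        UUID id2 = UUID.fromString("9fd0d452-b5d0-426b-8a0f-bef830ba0495");

        System.out.println("signed:   " + Integer.signum(signedCompare(id1, id2)));
        System.out.println("unsigned: " + Integer.signum(unsignedCompare(id1, id2)));
    }
}
```

Under these assumptions the two comparisons disagree for this pair, because the id starting with 9f has its top bit set and is therefore negative as a signed long. That matches the reported output, where the database and Collections.sort return the ids in opposite orders.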


        

[jira] [Updated] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Unico Hommes (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Unico Hommes updated JCR-3261:
------------------------------

    Attachment:     (was: bdpm_allids2.patch)
    
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch
>
>


        

[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Julian Reschke (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232598#comment-13232598 ] 

Julian Reschke commented on JCR-3261:
-------------------------------------

> OK, I see. For data store garbage collection, you could live without the method, as the persistence manager scan is optional there. But the consistency checker needs it, and changing the consistency checker to do a regular node traversal would probably be quite a lot of work.

It would also defeat part of the purpose of the consistency checker (finding orphaned nodes). 
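
For context on how a caller such as the consistency checker pages through ids in batches, here is a minimal sketch of keyset paging over an in-memory sorted set. It is not Jackrabbit code: KeysetPagingSketch, fetch and countFetches are hypothetical names, the TreeSet stands in for the bundle table, and the assumption that the caller stops on the first empty batch is mine. It shows why the lower bound in the query must be strict (>) rather than inclusive (>=).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class KeysetPagingSketch {

    /** Returns up to max keys above the bound (or from the start if after is null), ascending. */
    static List<Long> fetch(TreeSet<Long> table, Long after, int max, boolean inclusive) {
        Iterable<Long> candidates = (after == null) ? table : table.tailSet(after, inclusive);
        List<Long> page = new ArrayList<>();
        for (Long key : candidates) {
            if (page.size() == max) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    /** Pages until an empty batch arrives; returns the number of fetches, or -1 if cap is hit. */
    static int countFetches(TreeSet<Long> table, int max, boolean inclusive, int cap) {
        Long after = null;
        for (int fetches = 1; fetches <= cap; fetches++) {
            List<Long> page = fetch(table, after, max, inclusive);
            if (page.isEmpty()) {
                return fetches;
            }
            // Continue from the last key seen in this batch.
            after = page.get(page.size() - 1);
        }
        return -1; // never saw an empty batch: the loop would not terminate on its own
    }

    public static void main(String[] args) {
        TreeSet<Long> table = new TreeSet<>();
        for (long i = 1; i <= 10; i++) {
            table.add(i);
        }
        System.out.println("strict bound:    " + countFetches(table, 4, false, 100));
        System.out.println("inclusive bound: " + countFetches(table, 4, true, 100));
    }
}
```

With the inclusive bound the last key is returned again on every fetch, so the batch is never empty and the loop only ends because of the cap.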
                
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>            Assignee: Bart van der Schans
>             Fix For: 2.4.1, 2.6
>
>         Attachments: bdbpm_allids.patch
>
>


        

[jira] [Commented] (JCR-3261) Problems with BundleDbPersistenceManager getAllNodeIds

Posted by "Thomas Mueller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232498#comment-13232498 ] 

Thomas Mueller commented on JCR-3261:
-------------------------------------

What is your use case, that is, why do you need getAllNodeIds? I'm just wondering if we really need this method... If we could remove it then we wouldn't have to deal with such problems.
                
> Problems with BundleDbPersistenceManager getAllNodeIds
> ------------------------------------------------------
>
>                 Key: JCR-3261
>                 URL: https://issues.apache.org/jira/browse/JCR-3261
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.4
>            Reporter: Unico Hommes
>             Fix For: 2.4.1
>
>         Attachments: bdbpm_allids.patch
>
>
