Posted to users@jackrabbit.apache.org by Mark Moales <ma...@ansys.com> on 2016/02/05 23:12:57 UTC

Performance using FileDataStore/MySql vs. MongoDB in Oak

Hi,

We are currently using Jackrabbit 2 in a production application and are
looking to move to Oak. In our Jackrabbit 2 application, we are using
MySql, BundleDbPersistenceManager, and a FileDataStore. When using a
similar configuration with Oak, we are noticing significant performance
issues when removing nodes from the repository. To show the performance
issue, I have created a simple example that creates a single parent node of
nt:unstructured type and then adds 150 more child nodes of the same type.
Each node, including the parent, has 10 properties and has a
mix:referenceable mixin.  Here is how I create the DocumentNodeStore:

MySql:

    static FileStore store;
    static FileDataStore fds;

    static void initFileStore() throws IOException {
        File testDir = new File("F:\\oak-datastore");
        FileDataStore _fds = new OakFileDataStore();
        _fds.init(testDir.getPath());
        FileStore.Builder fileStoreBuilder = FileStore.newFileStore(testDir)
            .withBlobStore(new DataStoreBlobStore(_fds))
            .withMaxFileSize(256)
            .withCacheSize(64)
            .withMemoryMapping(false);
        store = fileStoreBuilder.create();
        fds = _fds;
    }

    static DocumentNodeStore createDocumentNodeStoreMySql() {
        System.out.println("Using MySql/FileDataStore");
        String url = "jdbc:mysql://localhost:3306/oak";
        String userName = "root";
        String password = "password";
        String driver = "org.mariadb.jdbc.Driver";
        String prefix = "T" + UUID.randomUUID().toString().replace("-", "");
        DataSource ds = RDBDataSourceFactory.forJdbcUrl(
            url, userName, password, driver);
        RDBOptions options = new RDBOptions()
            .tablePrefix(prefix)
            .dropTablesOnClose(true);
        DocumentNodeStore ns = new DocumentMK.Builder()
            .setAsyncDelay(0)
            .setBlobStore(new DataStoreBlobStore(fds))
            .setClusterId(1)
            .setRDBConnection(ds, options)
            .getNodeStore();
        return ns;
    }

MongoDB:

    static DocumentNodeStore createDocumentNodeStoreMongo() {
        System.out.println("Using MongoDB");
        DB db = new MongoClient("127.0.0.1", 27017).getDB("oak");
        DocumentNodeStore ns = new DocumentMK.Builder()
            .setMongoDB(db)
            .getNodeStore();
        return ns;
    }

And here is how I create the nodes:

    static Node createNodes(Session session) throws RepositoryException {
        Node root = session.getRootNode();
        Date start = new Date();
        Node parent = root.addNode("Test", "nt:unstructured");
        totalNodes++;
        //
        // Commenting out the following line speeds up delete.
        //
        parent.addMixin("mix:referenceable");

        for (int p = 0; p < 10; p++) {
            parent.setProperty("count" + p, p);
        }
        for (int i = 0; i < 150; i++) {
            createChild(parent, i);
        }
        session.save();
        Date end = new Date();
        double seconds = (double) (end.getTime() - start.getTime()) / 1000.00;
        System.out.println(
            "Created " + totalNodes + " nodes in " + seconds + " seconds");
        return parent;
    }

    static void createChild(Node parent, int index)
        throws RepositoryException {
        Node child = parent.addNode("Test" + index, "nt:unstructured");
        totalNodes++;
        //
        // Commenting out the following line speeds up delete.
        //
        child.addMixin("mix:referenceable");

        for (int p = 0; p < 10; p++) {
            child.setProperty("count" + p, p);
        }
    }

Finally, here is how I remove the parent node:

    static void deleteNodes(Session session, Node parent)
        throws RepositoryException {
        Date start = new Date();
        parent.remove();
        session.save();
        Date end = new Date();
        double seconds = (double) (end.getTime() - start.getTime()) / 1000.00;
        System.out.println(
            "Deleted " + totalNodes + " nodes in " + seconds + " seconds");
    }
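
The timings above are taken by subtracting two Date values. A minimal sketch of the same measurement using System.nanoTime(), which is monotonic and better suited to measuring elapsed time, might look like the following. The Timing class and the ThrowingTask interface are hypothetical names introduced only for this illustration, and the sleep is a stand-in for the real workload (parent.remove() followed by session.save()).

```java
final class Timing {
    /** Hypothetical helper type: a task that may throw checked exceptions. */
    interface ThrowingTask {
        void run() throws Exception;
    }

    /** Runs the task once and returns the elapsed time in seconds. */
    static double timeSeconds(ThrowingTask task) throws Exception {
        long start = System.nanoTime(); // monotonic; unaffected by clock changes
        task.run();
        return (System.nanoTime() - start) / 1_000_000_000.0;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload. In the test above this would be the JCR calls
        // parent.remove() and session.save().
        double seconds = timeSeconds(() -> Thread.sleep(50));
        System.out.println("Took " + seconds + " seconds");
    }
}
```

A single run like this still includes JVM warm-up, so repeating the measurement a few times would give a steadier number.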

When I run using MongoDB:

Using MongoDB
Created 151 nodes in 0.31 seconds
Deleted 151 nodes in 0.341 seconds

When I run using MySql/FileDataStore:

Using MySql/FileDataStore
Created 151 nodes in 0.391 seconds
Deleted 151 nodes in 10.946 seconds

As you can see, deleting the nodes is quite slow using MySql/FileDataStore.
Using Jackrabbit 2, a similar type of operation using MySql and file data
store takes approximately 1 second. Can anyone shed any light on why it is
taking so long to remove the nodes? Is there something I should be doing
differently when creating the DocumentNodeStore?

Thanks,

Mark

Re: Performance using FileDataStore/MySql vs. MongoDB in Oak

Posted by Clay Ferguson <wc...@gmail.com>.
Mark,
Right off the top of my head, I'd guess MySQL is somehow doing 151 separate
commits. MongoDB will definitely be orders of magnitude faster than MySQL on
large numbers of node operations, because it uses lazy, delayed reads and
writes rather than the ACID commits that RDBMSes do.
There may be a setting somewhere to make MySQL do batch operations
rather than individual commits. It may even be something as simple as
turning off autocommit at the JDBC layer.
Check whether there is some trace-level logging you can turn on, using
log4j options targeting the JDBC classes, and see if you can get it to log
all the SQL commits/transactions. I moved from MySQL to Mongo months
ago, or else I'd be of more help.
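
The autocommit guess can be illustrated with plain JDBC. This is only a sketch of the general technique: autocommit is switched off, the deletes are queued as one batch, and a single commit covers all of them. It does not show how Oak's RDB document store issues its statements, and whether per-row commits are actually the cause of the slow delete is not established in this thread. The class name BatchDelete, the table name EXAMPLE_NODES, and the ID column are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchDelete {
    /** Builds the parameterized DELETE; the table name must come from trusted code. */
    static String deleteSql(String table) {
        return "DELETE FROM " + table + " WHERE ID = ?";
    }

    /** Deletes all ids in one batch and one commit; returns the batch size. */
    static int deleteInOneTransaction(Connection conn, String table, List<String> ids)
            throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false); // otherwise each statement commits on its own
        try (PreparedStatement ps = conn.prepareStatement(deleteSql(table))) {
            for (String id : ids) {
                ps.setString(1, id);
                ps.addBatch(); // queue the delete instead of executing it now
            }
            int[] counts = ps.executeBatch();
            conn.commit(); // one commit for the whole batch
            return counts.length;
        } catch (SQLException e) {
            conn.rollback(); // undo the partial batch
            throw e;
        } finally {
            conn.setAutoCommit(previous); // restore the caller's setting
        }
    }

    public static void main(String[] args) {
        // Prints the statement that would be prepared for the hypothetical table.
        System.out.println(deleteSql("EXAMPLE_NODES"));
    }
}
```

With a real connection, the call would be along the lines of BatchDelete.deleteInOneTransaction(conn, "EXAMPLE_NODES", ids), where conn comes from the application's own DataSource.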



Best regards,
Clay Ferguson
wclayf@gmail.com

