Posted to users@sling.apache.org by Christoph Thodte <ch...@ht-solutions.de> on 2016/10/28 14:55:07 UTC

Creating lots of nodes

Hello!

What is the best and fastest way to create a large number of resources in Sling? I am importing 200,000 data rows into the JCR. My importer is very fast for the first 30,000 nodes, then it slows down dramatically. I commit my ResourceResolver every 100 resources. The commits themselves are fine, but the time needed to create the resources increases rapidly. After 40,000 nodes it takes around 20 minutes to create 100 nodes.

What is the problem, and how can I speed this up? Can anyone help or explain this?
I use MongoDB as the data store. With tar it is slower than with MongoDB. I use the Sling API, not the JCR API. Could that be the problem?

Christoph



Re: Creating lots of nodes

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi,

On Fri, Oct 28, 2016 at 4:55 PM, Christoph Thodte
<ch...@ht-solutions.de> wrote:
> ...My importer is very fast for the first 30,000 nodes, then it slows down dramatically....

Sling is unlikely to be the cause of this. By far the best way to
find out is to profile the import, or at least to look at DEBUG log
messages, to get a feel for what is slow.

-Bertrand
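
As a lightweight first step before attaching a profiler, the per-batch
timing can be recorded so that the point where the import slows down
becomes visible. The following is only a minimal sketch in plain Java:
`BatchTimer`, `timeBatches` and the `createOne` callback are
hypothetical names, and in a real importer the callback would wrap the
resource creation (and the periodic commit) that is being measured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

/** Times an import in fixed-size batches so a slowdown shows up as growing batch durations. */
final class BatchTimer {

    /**
     * Calls createOne for every row index in [0, totalRows) and returns the
     * elapsed time of each batch in milliseconds. The last batch may be partial.
     */
    static List<Long> timeBatches(int totalRows, int batchSize, IntConsumer createOne) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Long> batchMillis = new ArrayList<>();
        long batchStart = System.nanoTime();
        for (int row = 0; row < totalRows; row++) {
            createOne.accept(row); // assumption: creates one resource for this row
            boolean batchFull = (row + 1) % batchSize == 0;
            boolean lastRow = row == totalRows - 1;
            if (batchFull || lastRow) {
                long now = System.nanoTime();
                batchMillis.add((now - batchStart) / 1_000_000);
                batchStart = now;
            }
        }
        return batchMillis;
    }

    public static void main(String[] args) {
        // Placeholder work only; a real importer would create a resource here.
        List<Long> timings = timeBatches(1_000, 100, row -> { });
        for (int i = 0; i < timings.size(); i++) {
            System.out.println("batch " + i + ": " + timings.get(i) + " ms");
        }
    }
}
```

If the batch durations grow steadily with the number of nodes already
created, that points at the repository structure or indexing rather
than at the commit itself. A profiler is still needed to see where the
time actually goes.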

Re: Creating lots of nodes

Posted by Jason E Bailey <ja...@24601.org>.
Additionally, if you are creating child nodes, you want their parent to
use an unordered structure. With an ordered parent I could see a
significant impact after a while.

Also, indexing. If you've got indexing going on that covers what
you're inserting, that may have an impact as well.

--
Jason

On Sat, Oct 29, 2016, at 02:27 PM, Steven Walters wrote:
> On Fri, Oct 28, 2016 at 11:55 PM, Christoph Thodte
> <ch...@ht-solutions.de> wrote:
> > Hello!
> >
> > What is the best and fastest way to create a large number of resources in Sling? I am importing 200,000 data rows into the JCR. My importer is very fast for the first 30,000 nodes, then it slows down dramatically. I commit my ResourceResolver every 100 resources. The commits themselves are fine, but the time needed to create the resources increases rapidly. After 40,000 nodes it takes around 20 minutes to create 100 nodes.
> >
> > What is the problem, and how can I speed this up? Can anyone help or explain this?
> > I use MongoDB as the data store. With tar it is slower than with MongoDB. I use the Sling API, not the JCR API. Could that be the problem?
> 
> I've not seen any particular performance difference in the past
> between using the Sling API vs the JCR API for massive data creation
> like this.
> 
> Can you elaborate a bit more on how you're organizing the data that
> you're creating within Sling?
> 
> That is, in the past there have been known performance problems with
> having a large number of direct child nodes/resources under a
> single parent within the JCR.
> So just wondering how you're structuring the data as you're creating
> it within Sling.
> Without such information, it's mostly grabbing at straws to guess what
> your problem may be.

Re: Creating lots of nodes

Posted by Steven Walters <ke...@gmail.com>.
On Fri, Oct 28, 2016 at 11:55 PM, Christoph Thodte
<ch...@ht-solutions.de> wrote:
> Hello!
>
> What is the best and fastest way to create a large number of resources in Sling? I am importing 200,000 data rows into the JCR. My importer is very fast for the first 30,000 nodes, then it slows down dramatically. I commit my ResourceResolver every 100 resources. The commits themselves are fine, but the time needed to create the resources increases rapidly. After 40,000 nodes it takes around 20 minutes to create 100 nodes.
>
> What is the problem, and how can I speed this up? Can anyone help or explain this?
> I use MongoDB as the data store. With tar it is slower than with MongoDB. I use the Sling API, not the JCR API. Could that be the problem?

I've not seen any particular performance difference in the past
between using the Sling API vs the JCR API for massive data creation
like this.

Can you elaborate a bit more on how you're organizing the data that
you're creating within Sling?

That is, in the past there have been known performance problems with
having a large number of direct child nodes/resources under a
single parent within the JCR.
So just wondering how you're structuring the data as you're creating
it within Sling.
Without such information, it's mostly grabbing at straws to guess what
your problem may be.
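
If the import does turn out to put all rows directly under a single
parent, one common way to avoid that is to spread them over
intermediate "bucket" folders derived from a hash of the node name.
The sketch below is plain Java and only computes the target path. The
`BucketedPaths` class, the `bucketedPath` method and the
`/content/import` root are assumptions made for illustration, not
something from the original poster's importer.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Computes hashed two-level bucket paths so that no single parent gets all children. */
final class BucketedPaths {

    /**
     * Returns root/xx/yy/name, where xx and yy are the first two bytes of the
     * SHA-256 hash of name, each written as two lowercase hex digits.
     */
    static String bucketedPath(String root, String name) {
        byte[] hash;
        try {
            hash = MessageDigest.getInstance("SHA-256")
                    .digest(name.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException("SHA-256 not available", e);
        }
        String level1 = String.format("%02x", hash[0] & 0xff);
        String level2 = String.format("%02x", hash[1] & 0xff);
        return root + "/" + level1 + "/" + level2 + "/" + name;
    }

    public static void main(String[] args) {
        // Hypothetical root path and row names, for illustration only.
        for (int i = 0; i < 5; i++) {
            System.out.println(bucketedPath("/content/import", "row-" + i));
        }
    }
}
```

With two hex levels there are at most 256 children under the root and
under each first-level folder, and 200,000 rows average out to roughly
three per leaf folder. The intermediate folders still have to be
created before the rows themselves, and whether this helps at all
depends on how the data is actually structured today.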