Posted to dev@jackrabbit.apache.org by Wolfgang Gehner <wg...@infonoia.com> on 2004/11/10 17:41:46 UTC

Multirow update/insert/delete issue

As discussed with David offline, when 1000 nodes are inserted, in the current implementation the PersistenceMgr.store() method
is called 1000 times. So the XMLPersistenceMgr takes 30 seconds to do those 1000 write operations. A JDBC implementation of the current PersistenceMgr API is "condemned" to do the same thing. We'd really like a way to bundle those 1000 writes into one "transaction", so we can take 2-3 seconds on a relational database rather than 30.

So we'd like to throw into the discussion the following thoughts:
- how about maintaining an instance of PersistenceMgr (pm) not on (Persistent)NodeState but on NodeImpl
- have the implementation of node.save() collect which nodes (incl. children) to save and call persistenceMgr.store(
nodesToUpdate, nodesToInsert, nodesToDelete) just once. That way the pm could bundle operations in line with the
repository requirements.

This would make Jackrabbit's persistence model follow the DAO (data access object) pattern as we understand it. 
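A minimal sketch of what such a bulk store(nodesToUpdate, nodesToInsert, nodesToDelete) call might look like, with Python/sqlite3 standing in for a JDBC persistence manager (the BulkPersistenceMgr class, its single-table schema, and the dict-shaped node states are hypothetical illustrations, not Jackrabbit API):

```python
import sqlite3

class BulkPersistenceMgr:
    """Hypothetical DAO-style persistence manager: one store() call,
    one transaction, no matter how many nodes are touched."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS node (uuid TEXT PRIMARY KEY, name TEXT)")

    def store(self, nodes_to_update, nodes_to_insert, nodes_to_delete):
        # A single transaction wraps all three kinds of operation:
        # either every row is persisted or none is.
        with self.conn:  # BEGIN ... COMMIT (ROLLBACK on error)
            self.conn.executemany(
                "UPDATE node SET name = ? WHERE uuid = ?",
                [(n["name"], n["uuid"]) for n in nodes_to_update])
            self.conn.executemany(
                "INSERT INTO node (uuid, name) VALUES (?, ?)",
                [(n["uuid"], n["name"]) for n in nodes_to_insert])
            self.conn.executemany(
                "DELETE FROM node WHERE uuid = ?",
                [(n["uuid"],) for n in nodes_to_delete])

pm = BulkPersistenceMgr(sqlite3.connect(":memory:"))
pm.store([], [{"uuid": str(i), "name": "node%d" % i} for i in range(1000)], [])
```

node.save() would gather the three change sets from the transient state and hand them over in this one call, leaving the batching strategy to the persistence manager.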

We would be pleased to elaborate and discuss, and to share our JDBC PersistenceMgr prototype with anyone interested (it passes the current api unit test, but has a very non-optimized ER design and suffers from the issue discussed in this message).

Best regards,

Infonoia S.A.
rue de Berne 7
1201 Geneva
Tel: +41 22 9000 009
Fax: +41 22 9000 018
wgehner@infonoia.com
http://www.infonoia.com

Re: Multirow update/insert/delete issue

Posted by Wolfgang Gehner <wg...@infonoia.com>.
That's great!

Best regards,

Wolfgang


Re: Multirow update/insert/delete issue

Posted by Stefan Guggisberg <st...@gmail.com>.
On Fri, 12 Nov 2004 07:29:05 +0100, Wolfgang Gehner
<wg...@infonoia.com> wrote:
> Maybe we talk about the same thing in different ways?
> 
> So you do
> 
> dbtransaction.begin()
> insert ... (one row)
> dbtransaction.commit()
> dbtransaction.begin()
> insert ... (one row)
> dbtransaction.commit()
> 
> a thousand times?
> 
> We want to do
> dbtransaction.begin()
> insert .. (one row)
> insert .. (one row)
> insert .. (one row)
> etc..
> dbtransaction.commit()
> ... which I hope you will concede would be more efficient, and
> where we can do a thousand in no time at all, pretty much no matter what the
> underlying database. BTW, what's your configuration?

i tested with hsqldb, auto-commit turned on.

> 
> Of course a user might also *want* to ensure that either all operations
> succeed or none.

the transaction support currently in jackrabbit does not depend on
a persistence manager being transactional.

> 
> ...and we wonder how we can realize this within the current
> PersistenceMgr api, and thought you might have an idea. A
> persistenceMgr.store(nodesToUpdate, nodesToInsert, nodesToDelete) would be
> useful for us, but we were also thinking of consuming a save() event so we
> know when to commit.
> 

we are thinking of changing the persistence manager interface to
enable/help implementors use as much of jackrabbit's code as possible
on top of their own persistence data model. this has a lot of implications
and requires a partial redesign of the current implementation (e.g. the
transaction support is affected), not just adding a bulk persist method
to the PersistenceManager interface.

the ease of adapting arbitrary legacy data models hasn't been a design goal
when i started the implementation but i agree that it is certainly a good thing
(as long as it doesn't compromise/limit jackrabbit's current functionality).

there will probably be a method similar to the one you suggested and i'll keep you
posted on the progress of the redesign.  is this ok with you?

cheers
stefan


Re: Multirow update/insert/delete issue

Posted by Wolfgang Gehner <wg...@infonoia.com>.
Maybe we talk about the same thing in different ways?

So you do

dbtransaction.begin()
insert ... (one row)
dbtransaction.commit()
dbtransaction.begin()
insert ... (one row)
dbtransaction.commit()

a thousand times?

We want to do
dbtransaction.begin()
insert .. (one row)
insert .. (one row)
insert .. (one row)
etc..
dbtransaction.commit()
... which I hope you will concede would be more efficient, and
where we can do a thousand in no time at all, pretty much no matter what the
underlying database. BTW, what's your configuration?
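The difference between the two patterns, sketched with Python's sqlite3 standing in for a JDBC connection (the node table is made up for illustration; the point carries over to any rdbms, where the single-commit variant typically pays one log flush instead of a thousand):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY)")

# Pattern 1: one transaction per insert -- what per-row auto-commit does.
for i in range(1000):
    conn.execute("BEGIN")
    conn.execute("INSERT INTO node (id) VALUES (?)", (i,))
    conn.execute("COMMIT")

# Pattern 2: one transaction around all inserts -- a single commit,
# and all-or-nothing semantics come for free.
conn.execute("BEGIN")
for i in range(1000, 2000):
    conn.execute("INSERT INTO node (id) VALUES (?)", (i,))
conn.execute("COMMIT")
```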

Of course a user might also *want* to ensure that either all operations
succeed or none.

...and we wonder how we can realize this within the current
PersistenceMgr api, and thought you might have an idea. A
persistenceMgr.store(nodesToUpdate, nodesToInsert, nodesToDelete) would be
useful for us, but we were also thinking of consuming a save() event so we
know when to commit.


Wolfgang


Re: Multirow update/insert/delete issue

Posted by Stefan Guggisberg <st...@gmail.com>.
On Thu, 11 Nov 2004 12:32:27 +0100, Wolfgang Gehner
<wg...@infonoia.com> wrote:
> We're fully aware of the good benchmarks when not using LocalFileSystem.
> "3. Object with LocalFileSystem, not surprisingly either, showed the worst
>    performance: ca. 30 sec./1000 nodes"
> 
> So there is no criticism implied or intended whatsoever.
> I've just taken the analogy that writing to a db is like writing a thousand
> files *when it's done one by one*.

sorry, i still don't buy this. the jdbc based persistence manager i hacked 
together is just doing that: if 1000 nodes are added and saved in one call, 
it is inserting 1000 node records plus 1000 property records *one by one*.
i ran the test and it averaged at 3 - 3.5 sec./1000 nodes. in fact it came
close to the best results that i got with the b-tree based persistence 
managers.
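That setup can be sketched roughly as follows in Python/sqlite3 (the two-table node/property schema is only a guess at what such a "primitive" schema might look like, not the actual test code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (uuid TEXT PRIMARY KEY, parent TEXT, name TEXT)")
conn.execute("CREATE TABLE property (node_uuid TEXT, name TEXT, value TEXT)")

# 1000 nodes added and saved in one call: every node record and its
# property record is inserted *one by one* -- 2000 INSERTs in total.
for i in range(1000):
    uuid = "uuid-%04d" % i
    conn.execute("INSERT INTO node VALUES (?, ?, ?)",
                 (uuid, "root", "node%d" % i))
    conn.execute("INSERT INTO property VALUES (?, ?, ?)",
                 (uuid, "jcr:primaryType", "nt:unstructured"))
conn.commit()
```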


> 
> We are new to the Jackrabbit api and wonder how we can wrap multiple node
> writes, inserts, or deletes in one db transaction with the current
> PersistenceMgr API. When we can do that, performance will be no issue. We
> might have the PersistenceMgr listen to an event emitted by node.save(), and
> persist only then? What do you think?

the bad performance you are experiencing is imo caused by the data
model of your underlying persistence layer, not by the current implementation 
of jackrabbit. if you send me the schema that you are using for 
persisting nodes and properties in a rdbms, i will have a look at it.

> 
> Would you like to look at our code as is?

sure. 

regards
stefan


Re: Multirow update/insert/delete issue

Posted by Wolfgang Gehner <wg...@infonoia.com>.
We're fully aware of the good benchmarks when not using LocalFileSystem.
"3. Object with LocalFileSystem, not surprisingly either, showed the worst
    performance: ca. 30 sec./1000 nodes"

So there is no criticism implied or intended whatsoever.
I've just taken the analogy that writing to a db is like writing a thousand
files *when it's done one by one*.

We are new to the Jackrabbit api and wonder how we can wrap multiple node
writes, inserts, or deletes in one db transaction with the current
PersistenceMgr API. When we can do that, performance will be no issue. We
might have the PersistenceMgr listen to an event emitted by node.save(), and
persist only then? What do you think?

Would you like to look at our code as is?

Stefan, we look forward to your recommendation.

Best regards,

Wolfgang



Re: Multirow update/insert/delete issue

Posted by Stefan Guggisberg <st...@gmail.com>.
a few comments/clarifications inline...

On Wed, 10 Nov 2004 17:41:46 +0100, Wolfgang Gehner
<wg...@infonoia.com> wrote:
> 
> As discussed with David offline, when 1000 nodes are inserted, in the current implementation the PersistenceMgr.store() method
> is called 1000 times. So the XMLPersistenceMgr takes 30 seconds to do those 1000 write operations. 

not quite correct: i said that the XML/ObjectPersistenceManager in
combination with a CQFileSystem takes ca. 5 sec. for adding and saving
1000 nodes (that's 2000 write operations, 1000 nodes + 1000 properties).

> A JDBC implementation of the current PersistenceMgr API is "condemned" to do the same thing. We'd really like a way to bundle those 1000 writes into one "transaction", so we can take 2-3 seconds on a relational database rather than 30.

again, a jdbc implementation is *not* condemned to take 30 sec.! 
i hacked a quick&dirty implementation of a jdbc persistence manager (with a very
*primitive* schema) that took less than 5 sec. for adding and saving 1000 nodes.
