Posted to dev@ofbiz.apache.org by "Mirko Vogelsmeier (JIRA)" <ji...@apache.org> on 2012/05/11 00:34:49 UTC

[jira] [Created] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Mirko Vogelsmeier created OFBIZ-4870:
----------------------------------------

             Summary: Multithreading in GenericDAO / Delegator
                 Key: OFBIZ-4870
                 URL: https://issues.apache.org/jira/browse/OFBIZ-4870
             Project: OFBiz
          Issue Type: Improvement
            Reporter: Mirko Vogelsmeier


Hey there,

Some time ago Adam committed some changes that brought in the first ideas for multi-threaded Delegator usage (r1139700).
Depending on how intensive the data usage is or how large the data set is, there are performance issues we cannot solve by scaling hardware and/or configuration, as there is just one Delegator object per datasource.
I wanted to check on the progress of this very helpful feature. Are there any further plans to work on this?

Greetings,
Mirko

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Posted by "Adrian Crum (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272860#comment-13272860 ] 

Adrian Crum commented on OFBIZ-4870:
------------------------------------

I don't understand how one Delegator object per data source is a performance issue. I believe the real issue is multi-threading.

A single object can scale well if it is designed properly.

                


        

Re: OFBiz multi-threading

Posted by Adrian Crum <ad...@sandglass-software.com>.
On 5/11/2012 1:28 AM, Adam Heath wrote:
> On 05/10/2012 06:58 PM, Adrian Crum wrote:
>> On 5/11/2012 12:45 AM, Adam Heath wrote:
>>> On 05/10/2012 06:40 PM, Adrian Crum wrote:
>>>> I wanted to discuss this again.
>>>>
>>>> Some time ago I modified my local copy of OFBiz to execute the ant
>>>> run-install task faster by using multi-threaded entity creation and
>>>> data loading. I don't remember what I was working on at the time, but
>>>> I needed the process to run much faster.
>>>>
>>>> Adam and I discussed my design and he said it sounded like it was
>>>> similar to SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/).
>>>> Actually, my approach was not that sophisticated, I just used a
>>>> consumer-provider design pattern based on a FIFO queue.
>>>>
>>>> Anyway, based on that conversation, Adam committed the multi-threaded
>>>> entity creation/data loading code
>>>> (http://mail-archives.apache.org/mod_mbox/ofbiz-commits/201106.mbox/%3C20110626025049.F302E2388B45@eris.apache.org%3E). 
>>>>
>>>>
>>>> The work Adam committed greatly improved the database creation/data
>>>> loading time.
>>>>
>>>> In this Jira issue, Adam mentions how transactions are tied to a
>>>> single thread. This is due to a fundamental weakness in OFBiz - the
>>>> use of ThreadLocal variables. In order to truly remove the bottlenecks
>>>> in OFBiz, we need to avoid the use of ThreadLocal variables - because
>>>> they prohibit handing off tasks to other threads. The Execution
>>>> Context proposed by David Jones some time ago is a step in the right
>>>> direction, but it uses ThreadLocal variables too.
>>> My statement has *nothing* to do with ThreadLocal(well, at least not
>>> used by ofbiz). javax.transaction is for the *current* thread. When
>>> an xml import is requested, it *must* be done in the current thread,
>>> in the current transaction. It's not possible to suspend the
>>> transaction for the foreground thread, and resume it in the
>>> background(I tried).
>>>
>>> I may not have explained myself fully in the jira issue, or maybe you
>>> didn't understand. In any case, the rest of your explanation seems to
>>> not apply now.
>>
>> I did not understand that the transaction-to-thread relationship was
>> required in the javax.transaction API. That is a real problem. It makes
>> me wonder how SEDA-style designs can work in transaction-based
>> applications.
>
> In this case, you also can't split each row in the xml into a separate
> thread (transaction deadlock), nor into batches in separate
> threads (postgresql has delayed foreign key resolution).
>
> In a more general sense, javax.transaction has resource callbacks that 
> happen at various transaction states.  Those have to run in the 
> foreground.
>
> In a more general sense, the foreground thread may have added
> callbacks to the general purpose container.  Said container needs to 
> be careful about which parts it puts into the background.  For 
> instance, UtilCache has support for listeners when things get 
> added/removed.  Those also should be run in the foreground.
>
> The only parts that can be pushed onto other work systems must be 
> *completely* controlled.  *No* abstract classes or interfaces, or
> otherwise the control code can't guarantee what is happening.  This 
> generalization is the same as the locking requirements talked about by 
> Java Concurrency in Practice.
>
> If you really want to improve threading in ofbiz, a good first start 
> would be removing 'synchronized' from places.  But that is tricky if 
> you haven't done it before.  I've got several commits locally that 
> remove synchronized from lots of places, but need time to finish them 
> off.

Btw, I wasn't trying to improve the threading, I was trying to improve 
responsiveness under heavy load. The feedback loop in SEDA prevents a 
user from encountering a request that seems to lock-up or timeout due to 
a busy server.
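
One simple reading of that feedback idea is a stage with a bounded queue that refuses new work immediately when it is full, so the caller can answer "busy" instead of hanging. The sketch below is only an illustration under that assumption; BoundedStageSketch and offer are hypothetical names, and neither SEDA nor OFBiz is claimed to work exactly this way.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a bounded processing stage; not an OFBiz or SEDA class.
class BoundedStageSketch {
    private final ThreadPoolExecutor pool;

    BoundedStageSketch(int threads, int queueCapacity) {
        // Fixed number of threads, bounded FIFO queue, and AbortPolicy so that
        // a full queue produces an immediate rejection instead of blocking.
        pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns true if the work was accepted, false if the stage is saturated.
    boolean offer(Runnable work) {
        try {
            pool.execute(work);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

A caller that gets false back can report the overload to the user right away, which is the behaviour the paragraph above is after.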

-Adrian


Re: OFBiz multi-threading

Posted by Adam Heath <do...@brainfood.com>.
On 05/10/2012 06:58 PM, Adrian Crum wrote:
> On 5/11/2012 12:45 AM, Adam Heath wrote:
>> On 05/10/2012 06:40 PM, Adrian Crum wrote:
>>> I wanted to discuss this again.
>>>
>>> Some time ago I modified my local copy of OFBiz to execute the ant
>>> run-install task faster by using multi-threaded entity creation and
>>> data loading. I don't remember what I was working on at the time, but
>>> I needed the process to run much faster.
>>>
>>> Adam and I discussed my design and he said it sounded like it was
>>> similar to SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/).
>>> Actually, my approach was not that sophisticated, I just used a
>>> consumer-provider design pattern based on a FIFO queue.
>>>
>>> Anyway, based on that conversation, Adam committed the multi-threaded
>>> entity creation/data loading code
>>> (http://mail-archives.apache.org/mod_mbox/ofbiz-commits/201106.mbox/%3C20110626025049.F302E2388B45@eris.apache.org%3E).
>>>
>>> The work Adam committed greatly improved the database creation/data
>>> loading time.
>>>
>>> In this Jira issue, Adam mentions how transactions are tied to a
>>> single thread. This is due to a fundamental weakness in OFBiz - the
>>> use of ThreadLocal variables. In order to truly remove the bottlenecks
>>> in OFBiz, we need to avoid the use of ThreadLocal variables - because
>>> they prohibit handing off tasks to other threads. The Execution
>>> Context proposed by David Jones some time ago is a step in the right
>>> direction, but it uses ThreadLocal variables too.
>> My statement has *nothing* to do with ThreadLocal(well, at least not
>> used by ofbiz). javax.transaction is for the *current* thread. When
>> an xml import is requested, it *must* be done in the current thread,
>> in the current transaction. It's not possible to suspend the
>> transaction for the foreground thread, and resume it in the
>> background(I tried).
>>
>> I may not have explained myself fully in the jira issue, or maybe you
>> didn't understand. In any case, the rest of your explanation seems to
>> not apply now.
>
> I did not understand that the transaction-to-thread relationship was
> required in the javax.transaction API. That is a real problem. It makes
> me wonder how SEDA-style designs can work in transaction-based
> applications.

In this case, you also can't split each row in the xml into a separate
thread (transaction deadlock), nor into batches in separate
threads (postgresql has delayed foreign key resolution).

In a more general sense, javax.transaction has resource callbacks that 
happen at various transaction states.  Those have to run in the foreground.

In a more general sense, the foreground thread may have added callbacks
to the general purpose container.  Said container needs to be careful 
about which parts it puts into the background.  For instance, UtilCache 
has support for listeners when things get added/removed.  Those also 
should be run in the foreground.

The only parts that can be pushed onto other work systems must be 
*completely* controlled.  *No* abstract classes or interfaces, or
otherwise the control code can't guarantee what is happening.  This 
generalization is the same as the locking requirements talked about by 
Java Concurrency in Practice.

If you really want to improve threading in ofbiz, a good first start 
would be removing 'synchronized' from places.  But that is tricky if you 
haven't done it before.  I've got several commits locally that remove 
synchronized from lots of places, but need time to finish them off.
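
To make "removing synchronized" concrete, here is a minimal sketch of one common case: a lazily filled registry guarded by a synchronized method, rewritten on top of ConcurrentHashMap. The class names are hypothetical, and this is not taken from the local commits mentioned above; it only shows the general shape of such a change.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; neither registry class exists in OFBiz.
class SynchronizedRemovalSketch {

    // Before: every caller, readers included, queues up on the same monitor.
    static class LockedRegistry {
        private final Map<String, Object> map = new HashMap<>();

        synchronized Object get(String name) {
            Object value = map.get(name);
            if (value == null) {
                value = new Object();
                map.put(name, value);
            }
            return value;
        }
    }

    // After: no method-level lock. computeIfAbsent performs the
    // check-then-create step atomically for each key.
    static class ConcurrentRegistry {
        private final ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();

        Object get(String name) {
            return map.computeIfAbsent(name, key -> new Object());
        }
    }
}
```

The tricky part is that the check and the insert must stay atomic; replacing the lock with a plain get followed by put would reintroduce a race, which is why the sketch uses computeIfAbsent.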

Re: OFBiz multi-threading

Posted by Adrian Crum <ad...@sandglass-software.com>.
On 5/11/2012 12:45 AM, Adam Heath wrote:
> On 05/10/2012 06:40 PM, Adrian Crum wrote:
>> I wanted to discuss this again.
>>
>> Some time ago I modified my local copy of OFBiz to execute the ant
>> run-install task faster by using multi-threaded entity creation and
>> data loading. I don't remember what I was working on at the time, but
>> I needed the process to run much faster.
>>
>> Adam and I discussed my design and he said it sounded like it was
>> similar to SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/).
>> Actually, my approach was not that sophisticated, I just used a
>> consumer-provider design pattern based on a FIFO queue.
>>
>> Anyway, based on that conversation, Adam committed the multi-threaded
>> entity creation/data loading code
>> (http://mail-archives.apache.org/mod_mbox/ofbiz-commits/201106.mbox/%3C20110626025049.F302E2388B45@eris.apache.org%3E).
>> The work Adam committed greatly improved the database creation/data
>> loading time.
>>
>> In this Jira issue, Adam mentions how transactions are tied to a
>> single thread. This is due to a fundamental weakness in OFBiz - the
>> use of ThreadLocal variables. In order to truly remove the bottlenecks
>> in OFBiz, we need to avoid the use of ThreadLocal variables - because
>> they prohibit handing off tasks to other threads. The Execution
>> Context proposed by David Jones some time ago is a step in the right
>> direction, but it uses ThreadLocal variables too.
> My statement has *nothing* to do with ThreadLocal(well, at least not
> used by ofbiz).  javax.transaction is for the *current* thread.  When
> an xml import is requested, it *must* be done in the current thread,
> in the current transaction.  It's not possible to suspend the
> transaction for the foreground thread, and resume it in the
> background(I tried).
>
> I may not have explained myself fully in the jira issue, or maybe you
> didn't understand.  In any case, the rest of your explanation seems to
> not apply now.

I did not understand that the transaction-to-thread relationship was 
required in the javax.transaction API. That is a real problem. It makes 
me wonder how SEDA-style designs can work in transaction-based applications.

-Adrian


Re: OFBiz multi-threading

Posted by Adam Heath <do...@brainfood.com>.
On 05/10/2012 06:40 PM, Adrian Crum wrote:
> I wanted to discuss this again.
> 
> Some time ago I modified my local copy of OFBiz to execute the ant
> run-install task faster by using multi-threaded entity creation and
> data loading. I don't remember what I was working on at the time, but
> I needed the process to run much faster.
> 
> Adam and I discussed my design and he said it sounded like it was
> similar to SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/).
> Actually, my approach was not that sophisticated, I just used a
> consumer-provider design pattern based on a FIFO queue.
> 
> Anyway, based on that conversation, Adam committed the multi-threaded
> entity creation/data loading code
> (http://mail-archives.apache.org/mod_mbox/ofbiz-commits/201106.mbox/%3C20110626025049.F302E2388B45@eris.apache.org%3E).
> The work Adam committed greatly improved the database creation/data
> loading time.
> 
> In this Jira issue, Adam mentions how transactions are tied to a
> single thread. This is due to a fundamental weakness in OFBiz - the
> use of ThreadLocal variables. In order to truly remove the bottlenecks
> in OFBiz, we need to avoid the use of ThreadLocal variables - because
> they prohibit handing off tasks to other threads. The Execution
> Context proposed by David Jones some time ago is a step in the right
> direction, but it uses ThreadLocal variables too.

My statement has *nothing* to do with ThreadLocal(well, at least not
used by ofbiz).  javax.transaction is for the *current* thread.  When
an xml import is requested, it *must* be done in the current thread,
in the current transaction.  It's not possible to suspend the
transaction for the foreground thread, and resume it in the
background(I tried).

I may not have explained myself fully in the jira issue, or maybe you
didn't understand.  In any case, the rest of your explanation seems to
not apply now.

> So, we really need an object that is passed around the framework that
> represents a TASK state, not a THREAD state - so that tasks can be
> handed off to multiple threads. Transactions are tasks, so transaction
> state would be contained in the object, not in ThreadLocal variables.
> I believe that approach would solve the issue here.

OFBiz multi-threading (was: [jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator)

Posted by Adrian Crum <ad...@sandglass-software.com>.
I wanted to discuss this again.

Some time ago I modified my local copy of OFBiz to execute the ant
run-install task faster by using multi-threaded entity creation and data
loading. I don't remember what I was working on at the time, but I
needed the process to run much faster.

Adam and I discussed my design and he said it sounded like it was 
similar to SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/). Actually, 
my approach was not that sophisticated, I just used a consumer-provider 
design pattern based on a FIFO queue.
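
A minimal sketch of that kind of FIFO-queue design (usually called producer-consumer) might look like the following. It assumes plain strings as work items and a fixed number of worker threads; FifoLoaderSketch and load are hypothetical names, and this is not the code that was modified locally or later committed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from OFBiz.
class FifoLoaderSketch {
    // Unique sentinel object, compared by identity, that tells a worker to stop.
    private static final String STOP = new String("stop");

    // Queues every item, lets `workers` threads drain the queue,
    // and returns a record of what was processed.
    static List<String> load(List<String> items, int workers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String item = queue.take();   // blocks until work arrives
                        if (item == STOP) {
                            return;
                        }
                        processed.add("loaded:" + item);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            threads.add(worker);
        }
        try {
            for (String item : items) {
                queue.put(item);
            }
            for (int i = 0; i < workers; i++) {
                queue.put(STOP);                      // one sentinel per worker
            }
            for (Thread worker : threads) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }
}
```

With more than one worker the order of the results is not guaranteed, only that every queued item is processed once.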

Anyway, based on that conversation, Adam committed the multi-threaded 
entity creation/data loading code 
(http://mail-archives.apache.org/mod_mbox/ofbiz-commits/201106.mbox/%3C20110626025049.F302E2388B45@eris.apache.org%3E). 
The work Adam committed greatly improved the database creation/data 
loading time.

In this Jira issue, Adam mentions how transactions are tied to a single 
thread. This is due to a fundamental weakness in OFBiz - the use of 
ThreadLocal variables. In order to truly remove the bottlenecks in 
OFBiz, we need to avoid the use of ThreadLocal variables - because they 
prohibit handing off tasks to other threads. The Execution Context 
proposed by David Jones some time ago is a step in the right direction, 
but it uses ThreadLocal variables too.
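
The hand-off problem with ThreadLocal can be shown in a few lines. This is a sketch with a hypothetical CURRENT_TX holder standing in for any per-thread framework state; it does not use a real transaction manager.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; CURRENT_TX is a stand-in, not an OFBiz field.
class ThreadLocalHandoffSketch {
    static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    // Sets a value on the calling thread, then returns what a worker thread sees.
    static String seenByWorker() {
        CURRENT_TX.set("tx-1");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> CURRENT_TX.get();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            CURRENT_TX.remove();
        }
    }
}
```

The worker gets null, because each thread has its own copy of a ThreadLocal and the value set by the caller never travels with the task.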

So, we really need an object that is passed around the framework that 
represents a TASK state, not a THREAD state - so that tasks can be 
handed off to multiple threads. Transactions are tasks, so transaction 
state would be contained in the object, not in ThreadLocal variables. I 
believe that approach would solve the issue here.
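
As a rough sketch of what such an object could look like, assuming a simple key/value store: TaskContext and TaskHandoffSketch are hypothetical names, not existing OFBiz classes, and the "transactionId" key is only an example. The sketch moves its own state between threads; it says nothing about whether a real transaction can be moved the same way.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical task-scoped state holder; OFBiz has no class with this name or shape.
final class TaskContext {
    private final Map<String, Object> state = new ConcurrentHashMap<>();

    void put(String key, Object value) {
        state.put(key, value);
    }

    Object get(String key) {
        return state.get(key);
    }
}

class TaskHandoffSketch {
    // Passes the context to a worker thread explicitly and reads from it there.
    static Object readOnWorker(TaskContext context, String key) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Object> task = () -> context.get(key);
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the state belongs to the object rather than to a thread, whichever thread receives the object can read it.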

-Adrian


On 5/10/2012 11:48 PM, Adam Heath (JIRA) wrote:
>      [ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272862#comment-13272862 ]
>
> Adam Heath commented on OFBIZ-4870:
> -----------------------------------
>
> There is multi-threading per-datasource during startup.  Multiple threads creating the tables/keys/etc.  Seed data(from xml) is still single-threaded.
>
> I have a series of changes to fix this locally, but it's *really* complex.  Since an xml import may already be part of an *existing* transaction, I absolutely *must* manipulate the database in the current thread.  The parsing, however, can be done in a background thread, in parallel.  Unfortunately, the xml parser in use is event based, in the wrong direction(push instead of pull), so the bulk of the change is flipping that around.  Tbh, the complexity wasn't really worth the speedup(just a few percentage points, afaicr).
>
> What threaded stuff are you still seeking?
>

[jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Posted by "Adam Heath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272862#comment-13272862 ] 

Adam Heath commented on OFBIZ-4870:
-----------------------------------

There is multi-threading per-datasource during startup.  Multiple threads creating the tables/keys/etc.  Seed data(from xml) is still single-threaded.

I have a series of changes to fix this locally, but it's *really* complex.  Since an xml import may already be part of an *existing* transaction, I absolutely *must* manipulate the database in the current thread.  The parsing, however, can be done in a background thread, in parallel.  Unfortunately, the xml parser in use is event based, in the wrong direction(push instead of pull), so the bulk of the change is flipping that around.  Tbh, the complexity wasn't really worth the speedup(just a few percentage points, afaicr).
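
The split described above (parse in the background, touch the database only on the calling thread) and the push-to-pull flip can be sketched with a bounded queue between a SAX handler and the caller. This is only an illustration under simplifying assumptions: every element with at least one attribute counts as a row, "writing" just collects strings, and PushToPullSketch is a hypothetical name. It is not the local change set mentioned above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; not OFBiz code.
class PushToPullSketch {
    // Unique end-of-input marker, compared by identity.
    private static final String END = new String("end");

    static List<String> importOnCallingThread(String xml) {
        BlockingQueue<String> rows = new ArrayBlockingQueue<>(16);
        // Background thread: the SAX parser pushes events, the handler queues rows.
        Thread parser = new Thread(() -> {
            try {
                SAXParserFactory.newInstance().newSAXParser().parse(
                        new InputSource(new StringReader(xml)),
                        new DefaultHandler() {
                            @Override
                            public void startElement(String uri, String localName,
                                                     String qName, Attributes atts) {
                                if (atts.getLength() == 0) {
                                    return; // e.g. the root element
                                }
                                try {
                                    rows.put(qName + ":" + atts.getValue(0));
                                } catch (InterruptedException e) {
                                    Thread.currentThread().interrupt();
                                }
                            }
                        });
            } catch (Exception e) {
                // A real importer would report the failure to the caller; the sketch just stops.
            } finally {
                try {
                    rows.put(END);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        parser.start();
        // Calling thread: pulls rows and does the "database" work itself,
        // so it stays inside whatever transaction the caller already has.
        List<String> written = new ArrayList<>();
        try {
            for (String row = rows.take(); row != END; row = rows.take()) {
                written.add(row);
            }
            parser.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return written;
    }
}
```

The queue is what turns the parser's push callbacks into something the calling thread can pull from at its own pace.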

What threaded stuff are you still seeking?
                


        

[jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Posted by "Adrian Crum (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272868#comment-13272868 ] 

Adrian Crum commented on OFBIZ-4870:
------------------------------------

Adam, I know your question is for the reporter, but I would like for us to discuss the SEDA approach again - but on the dev mailing list and not here.


                
