Posted to log4j-user@logging.apache.org by Tim Watts <ti...@cliftonfarm.org> on 2012/10/16 18:41:28 UTC

[OT] Re: logging parallel threads

First, remember this is the Log4J Users List, not a forum for working
out design issues related to storing data as XML.

But let's back up for a second.  I'm assuming that:
     1. you have a set of activities/tasks (possibly nested?) and you're
        interested in recording the state of these in an XML file (e.g.
        where is it in its life cycle? was it successful? details about
        the task parameters etc.)  
     2. you're NOT really interested in recording the detailed events
        produced by the activity while it executes.

Or maybe you need both?  Log4J WOULD be a good choice for #2 but not for
#1.  And if you need to store the logging output with the state data,
that's doable, but storing all this in a single XML file is probably not
the way to go.

If my original assumptions are correct, I can offer some general advice.
But I think you should then do further exploration on other forums.

I would say "XML => DOM => {domEdits} => XML" would be wildly
inefficient for a system that has to support frequent, parallel updates.
(Yes, DOM generally keeps the whole document in memory.)  Depending on
the scale of the system you could do "XML => DOM" at initialization,
keeping the model in memory, and then "{domEdits} => XML" on each
update.  But DOM itself is pretty inefficient for that scenario too.
You might want to consider using JAXB instead:

http://docs.oracle.com/javase/7/docs/technotes/guides/xml/jaxb/index.html

This assumes that your app is the only system that does updates to the
data.
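
To make the "keep the model in memory, write XML on each update" idea
concrete, here is a minimal sketch.  Note that it is NOT JAXB: to stay
within the plain JDK it uses ordinary Java objects plus the StAX
XMLStreamWriter as a stand-in.  The class name TaskStatusModel and the
<tasks>/<task> layout are assumptions made up for illustration.

```java
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical in-memory model of task states; all names are illustrative. */
final class TaskStatusModel {
    // Sorted, thread-safe map: parallel tasks can update it without extra
    // locking, and the XML output comes out in a stable order.
    private final Map<String, String> statusByTask = new ConcurrentSkipListMap<>();

    /** Records the latest status ("begin", "end", ...) for a task. */
    void update(String taskName, String status) {
        statusByTask.put(taskName, status);
    }

    /** Serializes the current state of the model to an XML string. */
    String toXml() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("tasks");
            for (Map.Entry<String, String> e : statusByTask.entrySet()) {
                xml.writeEmptyElement("task");
                xml.writeAttribute("name", e.getKey());
                xml.writeAttribute("status", e.getValue());
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException ex) {
            throw new IllegalStateException("could not serialize task state", ex);
        }
        return out.toString();
    }
}
```

The XML is only ever generated from the model, never parsed back on each
update, which is the point of the approach.  The map's iteration is weakly
consistent, so a snapshot taken during a burst of updates may lag slightly
behind.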

If you're talking about a very large system where holding everything in
memory would be prohibitive (e.g. the "activities" have to be kept
around long after they've completed) or where multiple systems need to
update the data, then you really need to consider using a database for
storage.  In that case, something like JPA might be appropriate.  See
http://openjpa.apache.org/ .  JPA doesn't provide a database per se but
provides object storage backed by a database.

Alternatively, you could use the filesystem as your database and persist
each "activity" to its own XML file using JAXB or DOM.
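
A rough sketch of the file-per-activity idea, using only java.nio.  The
class name ActivityFiles, the "<id>.xml" naming scheme and the helper
methods are my own assumptions, and the XML string is taken as given (it
could come from JAXB, DOM or anything else).  It writes to a temp file and
then renames it, so a reader should not see a half-written file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: one XML file per activity, replaced as a whole. */
final class ActivityFiles {

    /** Writes one activity's XML to dir/activityId.xml, replacing any older version. */
    static Path save(Path dir, String activityId, String xml) {
        try {
            Files.createDirectories(dir);
            Path target = dir.resolve(activityId + ".xml");
            // Write the new content next to the target first ...
            Path tmp = Files.createTempFile(dir, "activity-", ".tmp");
            Files.write(tmp, xml.getBytes(StandardCharsets.UTF_8));
            try {
                // ... then swap it in. Whether an existing target is replaced
                // by an atomic move is platform specific (it is on typical
                // POSIX file systems).
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback: a plain replace, which is not guaranteed atomic.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back the XML stored for one activity. */
    static String load(Path dir, String activityId) {
        try {
            byte[] bytes = Files.readAllBytes(dir.resolve(activityId + ".xml"));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since each task touches only its own file, parallel tasks don't contend
for one shared document.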

It's not clear to me why you're wedded to a single XML file as the
storage medium.  What consumes the XML?  What kind of operations do they
perform on it?  What would happen when a consumer starts reading the
file mid-update?  Maybe you've already thought that through but it seems
to me that a single XML file is not the best storage choice for a system
that has high concurrency requirements.

As for storing logging output with the state data, here are a couple of
approaches to consider:

Create a distinct Logger for each task configured with a WriterAppender
connected to a java.io.StringWriter.  Persist the (in memory) logging
output with the state data whenever the state object (Activity) is
updated.  Expensive operation!
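
Here is a sketch of the per-task, in-memory capture idea.  Big caveat: to
keep it runnable with nothing but the JDK it uses java.util.logging, NOT
log4j.  With log4j 1.2 you would attach a WriterAppender wrapping a
StringWriter to the task's Logger instead.  The class name TaskLogCapture
and the "LEVEL message" line format are assumptions for illustration.

```java
import java.io.StringWriter;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical handler that keeps one task's log output in memory. */
final class TaskLogCapture extends Handler {
    private final StringWriter buffer = new StringWriter();

    /** Attaches an in-memory handler to a logger dedicated to one task. */
    static TaskLogCapture attachTo(Logger taskLogger) {
        TaskLogCapture capture = new TaskLogCapture();
        taskLogger.setUseParentHandlers(false); // keep task output out of the shared log
        taskLogger.addHandler(capture);
        return capture;
    }

    @Override
    public synchronized void publish(LogRecord record) {
        if (isLoggable(record)) {
            // Raw message only; message parameters are not formatted in this sketch.
            buffer.write(record.getLevel() + " " + record.getMessage() + System.lineSeparator());
        }
    }

    @Override
    public void flush() {
        // Nothing to flush: everything lives in memory.
    }

    @Override
    public void close() {
        // Nothing to release.
    }

    /** Everything logged so far, ready to be stored with the task's state object. */
    synchronized String captured() {
        return buffer.toString();
    }
}
```

A task would call something like
TaskLogCapture.attachTo(Logger.getLogger("task.a")) before it starts and
copy captured() into its state object whenever that object is persisted.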

Write the logging output to a unique file for each task.  Then either
reference the file in the state data or suck the whole log into the
state object once the task is complete.

If all of this is getting too complicated you could always just use a
simple log file and build the XML file by reading the log.  For example,
if the XML is just for reporting then whenever a report is requested:
        if logFile is newer than xmlFile
                rebuildXmlFile
        generateReport
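
In Java the "rebuild only when the log is newer" check could look roughly
like the sketch below.  The class name ReportCache and the rebuild
callback are hypothetical; how log lines are turned into XML is left
entirely to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper that regenerates an XML file from a log on demand. */
final class ReportCache {

    /** True when the XML must be rebuilt: it is missing or older than the log. */
    static boolean isStale(Path logFile, Path xmlFile) {
        try {
            if (!Files.exists(xmlFile)) {
                return true;
            }
            return Files.getLastModifiedTime(logFile)
                    .compareTo(Files.getLastModifiedTime(xmlFile)) > 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Rebuilds the XML file from the log lines if it is stale.
     * Returns true when a rebuild actually happened.
     */
    static boolean refreshIfStale(Path logFile, Path xmlFile,
                                  Function<List<String>, String> rebuild) {
        if (!isStale(logFile, xmlFile)) {
            return false;
        }
        try {
            List<String> lines = Files.readAllLines(logFile, StandardCharsets.UTF_8);
            Files.write(xmlFile, rebuild.apply(lines).getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The report generator would call refreshIfStale(...) first and then read
the XML file as usual.  File timestamps can be coarse on some file
systems, so treat the comparison as a best-effort check.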

Hope this helps.  Good Luck!


On Tue, 2012-10-16 at 05:43 -0700, pa7751 wrote:
> Fine, so if we go with the assumption that log4j is overkill, then what
> would be the best way to do this? We have so many reads and writes happening
> to the same XML file. Is it OK to use a DOM parser like Xerces, loading
> the whole XML DOM into memory, making updates at every task, persisting
> at every task, then reading and updating again, and so on?
> 
> 
> 
> Tim Watts-3 wrote:
> > 
> > Again, this isn't a logging problem you're describing.  You should be
> > looking to some sort of database for solutions.  You could probably
> > torture log4j into doing some of this but there's no advantage to it.
> > There are better suited tools for the problem you describe.
> > 
> > 
> > On Mon, 2012-10-15 at 00:43 -0700, pa7751 wrote:
> >> Hi
> >> 
> >> Basically I have an XML document that contains many tasks. As each of
> >> these tasks gets executed, I log in a file (which is again XML) whether
> >> the task completed successfully. So the first question is whether log4j
> >> can be used for this.
> >> 
> >> Next, since these tasks can also be executed in parallel by the engine, I
> >> want to know how logging can be done to a single log file.
> >> 
> >> Third, for each of the tasks we will be recording a status like
> >> 'begin' or 'end' to signify that a thread started, a thread completed
> >> execution, or a thread started but did not complete execution, maybe
> >> due to a sudden power failure etc. Since this will be done for every
> >> task, there will be many edits happening to the log file. How can we
> >> ensure that the structure of the XML remains correct?
> >> 
> >> Fourth, considering the many edits/inserts happening in the XML, how can
> >> we get good performance?
> >> 
> >> 
> >> Tim Watts-3 wrote:
> >> > 
> >> > On Sat, 2012-10-13 at 05:55 -0700, pa7751 wrote:
> >> >> Hi
> >> >> 
> >> >> I need some help to know if log4j can be used in the following
> >> >> scenarios, and how.
> >> >> 
> >> >> 1. In my application there are many small tasks that I need to do, say
> >> >> 100. At every task, I need to log to an XML file, the structure of
> >> >> which I have to define. Is this possible using log4j?
> >> >> 
> >> >> 2. Since every task has to write to the same log file, and many of
> >> >> these tasks could be executing in parallel, how can I ensure the
> >> >> integrity of the structure of the XML log file? That is, if I have an
> >> >> <activity> tag and an activity can have more activities or tasks, can
> >> >> I ensure that, for tasks running in parallel, tasks get added/deleted
> >> >> only to the activity that I specify?
> >> >> 
> >> >> 3. Considering the multiple updates happening to the XML file, i.e. the
> >> >> XML file will be edited at every task, does that mean that the DOM is
> >> >> created in memory for every task? How will performance be impacted? Is
> >> >> there any way we can get good performance?
> >> > 
> >> > Presumably you're targeting log4j 1.2.x.
> >> > 
> >> > Are you thinking of using o.a.log4j.xml.XMLLayout?  Doesn't sound like
> >> > it.
> >> > 
> >> > You write about adding & deleting activities, and editing the xml
> >> > output.  This doesn't sound like a logging problem.  It sounds like
> >> > you're trying to capture "current state" not "a history of events".
> >> > The latter fits a logging problem, the former does not.  So I would say
> >> > log4j is probably not the right tool given your problem description.
> >> > Maybe if you restructure the problem it could simplify things for you?
> >> > 
> >> > That said, log4j can support multi-threaded /logging/ efficiently.


Re: [OT] Re: logging parallel threads

Posted by Tim Watts <ti...@cliftonfarm.org>.
I would go with JAXB, which I mentioned in my previous message.  If you're
not familiar with it, spend a couple of hours with the tutorial and user
guide.

Basically, the way it works is:
      * You create an XML schema that describes the elements and
        attributes of your XML doc.
      * Run this through the JAXB schema compiler, xjc.  This generates
        a set of beans and supporting classes which represent the
        elements and attributes.
      * Your controller instantiates these classes, then gets and sets
        values on them.
      * You use a Marshaller to convert from Java objects to XML and an
        Unmarshaller to go the other way.

The nice thing about this is you're mostly interacting with Java beans
instead of the awkward DOM API.

Eclipse has support for creating an XML schema.  I'd be surprised if
NetBeans didn't have this as well.

Have Fun!


On Tue, 2012-10-16 at 10:23 -0700, pa7751 wrote:
> Actually, option #1 that you mentioned is what I have to do. I have an
> input XML file that contains some tasks. I need to execute those tasks
> sequentially or in parallel. As these tasks get executed, I have to record
> their statuses in a new XML file (which I create if it does not
> exist already).
> 
> So, for example, in my input XML I can have: <task name="a" action="aa"
> someOtherAttributes/>
> When control reaches the above task, it will record in the XML log:
> <task name="a" status="begin" />
> 
> This would mean that control reached task 'a' and assigned a thread to
> it. When 'a' is complete, I will change the status to 'end'. So an
> operations person looking at the XML can know exactly what is happening as
> the input XML is processed. So the question is: what should I
> do to create this output XML?
> 


Re: [OT] logging parallel threads

Posted by Ralph Goers <ra...@dslextreme.com>.
Why does the output have to be XML?  It would be far simpler if you could record the events in some other way and then generate the reports (or whatever you need) in XML.

To give you an idea, the architecture I am using (and why I have put so much effort into Log4j 2) is:

1. Audit events using a standard logging framework that won't lose events (Apache Log4j 2)
2. Pass the events to a messaging system with guaranteed delivery (Apache Flume)
3. Write to a repository that can be distributed across data centers (Apache Cassandra)
4. Use REST APIs to report on the events, one of which is a service that will generate an XML file suitable for import into Microsoft Excel.


Ralph




---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-user-unsubscribe@logging.apache.org
For additional commands, e-mail: log4j-user-help@logging.apache.org


Re: [OT] Re: logging parallel threads

Posted by pa7751 <pa...@gmail.com>.
Actually, option #1 that you mentioned is what I have to do. I have an
input XML file that contains some tasks. I need to execute those tasks
sequentially or in parallel. As these tasks get executed, I have to record
their statuses in a new XML file (which I create if it does not
exist already).

So, for example, in my input XML I can have: <task name="a" action="aa"
someOtherAttributes/>
When control reaches the above task, it will record in the XML log:
<task name="a" status="begin" />

This would mean that control reached task 'a' and assigned a thread to
it. When 'a' is complete, I will change the status to 'end'. So an
operations person looking at the XML can know exactly what is happening as
the input XML is processed. So the question is: what should I
do to create this output XML?


-- 
View this message in context: http://old.nabble.com/logging-parallel-threads-tp34550945p34564491.html
Sent from the Log4j - Users mailing list archive at Nabble.com.

