Posted to log4j-dev@logging.apache.org by Ceki Gülcü <ce...@qos.ch> on 2004/12/20 15:33:41 UTC
TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
At 12:55 AM 12/19/2004, Curt Arnold wrote:
CachedDateFormat is pretty interesting. However, I suspect that it may
not work properly if the date format passed by the user causes
the length of the formatted date to change over time. For example, if
the format is 'yyyy-MMMMM-dd HH:mm:ss' and the cache is initialized on
the 4th of July, it will work properly for the remainder of the month but
start returning erroneous results in August. Do you concur with this
analysis?
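To make the concern concrete, here is a minimal sketch (not code from the patch) that formats a July and an August date with the pattern quoted above. The class and method names are made up for illustration, and the English locale and UTC zone are assumptions chosen so the result does not depend on the machine's defaults.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of log4j.
class MonthLengthDemo {
    // Formats the 4th of the given month (Calendar.JULY, ...) at noon UTC.
    static String format(String pattern, int year, int month) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setTimeZone(utc);
        Calendar cal = new GregorianCalendar(utc, Locale.ENGLISH);
        cal.clear();
        cal.set(year, month, 4, 12, 0, 0);
        return fmt.format(cal.getTime());
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MMMMM-dd HH:mm:ss";
        String july = format(pattern, 2004, Calendar.JULY);
        String august = format(pattern, 2004, Calendar.AUGUST);
        // The full month name makes the two strings differ in length,
        // so any offset cached in July points at the wrong place in August.
        System.out.println(july + " (length " + july.length() + ")");
        System.out.println(august + " (length " + august.length() + ")");
    }
}
```

Every field that follows the month name shifts by the difference in length, which is why a cached field offset cannot be trusted across a month boundary unless the length change is detected.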
As for the locale support in PatternLayout, it is probably
overkill. It makes the code harder to understand and to maintain. I'd
wait for someone using a different numbering system to contact us
before adding it on our own.
>Add TimeZone and locale for PatternLayout, remove obs DateFormat
>http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=32064
>
>This last one I need to review after the recent discussions on
>jakarta-commons. It may be more appropriate to specify locale on the
>appender instead of the layout.
--
Ceki Gülcü
The complete log4j manual: http://qos.ch/log4j/
---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-dev-unsubscribe@logging.apache.org
For additional commands, e-mail: log4j-dev-help@logging.apache.org
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Curt Arnold <ca...@houston.rr.com>.
>> It could only happen if at some point in time, some field or fields were
>> lengthened and one or more fields were shortened by the same amount. If
>> a simple algorithm could be devised to detect such cases beforehand,
>> then CachedDateFormat is a winner.
>
> I had mentioned "SS0" was a potentially malicious time format. In the
> code where PatternLayout creates the SimpleDateFormat, the format
> string could be sanity checked. If it appeared troublesome, then the
> SimpleDateFormat would not be wrapped with CachedDateFormat. If you
> wanted to be very careful, you'd skip caching anything containing
> "G", "MMM", "E", "z" and things with only one or two "S".
>
> Another approach would be to put a staleness date on the millisecond
> field so that you don't trust it if it is, say, over a minute old.
>
>
Or probably better, redetermine the millisecond location if the length
changed (the current criteria) or if the corresponding characters are
not digits. You'd still need a guard that you either had "SSS" or no
"S", but I think the additional check would make it extremely hard to
trick the cache into producing an invalid date.
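The two checks described above could be sketched roughly as follows. This is an illustration only, not the code in the patch: the class and method names are hypothetical, quoted literals inside the pattern are ignored for brevity, and rejecting patterns with more than one "SSS" run is an extra assumption.

```java
// Hypothetical helper, not part of log4j.
class CacheGuards {
    // True if the pattern contains no 'S' at all, or exactly one run of "SSS".
    // Simplification: text inside quoted literals is not treated specially.
    static boolean hasSafeMillisecondField(String pattern) {
        int runs = 0;
        int i = 0;
        while (i < pattern.length()) {
            if (pattern.charAt(i) == 'S') {
                int start = i;
                while (i < pattern.length() && pattern.charAt(i) == 'S') {
                    i++;
                }
                if (i - start != 3) {
                    return false; // "S", "SS" or "SSSS..." are not cached
                }
                runs++;
            } else {
                i++;
            }
        }
        return runs <= 1;
    }

    // True if the three characters at the cached offset are all ASCII digits.
    // When this returns false, the millisecond position should be recomputed.
    static boolean millisecondsLookValid(CharSequence formatted, int millisecondStart) {
        if (millisecondStart < 0 || millisecondStart + 3 > formatted.length()) {
            return false;
        }
        for (int i = millisecondStart; i < millisecondStart + 3; i++) {
            char c = formatted.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```

With these guards, a pattern such as "SS0" is never wrapped, and a cached offset that no longer points at digits triggers a fresh search instead of a corrupted timestamp.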
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Curt Arnold <ca...@apache.org>.
On Dec 21, 2004, at 9:54 AM, Ceki Gülcü wrote:
>
>> CachedDateFormat would not be able to detect the milliseconds field
>> on RelativeTimeDateFormat unless the starting time was an integral
>> second and would not be able to detect millisecond fields if
>> non-arabic digits were set. In either of these cases, you would have
>> an extra call per format evaluation. I believe the original patch
>> avoided caching RelativeTimeDateFormat.
>
> Making DatePatternConverter aware of CachedDateFormat would avoid
> caching RelativeTimeDateFormat.
>
I believe the original integration with pattern layout did attempt to
cache RelativeTimeDateFormat. I was just saying that it would be
technically possible to hand code that combination and if you did, it
would just perform a little slower.
>> The worst-case scenario is if you could construct a date-time format
>> where the location of the millisecond field changed, but the total
>> length of the field did not. I don't think that you could create one
>> with SimpleDateFormat, however you could obviously write a custom
>> DateFormat that did.
>
> It could only happen if at some point in time, some field or fields were
> lengthened and one or more fields were shortened by the same amount. If a
> simple algorithm could be devised to detect such cases beforehand,
> then CachedDateFormat is a winner.
I had mentioned "SS0" was a potentially malicious time format. In the
code where PatternLayout creates the SimpleDateFormat, the format
string could be sanity checked. If it appeared troublesome, then the
SimpleDateFormat would not be wrapped with CachedDateFormat. If you
wanted to be very careful, you'd skip caching anything containing
"G", "MMM", "E", "z" and things with only one or two "S".
Another approach would be to put a staleness date on the millisecond
field so that you don't trust it if it is, say, over a minute old.
>
>> There is an observable difference when running the performance tests
>> to a null appender with CachedDateFormat. However, it may not be
>> significant in more realistic deployments. It is a significant
>> improvement over the flawed (and currently unused) caching code in
>> the original DateFormats. However, the original motivation for the
>> caching may no longer be relevant and so a new CachedDateFormat may
>> not have a performance benefit that justifies the added complexity.
>
> Mostly agreed. However, I still wonder whether with a little extra
> work CachedDateFormat could not be polished to become a pearl.
>
>> The second pass used localized values of both 0 and 9 to identify the
>> millisecond field. If the default locale changed, CachedDateFormat
>> would not switch locales until the next integral second. There may
>> be other issues that come to pass with any locale rework, so maybe
>> the best approach is to leave CachedDateFormat out for now. It will
>> be available in Bugzilla in case someone ever wants to add it later.
>
> I'll give it a shot if you don't mind.
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Ceki Gülcü <ce...@qos.ch>.
Curt,
You are several steps ahead of me. I had seen but had not paid attention to:
    if (now < previousTime + 1000L && now >= previousTime) {
      ...
    } else {
      ....
      //
      //   if the length changed then
      //      recalculate the millisecond position
      if (cache.length() != prevLength) {
        ....
        //    detect the start of the millisecond field
        millisecondStart = findMillisecondStart(previousTime,
                                                tempBuffer.toString(),
                                                formatter);
      }
The above lines nicely take care of the case where the position of the
millisecond field in the formatted output varies over time. The above
is both simple and efficient - brilliant stuff.
At 12:01 AM 12/21/2004, Curt Arnold wrote:
>For those tuning in late: The basic idea of the cached date format is that
>if the time is within the same integral second as a previous request, then
>only the milliseconds field needs to be rewritten. To find the
>milliseconds field, on the first request (or any request where the total
>length of the formatted field has changed), two times only differing in
>the number of milliseconds are output and the results are analyzed. If
>the milliseconds format is unrecognized, then the CachedDateFormat will
>simply delegate to the underlying DateFormat.
I apologize for forcing you to explain this all over again and wasting your
time.
>CachedDateFormat would not be able to detect the milliseconds field on
>RelativeTimeDateFormat unless the starting time was an integral second and
>would not be able to detect millisecond fields if non-arabic digits were
>set. In either of these cases, you would have an extra call per format
>evaluation. I believe the original patch avoided caching
>RelativeTimeDateFormat.
Making DatePatternConverter aware of CachedDateFormat would avoid caching
RelativeTimeDateFormat.
>The worst-case scenario is if you could construct a date-time format where
>the location of the millisecond field changed, but the total length of the
>field did not. I don't think that you could create one with
>SimpleDateFormat, however you could obviously write a custom DateFormat
>that did.
It could only happen if at some point in time, some field or fields were
lengthened and one or more fields were shortened by the same amount. If a
simple algorithm could be devised to detect such cases beforehand, then
CachedDateFormat is a winner.
>There is an observable difference when running the performance tests to a
>null appender with CachedDateFormat. However, it may not be significant
>in more realistic deployments. It is a significant improvement over the
>flawed (and currently unused) caching code in the original
>DateFormats. However, the original motivation for the caching may no
>longer be relevant and so a new CachedDateFormat may not have a
>performance benefit that justifies the added complexity.
Mostly agreed. However, I still wonder whether with a little extra work
CachedDateFormat could not be polished to become a pearl.
>The second pass used localized values of both 0 and 9 to identify the
>millisecond field. If the default locale changed, CachedDateFormat would
>not switch locales until the next integral second. There may be other
>issues that come to pass with any locale rework, so maybe the best
>approach is to leave CachedDateFormat out for now. It will be available
>in Bugzilla in case someone ever wants to add it later.
I'll give it a shot if you don't mind.
--
Ceki Gülcü
The complete log4j manual: http://qos.ch/log4j/
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Ceki Gülcü <ce...@qos.ch>.
At 07:01 PM 12/21/2004, Curt Arnold wrote:
>Allowing TZ to be specified at the repository level would add an
>interaction between layout and repository that I don't believe currently
>exists and I don't see it adds much additional value.
Excellent point. Now note that as explained in [1], there are several
cases where it makes sense to let most, if not all, log4j components
know about the LoggerRepository they are attached to.
Use case 1: Loggers retrieve resource bundles from the LR. (Loggers
already know their LR.) BTW, this assumes that only a single localization
is performed for all appenders. If localization is performed per
appender, then it makes sense for appenders to retrieve locale-specific
resource bundles from the LR, implying that appenders know about their
LR.
Use case 2: PatternLayout retrieves new conversion words from its
LR. (In 1.3, PatternLayout can learn new conversion words on the fly.)
Mrs. Piggy would set up new conversion words in an XML config
file. These new rules are placed within the LR and all PatternLayout
instances inherit those new words from the LR.
Use case 3: the %logger2 pattern converter will shorten package or
class names according to a mapping specified by the user. Again,
Mrs. Piggy would specify the mapping in a config file. This mapping
would be shared by all instances of %logger2.
Use case 4: Properties (key=value pairs) can be set at the LR
level. It kinda makes sense to share these values across components.
[1] http://marc.theaimsgroup.com/?l=log4j-dev&m=110357800507837&w=2
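As a rough illustration of use case 3, the following sketch shortens a logger name using a user-supplied mapping from package prefixes to abbreviations. It is not the %logger2 implementation (which did not exist at the time of this thread); the class name, the method signature, the longest-prefix rule and the mapping entries are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of log4j.
class LoggerNameShortener {
    // Replaces the longest matching package prefix with its short form.
    // Names without a matching prefix are returned unchanged.
    static String shorten(String loggerName, Map<String, String> prefixToShort) {
        String bestPrefix = null;
        for (String prefix : prefixToShort.keySet()) {
            boolean matches = loggerName.equals(prefix)
                    || loggerName.startsWith(prefix + ".");
            if (matches && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
                bestPrefix = prefix;
            }
        }
        if (bestPrefix == null) {
            return loggerName;
        }
        return prefixToShort.get(bestPrefix) + loggerName.substring(bestPrefix.length());
    }

    public static void main(String[] args) {
        // In the proposal this mapping would live in the LoggerRepository and be
        // loaded from a user-specified file; here it is built by hand.
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("org.apache.log4j", "o.a.l");
        mapping.put("org.apache.log4j.rolling", "o.a.l.r");
        System.out.println(
                shorten("org.apache.log4j.rolling.RollingFileAppender", mapping));
    }
}
```

Keeping the mapping in one shared place is what lets every converter instance produce the same abbreviations.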
>I think that we should pick one place to add it and then gather feedback to
>see if we picked the wrong place or need to add additional specification
>points.
Expecting the end-user to understand all possibilities and return with
valuable suggestions does not always work and in general takes a very
long time. Actually, expecting experts to come back with valuable
input does not always work either. The best approach is to think
about the problem as hard as you can, cover as many aspects as you
can, and come back with a proposal or, even better, with an
implementation. However, you seem to know this better than I do.
--
Ceki Gülcü
The complete log4j manual: http://qos.ch/log4j/
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Curt Arnold <ca...@apache.org>.
On Dec 21, 2004, at 11:41 AM, Mark R Durman/CA/US/MQSolutions wrote:
>
> Setting the Time Zone to UTC does enable correlation of events across
> servers in different time zones (assuming they are all synchronized
> with a common time source). WebSphere MQ stores all timestamps in UTC
> for that reason. For a distributed application, this can be useful.
>
> If the user can set the TZ at the appender level, why not support a TZ
> at the repository level as well? Most of the time it won't be used,
> but if a user wants all appenders to use a non-default TZ they can set
> it there.
Allowing TZ to be specified at the repository level would add an
interaction between layout and repository that I don't believe
currently exists and I don't see that it adds much additional value. I
think that we should pick one place to add it and then gather feedback
to see if we picked the wrong place or need to add additional
specification points.
Do you have a preference between specifying timezone within the pattern
layout (which would allow multiple time renderings in different zones
within one rendered message) or as an attribute of the layout?
>
> The distributed app scenario also supports the outputting of events
> through different appenders in different languages. I worked on a
> large airline reservations system in Germany a while back that had
> support groups in Germany, France and Italy. I'm sure they would have
> liked to see events in their own language. It also comes into play for
> product support--a support group may want a customer to enable an
> appender for troubleshooting and have the events output in the
> language they understand, not the language the customer uses.
>
I don't see the use case as unreasonable and think that it should not
be prematurely discarded. It will take a bit of exploring to see how
hard and expensive (or not) it would be in practice.
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Mark R Durman/CA/US/MQSolutions <md...@mqsolutions.com>.
Setting the Time Zone to UTC does enable correlation of events across
servers in different time zones (assuming they are all synchronized with a
common time source). WebSphere MQ stores all timestamps in UTC for that
reason. For a distributed application, this can be useful.
If the user can set the TZ at the appender level, why not support a TZ at
the repository level as well? Most of the time it won't be used, but if a
user wants all appenders to use a non-default TZ they can set it there.
The distributed app scenario also supports the outputting of events
through different appenders in different languages. I worked on a large
airline reservations system in Germany a while back that had support
groups in Germany, France and Italy. I'm sure they would have liked to see
events in their own language. It also comes into play for product
support--a support group may want a customer to enable an appender for
troubleshooting and have the events output in the language they
understand, not the language the customer uses.
Mark Durman
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Curt Arnold <ca...@apache.org>.
On Dec 21, 2004, at 10:44 AM, Ceki Gülcü wrote:
> At 12:01 AM 12/21/2004, Curt Arnold wrote:
>
> The Object to String conversion is done once and only once. The result
> is cached and subsequently shared by all appenders. While I can
> imagine having two emails (thus two layouts) having different
> timezones, I can't see the use case for outputting an event in German
> through one appender, Dutch in another and in English in a third.
>
I think the multi-locale use case is reasonable, however it may be one
that we reject as a requirement. I'd like to see how it plays out
before rejecting it.
>> Locale and timezone, like layout, are accommodations of the
>> preferences of a particular audience being reached through the
>> appender. Can you think of reasons that you would want to specify
>> them at a higher level?
>
> I have to admit that I actually have a hard time imagining any use for
> setting the TimeZone because:
>
> 1) if events are to be viewed locally, they will be formatted using
> the system default TimeZone which is usually the same as the desired
> TimeZone.
>
> 2) if events need to be viewed remotely, they are transmitted in a
> TZ-neutral form over the wire. The recipient can output the event in
> its local TimeZone.
>
> The only remaining case is then the user wanting to output the event
> in a TZ other than that of her system (default) TZ. However, in
> that case we could ask the user to specify a non default TZ for each
> of her PatternLayouts that require it. No need to specify it at the
> LoggerRepository level.
I would think the most common use for an arbitrary timezone would be to
specify that the time should be rendered as UTC. I do think that is
still at the layout level (whether or not embedded in the pattern
specification).
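For reference, rendering a timestamp in UTC regardless of the system default only needs a time zone set on the formatter. The sketch below uses plain java.text classes and does not show how PatternLayout would expose the option; the class name, the pattern and the fixed Locale.US are assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of log4j.
class UtcTimestamp {
    // Formats the instant in UTC, independent of the JVM's default time zone.
    static String formatUtc(Date when) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(when);
    }

    public static void main(String[] args) {
        // Events logged on servers in different zones line up when rendered this way.
        System.out.println(formatUtc(new Date()));
    }
}
```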
>
>> Implementing appender level locale rendering would likely involve
>> creating threads to do rendering on non-default locales in some
>> instances and would likely have some performance hit, but shouldn't
>> significantly affect performance when not specified. However, it is going
>> to take some experimentation to see where it can be effectively
>> performed.
>
> The conjunction of the words "level" and "locale" made me think of the
> case where the level string was output in the user's locale. So,
> English speaking users would see TRACE, DEBUG, INFO, WARN, ERROR,
> EMERG; a French speaker would see TRACE, BOGUE, INFO, AVERTISSEMENT,
> ERREUR, URGENCE; and a Turkish speaker would see IZ, DEBUG, BILGI,
> DIKKAT, YANLIS, IMDAT.
>
I think working the locale issue through is going to take some
experimentation. It would probably be best to take a shot at a
regexp-based message localizing layout. Once we have that piece, then
we can experiment on the localization of level names and
ObjectRendering. However, I'm going to have to say that I can't do
that until after log4cxx 0.9.8 snapshot is stable.
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Ceki Gülcü <ce...@qos.ch>.
At 12:01 AM 12/21/2004, Curt Arnold wrote:
>I'm going to have to do some research before I can make a reasonable proposal.
>
>Here is a use case that I think suggests that Layout or Appender is the
>right level: Send logging events to Ceki in fr-CH localized email messages
>with time in Central European Timezone and to Curt in en-US email messages
>with time in US Central time zone.
>
>However, if you were using a SocketAppender instead and receiving in
>Chainsaw, there would not be a layout involved, however you would want to
>be able to control the locale used in the Object.toString() call used to
>render non-string messages. Timezone would not come into play until a
>layout was involved.
The Object to String conversion is done once and only once. The result is
cached and subsequently shared by all appenders. While I can imagine having
two emails (thus two layouts) having different timezones, I can't see the
use case for outputting an event in German through one appender, Dutch in
another and in English in a third.
>Locale and timezone, like layout, are accommodations of the preferences of
>a particular audience being reached through the appender. Can you think
>of reasons that you would want to specify them at a higher level?
I have to admit that I actually have a hard time imagining any use for
setting the TimeZone because:
1) if events are to be viewed locally, they will be formatted using the
system default TimeZone which is usually the same as the desired TimeZone.
2) if events need to be viewed remotely, they are transmitted in a
TZ-neutral form over the wire. The recipient can output the event in its
local TimeZone.
The only remaining case is then the user wanting to output the event in a
TZ other than that of her system (default) TZ. However, in that case we
could ask the user to specify a non default TZ for each of her
PatternLayouts that require it. No need to specify it at the
LoggerRepository level.
>Implementing appender level locale rendering would likely involve creating
>threads to do rendering on non-default locales in some instances and would
>likely have some performance hit, but shouldn't significantly affect performance
>when not specified. However, it is going to take some experimentation to
>see where it can be effectively performed.
The conjunction of the words "level" and "locale" made me think of the case
where the level string was output in the user's locale. So, English
speaking users would see TRACE, DEBUG, INFO, WARN, ERROR, EMERG; a French
speaker would see TRACE, BOGUE, INFO, AVERTISSEMENT, ERREUR, URGENCE; and a
Turkish speaker would see IZ, DEBUG, BILGI, DIKKAT, YANLIS, IMDAT.
Many developers would want to see the level strings translated to their own
language while many others would prefer the English terms. However, their
preference would probably be the same for all appenders. So, setting the
Locale at the LoggerRepository level makes sense. The user could later
override the output language of the level strings (i.e. the locale) for a
specific layout instance (most probably within the %level pattern word).
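A lookup of this kind might look like the sketch below. The translated strings are the ones listed in this message; the class, the in-memory table and the fallback to the English name are assumptions, and a real implementation would more likely read resource bundles held by the LoggerRepository.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of log4j.
class LocalizedLevels {
    // language code -> (English level name -> localized level name)
    private static final Map<String, Map<String, String>> BY_LANGUAGE = new HashMap<>();

    static {
        Map<String, String> fr = new HashMap<>();
        fr.put("DEBUG", "BOGUE");
        fr.put("WARN", "AVERTISSEMENT");
        fr.put("ERROR", "ERREUR");
        fr.put("EMERG", "URGENCE");
        BY_LANGUAGE.put("fr", fr);

        Map<String, String> tr = new HashMap<>();
        tr.put("TRACE", "IZ");
        tr.put("INFO", "BILGI");
        tr.put("WARN", "DIKKAT");
        tr.put("ERROR", "YANLIS");
        tr.put("EMERG", "IMDAT");
        BY_LANGUAGE.put("tr", tr);
    }

    // Returns the level name for the locale's language, or the English
    // name when no translation is registered.
    static String localize(String level, Locale locale) {
        Map<String, String> names = BY_LANGUAGE.get(locale.getLanguage());
        if (names == null) {
            return level;
        }
        String localized = names.get(level);
        return localized != null ? localized : level;
    }

    public static void main(String[] args) {
        System.out.println(localize("WARN", Locale.FRENCH));
        System.out.println(localize("WARN", Locale.forLanguageTag("tr")));
        System.out.println(localize("WARN", Locale.ENGLISH));
    }
}
```

A repository-wide locale would pick the table once, and a per-layout override would simply pass a different Locale to the same lookup.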
If the Locale could be set at the LoggerRepository level, in order to
obtain localized level strings, the user could simply write:
<configuration xmlns="http://...">
  <!-- new action: -->
  <localized-level-strings/>
  <!-- the rest remains the same as before -->
  <appender name="A1"> ... </appender>
  <appender name="A2"> ... </appender>
  <root> <appender-ref ref="A1"/> .... </root>
</configuration>
Thanks for bearing with me this far.
--
Ceki Gülcü
The complete log4j manual: http://qos.ch/log4j/
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Curt Arnold <ca...@apache.org>.
On Dec 20, 2004, at 3:26 PM, Ceki Gülcü wrote:
> As mentioned earlier, I think CachedDateFormat may fail for certain
> patterns at certain dates. If we can recognize the limited number of
> formats for which it fails (if it does) and sidestep those, then
> fine. Before going any further, do you agree that patterns causing
> CachedDateFormat to fail exist and that it's just not me making things
> up?
>
For those tuning in late: The basic idea of the cached date format is
that if the time is within the same integral second as a previous
request, then only the milliseconds field needs to be rewritten. To
find the milliseconds field, on the first request (or any request where
the total length of the formatted field has changed), two times only
differing in the number of milliseconds are output and the results are
analyzed. If the milliseconds format is unrecognized, then the
CachedDateFormat will simply delegate to the underlying DateFormat.
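One way to read the detection step described above is sketched here: format the same second with 000 and with 999 milliseconds, and accept the result only if the two strings differ in exactly one three-character run of digits. This is an interpretation, not the code attached to the Bugzilla entry; the signature differs from the findMillisecondStart call quoted elsewhere in the thread, and only ASCII digits and non-negative times are handled.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of log4j.
class MillisecondLocator {
    // Returns the offset of a three-digit millisecond field in the formatted
    // output, or -1 if no such field is recognized (caller should delegate).
    // Assumes time >= 0.
    static int findMillisecondStart(long time, DateFormat formatter) {
        long second = time - (time % 1000L);
        String zeros = formatter.format(new Date(second));        // ms = 000
        String nines = formatter.format(new Date(second + 999L)); // ms = 999
        if (zeros.length() != nines.length()) {
            return -1;
        }
        int first = -1;
        for (int i = 0; i < zeros.length(); i++) {
            if (zeros.charAt(i) != nines.charAt(i)) {
                first = i;
                break;
            }
        }
        if (first < 0 || first + 3 > zeros.length()) {
            return -1;
        }
        // The differing run must be exactly "000" versus "999" ...
        if (!zeros.startsWith("000", first) || !nines.startsWith("999", first)) {
            return -1;
        }
        // ... and everything after it must be identical.
        if (!zeros.substring(first + 3).equals(nines.substring(first + 3))) {
            return -1;
        }
        return first;
    }

    public static void main(String[] args) {
        SimpleDateFormat fmt = new SimpleDateFormat("HH:mm:ss,SSS", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println(findMillisecondStart(System.currentTimeMillis(), fmt));
    }
}
```

Once the offset is known, later requests within the same second only need to overwrite three characters of the cached string.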
CachedDateFormat would not be able to detect the milliseconds field on
RelativeTimeDateFormat unless the starting time was an integral second
and would not be able to detect millisecond fields if non-arabic digits
were set. In either of these cases, you would have an extra call per
format evaluation. I believe the original patch avoided caching
RelativeTimeDateFormat.
The worst-case scenario is if you could construct a date-time format
where the location of the millisecond field changed, but the total
length of the field did not. I don't think that you could create one
with SimpleDateFormat, however you could obviously write a custom
DateFormat that did.
There is an observable difference when running the performance tests to
a null appender with CachedDateFormat. However, it may not be
significant in more realistic deployments. It is a significant
improvement over the flawed (and currently unused) caching code in the
original DateFormats. However, the original motivation for the caching
may no longer be relevant and so a new CachedDateFormat may not have a
performance benefit that justifies the added complexity.
>
>> CachedDateFormat attempted to support multiple digit sets. However,
>> I couldn't find any stock Java locales that used a digit set other
>> than 0-9 in its date formats. I had expected that the Thai locale
>> would use Thai digits, but I was wrong.
>
> If I am not mistaken, the existing code in CachedDateFormat only
> localized the digit 0. Which may be enough in case the
> SimpleDateFormat instance and CachedDateFormat instance use the same
> Localization but if not, then the output will be inconsistent.
The second pass used localized values of both 0 and 9 to identify the
millisecond field. If the default locale changed, CachedDateFormat
would not switch locales until the next integral second. There may be
other issues that come to pass with any locale rework, so maybe the
best approach is to leave CachedDateFormat out for now. It will be
available in Bugzilla in case someone ever wants to add it later.
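For readers unfamiliar with localized digits, the JDK exposes a locale's zero digit through DecimalFormatSymbols. The sketch below derives the matching nine from it, relying on Unicode decimal digit sets being contiguous. The class is hypothetical, and whether the patch obtained its digits this way is not stated in the thread.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, not part of log4j.
class LocalizedDigits {
    // The character the locale uses for the digit zero.
    static char zeroDigit(Locale locale) {
        return new DecimalFormatSymbols(locale).getZeroDigit();
    }

    // The digit nine in the same digit set (zero plus nine code points).
    static char nineDigit(Locale locale) {
        return (char) (zeroDigit(locale) + 9);
    }

    public static void main(String[] args) {
        Locale locale = Locale.getDefault();
        System.out.println(zeroDigit(locale) + " .. " + nineDigit(locale));
    }
}
```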
>
>> Date formatting was affected by the current locale and timezone of
>> the thread and there was no mechanism to configure a timezone or
>> locale to be used. The existing patches added configurable timezones
>> and locales to the pattern layout which would modify the behavior of
>> the date formats. Based on some of the previous discussions on the
>> Jakarta Commons Dev list, I'd like to evaluate whether Appender is a
>> better place for the locale to be specified.
>>
>> What I'd like to do is:
>>
>> Commit simplifications to the DateFormats and add CachedDateFormat
>> but simplified to recognize only Arabic digit sets.
>
> That would be good.
>
>> Review configurable locales and timezones and come back to the list
>> with a specific recommendation. My current take is that appender is
>> probably a more appropriate place to specify locale. However, that
>> should be considered in a bigger scope where locale affects both the
>> layout and rendering non-string messages. TimeZone is likely still
>> appropriate to configure on the layout.
>
> This raises a much wider question. Should a given customization be
> allowed at the logger repository level, logger level, appender level,
> at the layout level or at the pattern converter level? Getting the
> answer right provides tremendous added value. For example, the named
> logger hierarchy propagates 'level' values according to the level
> inheritance rule. This in turn provides a very fast, yet meaningful
> filtering mechanism for categorizing logging statements. The fact that
> we got this question right is one of the main reasons behind log4j's
> success. Appender additivity is another example showing that getting
> the collaboration rules between components right makes a big
> difference.
>
> I happen to think that the logger repository should/can be viewed as
> the central point influencing all the components attached to it. For
> example,
>
> 1) properties of the logger repository should/can be visible at all
> components levels.
>
> 2) new pattern conversion rules defined at the logger repository level
> should/can be shared by all the instances of PatternLayout attached to
> that logger repository.
>
> 3) resource bundles attached to a logger repository should/can be
> shared by all *loggers* (hint hint), appenders and layouts.
>
> 4) The mapping URL (defined below) attached to a logger repository
> should/can be shared by all instances of the %logger2 pattern converter.
>
> In "should/can", the "should" part signifies my current inclination to
> think of the above as good design. The "can" part means that design is
> still open for debate.
>
> What is the mapping URL?
> ------------------------
>
> We routinely write o.a.l.r.RollingFileAppender instead of
> org.apache.log4j.rolling.RollingFileAppender. The first form is almost
> as precise and much shorter. Whenever I get the chance, I'd like to
> implement a pattern converter named %logger2 which instead of printing
> org.apache.log4j.rolling.RollingFileAppender will print
> o.a.l.r.RollingFileAppender. The shortened forms will be defined in a
> properties file defined by the user. (We will provide a default
> mapping.)
> The location of this mapping will be specified with a URL hence the
> term "mapping URL".
>
> Coming back to the TimeZone question, we could imagine that a TimeZone
> could be set at the LoggerRepository level. This TimeZone would
> percolate down to all levels below. However, if needed it could be
> overridden at a lower level, e.g. the pattern converter level. Can the
> TimeZone influence multiple pattern converters of a PatternLayout? If
> that's not a plausible scenario, then it does not make sense to define
> a TimeZone at the Appender level nor at the PatternLayout level.
>
> Providing too many or meaningless extension/customization points will
> confuse the user, make things harder to manage for her, and make the
> code harder to maintain for us. Getting the collaboration rules right
> makes all the difference in the world.
>
I'm going to have to do some research before I can make a reasonable
proposal.
Here is a use case that I think suggests that Layout or Appender is the
right level: Send logging events to Ceki in fr-CH localized email
messages with time in Central European Timezone and to Curt in en-US
email messages with time in US Central time zone.
However, if you were using a SocketAppender instead and receiving in
Chainsaw, there would not be a layout involved, yet you would still want
to be able to control the locale used in the Object.toString() call
used to render non-string messages. Timezone would not come into play
until a layout was involved.
You could either specify TimeZone as a property of the Layout, in
which case all time formats (likely one, but possibly more) within a
message would be in a single time zone, or you could extend the pattern
syntax for dates to include a timezone specifier. The second would
allow you to represent the time, for example, in both GMT and a local
time within the same formatted message. I chose the first since I'm a
wimp and it was easier.
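To make the two options concrete, here is a minimal sketch using only the JDK. The class and method names (TimeZoneSketch, format) are made up for illustration and are not part of log4j. With a layout-level TimeZone, every date in the message would go through a formatter configured like the first call; a per-date specifier would allow both calls within the same message.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not a log4j class.
final class TimeZoneSketch {
    // Formats an instant with the given SimpleDateFormat pattern in the given zone.
    static String format(Date when, String pattern, String zoneId) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.US);
        sdf.setTimeZone(TimeZone.getTimeZone(zoneId));
        return sdf.format(when);
    }

    public static void main(String[] args) {
        Date when = new Date(0L); // the epoch, 1970-01-01T00:00:00 GMT
        // The same instant rendered for two audiences.
        System.out.println(format(when, "yyyy-MM-dd HH:mm:ss,SSS", "GMT"));
        System.out.println(format(when, "yyyy-MM-dd HH:mm:ss,SSS", "America/Chicago"));
    }
}
```

The instant is the same in both calls; only the rendering differs, which is why the time zone looks like a presentation concern.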
Locale and timezone, like layout, are accommodations of the preferences
of a particular audience being reached through the appender. Can you
think of reasons that you would want to specify them at a higher level?
Implementing appender level locale rendering would likely involve
creating threads to do rendering on non-default locales in some
instances and would likely have some performance hit, but shouldn't
significantly affect performance when not specified. However, it is going
to take some experimentation to see where it can be effectively performed.
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Curt Arnold <ca...@apache.org>.
>
> At this moment, I'm too tired to try to fully understand why it fails
> and how it could be fixed. More tomorrow.
>
>
The underlying code did not anticipate the use of only two 'S' pattern
letters ('SS'), with which I assume milliseconds 0 to 99 are represented
with two digits and 100-999 with three. I have attached another patch
file to the bug with the fix and with your test added to the unit tests.
Basically, if it doesn't see "000" and "987", it will just delegate to
the inner date format. Previously, it only checked for a "0" and "9".
You could probably still mess up the caching by specifying a "SS0" format.
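The variable width is easy to see with the JDK alone. The following is a minimal sketch; MillisWidthSketch and millis are illustrative names, not log4j code. It formats only the millisecond field so the padding behavior stands out.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not a log4j class.
final class MillisWidthSketch {
    // Formats the given epoch offset (in ms) with a millisecond-only pattern.
    static String millis(String pattern, long epochMillis) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.US);
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        return sdf.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(millis("SSS", 23));  // always three digits
        System.out.println(millis("SS", 23));   // two digits below 100
        System.out.println(millis("SS", 905));  // three digits from 100 up
        System.out.println(millis("SS0", 23));  // the trailing 0 is a literal
    }
}
```

Since the pattern letter count is only a minimum width, a cache that overwrites a fixed number of characters in place cannot be trusted with "SS", and a literal digit after the field ("SS0") makes the millisecond position harder to detect by probing the output.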
I'm fine with dropping CachedDateFormat. I wrote the original
iteration before I realized that the buggy caching code was no longer
used.
Specifying the Locale should probably be held off to be done in
conjunction with the localizing layout that I had discussed on the
commons-dev mailing list.
Probably should address TimeZone in the near future. Specifying it on
the layout is simpler and keeps the content between the curly braces
consistent with JDK's SimpleDateFormat. However, it doesn't allow you
to use multiple time zones in one log message.
How about an optional {tz=} following %d to specify the time zone:
%d{tz=GMT}{yyyy-MM-dd HH:mm:ss,SSS} Z : %d{yyyy-MM-dd HH:mm:ss,SSS z} - %m
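As a rough illustration of how such a specifier could be handled, here is a minimal sketch. It is not how PatternLayout parses patterns; the class TzSpecifierSketch, the regular expression, and the render method are all assumptions made for this example, and only the %d specifier is expanded.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the proposed syntax, not log4j's parser.
final class TzSpecifierSketch {
    // %d, an optional {tz=ID} group, then the usual {date pattern} group.
    private static final Pattern DATE_SPEC =
        Pattern.compile("%d(?:\\{tz=([^}]+)\\})?\\{([^}]+)\\}");

    // Expands every %d specifier; other conversion characters are left alone.
    static String render(String layoutPattern, Date when, TimeZone defaultZone) {
        Matcher m = DATE_SPEC.matcher(layoutPattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            SimpleDateFormat sdf = new SimpleDateFormat(m.group(2), Locale.US);
            // Fall back to the layout-level zone when no {tz=} is given.
            sdf.setTimeZone(m.group(1) != null
                ? TimeZone.getTimeZone(m.group(1)) : defaultZone);
            m.appendReplacement(out, Matcher.quoteReplacement(sdf.format(when)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String p = "%d{tz=GMT}{yyyy-MM-dd HH:mm:ss,SSS} Z : %d{yyyy-MM-dd HH:mm:ss,SSS z} - %m";
        System.out.println(render(p, new Date(), TimeZone.getDefault()));
    }
}
```

The content between the second pair of curly braces stays a plain SimpleDateFormat pattern, and a date without {tz=} still follows whatever zone the layout supplies.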
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Ceki Gülcü <ce...@qos.ch>.
Not necessarily the most convincing use case, but the following fails,
import org.apache.log4j.*;
import org.apache.log4j.helpers.*;
import java.text.*;
import java.util.*;
public class CDF {
    protected static FieldPosition pos = new FieldPosition(0);
    public static void main(String[] args) throws Exception {
        // Only two 'S' letters: the millisecond field has a variable width.
        SimpleDateFormat sdf1 = new SimpleDateFormat("yyyy-MMMM-dd HH:mm:ss,SS Z");
        CachedDateFormat cdf1 = new CachedDateFormat(sdf1);
        StringBuffer buf = new StringBuffer();
        Calendar c = Calendar.getInstance();
        c.set(2004, Calendar.DECEMBER, 12, 20, 0);
        c.set(Calendar.SECOND, 37);
        c.set(Calendar.MILLISECOND, 23);
        // Format the same instant twice; the second call goes through the cache.
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
        buf.setLength(0);
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
        buf.setLength(0);
        c.set(2005, Calendar.JANUARY, 1, 0, 0);
        c.set(Calendar.SECOND, 13);
        c.set(Calendar.MILLISECOND, 905);
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
        buf.setLength(0);
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
    }
}
It incorrectly outputs
2004-December-12 20:00:37,23 +0100
2004-December-12 20:00:37,023+0100
2005-January-01 00:00:13,905 +0100
2005-January-01 00:00:13,9905+0100
At this moment, I'm too tired to try to fully understand why it fails and
how it could be fixed. More tomorrow.
--
Ceki Gülcü
The complete log4j manual: http://qos.ch/log4j/
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Ceki Gülcü <ce...@qos.ch>.
Curt,
At 10:26 PM 12/20/2004, Ceki Gülcü wrote:
>As mentioned earlier, I think CachedDateFormat may fail for certain
>patterns at certain dates. If we can recognize the limited number of
>formats for which it fails (if it does) and sidestep those, then
>fine. Before going any further, do you agree that patterns causing
>CachedDateFormat to fail exist and that it's not just me making things
>up?
I had predicted the following would produce incorrect results.
import org.apache.log4j.*;
import org.apache.log4j.helpers.*;
import java.text.*;
import java.util.*;
public class CDF {
    protected static FieldPosition pos = new FieldPosition(0);
    public static void main(String[] args) throws Exception {
        SimpleDateFormat sdf1 = new SimpleDateFormat("yyyy-MMMM-dd HH:mm:ss,SSS");
        CachedDateFormat cdf1 = new CachedDateFormat(sdf1);
        StringBuffer buf = new StringBuffer();
        Calendar c = Calendar.getInstance();
        // Seconds and milliseconds are not set, so they come from the current time.
        c.set(2004, Calendar.DECEMBER, 12, 20, 0);
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
        buf.setLength(0);
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
        buf.setLength(0);
        // Move to a month whose name has a different length.
        c.set(2005, Calendar.JANUARY, 1, 0, 0);
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
        buf.setLength(0);
        cdf1.format(c.getTime(), buf, pos);
        System.out.println(buf.toString());
    }
}
Instead, it produces
2004-December-12 20:00:11,200
2004-December-12 20:00:11,200
2005-January-01 00:00:11,200
2005-January-01 00:00:11,200
which is correct. So much for my predictions.
--
Ceki Gülcü
The complete log4j manual: http://qos.ch/log4j/
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Ceki Gülcü <ce...@qos.ch>.
At 06:11 PM 12/20/2004, Curt Arnold wrote:
>The existing AbsoluteTimeDateFormat, ISO8601DateFormat, and
>DateTimeDateFormat contained buggy caching code and had been effectively
>abandoned since PatternLayout no longer created these classes, but created
>java.text.SimpleDateFormat objects. The proposed resolution was to
>reimplement those classes as wrappers of SimpleDateFormat.
Yes, and IMHO it's quite a bright proposal too.
>The flawed caching code in the unused DateFormat's, if properly
>implemented, could result in a noticeable performance benefit. A new
>class, CachedDateFormat, was written that could wrap any DateFormat.
>If the class is introduced, then PatternLayout should be modified to wrap
>the DateFormat that it constructs with CachedDateFormat. If
>CachedDateFormat proved to be unreliable, then it would be trivial to
>remove by changing a line or so in PatternLayout.
As mentioned earlier, I think CachedDateFormat may fail for certain
patterns at certain dates. If we can recognize the limited number of
formats for which it fails (if it does) and sidestep those, then
fine. Before going any further, do you agree that patterns causing
CachedDateFormat to fail exist and that it's not just me making things
up?
>CachedDateFormat attempted to support multiple digit sets. However, I
>couldn't find any stock Java locales that used a digit set other than 0-9
>in its date formats. I had expected that the Thai locale would use Thai
>digits, but I was wrong.
If I am not mistaken, the existing code in CachedDateFormat only
localizes the digit 0. That may be enough if the SimpleDateFormat
instance and the CachedDateFormat instance use the same locale, but
if not, the output will be inconsistent.
>Date formatting was affected by the current locale and timezone of the
>thread and there was no mechanism to configure a timezone or locale to be
>used. The existing patches added configurable timezones and locales to
>the pattern layout which would modify the behavior of the date
>formats. Based on some of the previous discussions on the Jakarta Commons
>Dev list, I'd like to evaluate whether Appender is a better place for the
>locale to be specified.
>
>What I'd like to do is:
>
>Commit simplifications to the DateFormat classes and add CachedDateFormat,
>but simplified to recognize only Arabic digit sets.
That would be good.
>Review configurable locales and timezones and come back to the list with a
>specific recommendation. My current take is that appender is probably a
>more appropriate place to specify locale. However, that should be
>considered in a bigger scope where locale affects both the layout and
>rendering non-string messages. TimeZone is likely still appropriate to
>configure on the layout.
This raises a much wider question. Should a given customization be
allowed at the logger repository level, logger level, appender level,
at the layout level or at the pattern converter level? Getting the
answer right provides tremendous added value. For example, the named
logger hierarchy propagates 'level' values according to the level
inheritance rule. This in turn provides a very fast, yet meaningful
filtering mechanism for categorizing logging statements. The fact that
we got this question right is one of the main reasons behind log4j's
success. Appender additivity is another example showing that getting
the collaboration rules between components right makes a big
difference.
I happen to think that the logger repository should/can be viewed as
the central point influencing all the components attached to it. For
example,
1) properties of the logger repository should/can be visible at all
component levels.
2) new pattern conversion rules defined at the logger repository level
should/can be shared by all the instances of PatternLayout attached to
that logger repository.
3) a resource bundle attached to a logger repository should/can be
shared by all *loggers* (hint hint), appenders and layouts.
4) The mapping URL (defined below) attached to a logger repository
should/can be shared by all instances of the %logger2 pattern converter.
In "should/can", the "should" part signifies my current inclination to
think of the above as good design. The "can" part means that design is
still open for debate.
What is the mapping URL?
------------------------
We routinely write o.a.l.r.RollingFileAppender instead of
org.apache.log4j.rolling.RollingFileAppender. The first form is almost
as precise and much shorter. Whenever I get the chance, I'd like to
implement a pattern converter named %logger2 which instead of printing
org.apache.log4j.rolling.RollingFileAppender will print
o.a.l.r.RollingFileAppender. The shortened forms will be defined in a
properties file defined by the user. (We will provide a default mapping.)
The location of this mapping will be specified with a URL hence the
term "mapping URL".
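A minimal sketch of how such a mapping could be applied follows. No %logger2 converter exists yet, so everything here is hypothetical: the class LoggerNameAbbreviator, the choice of keying the properties by package prefix, and the longest-prefix lookup are assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical sketch of the lookup a %logger2 converter might perform.
final class LoggerNameAbbreviator {
    // Keys are package prefixes, values are their short forms. In practice
    // the Properties could be loaded from the stream behind the mapping URL.
    private final Properties mapping;

    LoggerNameAbbreviator(Properties mapping) {
        this.mapping = mapping;
    }

    // Replaces the longest mapped prefix of the logger name with its short form.
    String abbreviate(String loggerName) {
        String prefix = loggerName;
        while (true) {
            String shortForm = mapping.getProperty(prefix);
            if (shortForm != null) {
                return shortForm + loggerName.substring(prefix.length());
            }
            int dot = prefix.lastIndexOf('.');
            if (dot < 0) {
                return loggerName; // nothing mapped, print the name unchanged
            }
            prefix = prefix.substring(0, dot);
        }
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("org.apache.log4j.rolling", "o.a.l.r");
        LoggerNameAbbreviator a = new LoggerNameAbbreviator(p);
        System.out.println(a.abbreviate("org.apache.log4j.rolling.RollingFileAppender"));
    }
}
```

Keeping the mapping in one Properties object is what would let a single logger repository share it across all converter instances.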
Coming back to the TimeZone question, we could imagine that a TimeZone
could be set at the LoggerRepository level. This TimeZone would
percolate down to all levels below. However, if needed it could be
overridden at a lower level, e.g. the pattern converter level. Can the
TimeZone influence multiple pattern converters of a PatternLayout? If
that's not a plausible scenario, then it does not make sense to define
a TimeZone at the Appender level nor at the PatternLayout level.
Providing too many or meaningless extension/customization points will
confuse the user, make things harder to manage for her, and make the
code harder to maintain for us. Getting the collaboration rules right
makes all the difference in the world.
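The percolation idea for TimeZone can be sketched as a simple parent lookup. This is only an illustration under assumptions: TimeZoneScope is a made-up class standing in for the repository, layout and pattern converter levels, and nothing like it exists in log4j.

```java
import java.util.TimeZone;

// Hypothetical stand-in for a repository, layout or pattern converter.
final class TimeZoneScope {
    private final TimeZoneScope parent; // null at the repository level
    private TimeZone timeZone;          // null means "inherit from the parent"

    TimeZoneScope(TimeZoneScope parent) {
        this.parent = parent;
    }

    void setTimeZone(TimeZone tz) {
        this.timeZone = tz;
    }

    // Walks up toward the repository until an explicit setting is found.
    TimeZone effectiveTimeZone() {
        for (TimeZoneScope s = this; s != null; s = s.parent) {
            if (s.timeZone != null) {
                return s.timeZone;
            }
        }
        return TimeZone.getDefault(); // nothing configured anywhere
    }

    public static void main(String[] args) {
        TimeZoneScope repository = new TimeZoneScope(null);
        TimeZoneScope layout = new TimeZoneScope(repository);
        TimeZoneScope converter = new TimeZoneScope(layout);
        repository.setTimeZone(TimeZone.getTimeZone("GMT"));
        System.out.println(converter.effectiveTimeZone().getID());
        converter.setTimeZone(TimeZone.getTimeZone("America/Chicago"));
        System.out.println(converter.effectiveTimeZone().getID());
    }
}
```

This mirrors the way level values are inherited in the logger hierarchy: a setting is stored in one place and overridden only where needed.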
--
Ceki Gülcü
The complete log4j manual: http://qos.ch/log4j/
---------------------------------------------------------------------
Re: TimeZone and locale for PatternLayout Was: [RESULT][VOTE]
Posted by Curt Arnold <ca...@apache.org>.
On Dec 20, 2004, at 8:33 AM, Ceki Gülcü wrote:
>
> CachedDateFormat is pretty interesting. However, I suspect that it may
> not work properly in case the data format passed by the user causes
> the length of the returned data to change over time. For example, if
> the format is 'yyyy-MMMMM-dd HH:mm:ss' and the cache is initialized on
> 4th of July, it will work properly for the remainder of the month but
> start returning erroneous results on August. Do you concur with this
> analysis?
>
I'm pretty sure that this is your quote, not mine (though I haven't
gone through the list to check). I think that the current
implementation (in the last patch on the bug report) of
CachedDateFormat safely handles length transitions; however, it would be
good to add unit tests to confirm this.
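As a starting point for such a test, the length transition itself can be shown with the JDK alone. This is a minimal sketch: LengthTransitionSketch is an illustrative name, the pattern uses four 'M' letters, and CachedDateFormat is deliberately left out so that the block does not depend on the patch. A real unit test would compare the cached output against the plain SimpleDateFormat output on both sides of the month boundary.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of the log4j test suite.
final class LengthTransitionSketch {
    // Formats noon GMT on the given day with a full month name.
    static String format(int year, int month, int day) {
        TimeZone gmt = TimeZone.getTimeZone("GMT");
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MMMM-dd HH:mm:ss", Locale.US);
        sdf.setTimeZone(gmt);
        Calendar c = Calendar.getInstance(gmt, Locale.US);
        c.clear();
        c.set(year, month, day, 12, 0, 0);
        return sdf.format(c.getTime());
    }

    public static void main(String[] args) {
        String july = format(2004, Calendar.JULY, 4);
        String august = format(2004, Calendar.AUGUST, 1);
        // "July" and "August" differ in length, so every later field shifts.
        System.out.println(july + " (" + july.length() + " chars)");
        System.out.println(august + " (" + august.length() + " chars)");
    }
}
```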
> As for the locale support in PatternLayout, it is probably
> overkill. It makes the code harder to understand and to maintain. I'd
> wait for someone using a different numbering system to contact us
> before doing it on our own.
>
>> Add TimeZone and locale for PatternLayout, remove obs DateFormat
>> http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=32064
>>
>> This last one I need to review after the recent discussions on
>> jakarta-commons. It may be more appropriate to specify locale on the
>> appender instead of the layout.
>>
There are a couple of issues bundled into this one bug report and it
might be good to separate them and discuss and act on them
individually. The issues as I see them are:
The existing AbsoluteTimeDateFormat, ISO8601DateFormat, and
DateTimeDateFormat contained buggy caching code and had been
effectively abandoned since PatternLayout no longer created these
classes, but created java.text.SimpleDateFormat objects. The proposed
resolution was to reimplement those classes as wrappers of
SimpleDateFormat.
The flawed caching code in the unused DateFormat's, if properly
implemented, could result in a noticeable performance benefit. A new
class, CachedDateFormat, was written that could wrap any DateFormat.
If the class is introduced, then PatternLayout should be modified to
wrap the DateFormat that it constructs with CachedDateFormat. If
CachedDateFormat proved to be unreliable, then it would be trivial to
remove by changing a line or so in PatternLayout.
CachedDateFormat attempted to support multiple digit sets. However, I
couldn't find any stock Java locales that used a digit set other than
0-9 in its date formats. I had expected that the Thai locale would use
Thai digits, but I was wrong.
Date formatting was affected by the current locale and timezone of the
thread and there was no mechanism to configure a timezone or locale to
be used. The existing patches added configurable timezones and locales
to the pattern layout which would modify the behavior of the date
formats. Based on some of the previous discussions on the Jakarta
Commons Dev list, I'd like to evaluate whether Appender is a better
place for the locale to be specified.
What I'd like to do is:
Commit simplifications to the DateFormat classes and add CachedDateFormat,
but simplified to recognize only Arabic digit sets.
Review configurable locales and timezones and come back to the list
with a specific recommendation. My current take is that appender is
probably a more appropriate place to specify locale. However, that
should be considered in a bigger scope where locale affects both the
layout and rendering non-string messages. TimeZone is likely still
appropriate to configure on the layout.
---------------------------------------------------------------------