Posted to dev@lucene.apache.org by Shai Erera <se...@gmail.com> on 2008/12/05 19:17:28 UTC

Java logging in Lucene

Hi

I was wondering why the Lucene code doesn't use Java logging instead of
the infoStream set in IndexWriter. Today, if I want to enable tracing of
Lucene code, the only thing I can do is set an infoStream, but then I get
a great many messages. Moreover, those messages seem to cover indexing code
only.

I hope to get some opinions on the use of Java logging instead of
infoStream, and hopefully to start adding logging messages in other places
in the code (like during search, query parsing, etc.).

I feel that this is an approach the community has to decide on before we
start adding messages to the code. Using Java logging can greatly benefit
tracing of indexing applications that use Lucene. If the vote is +1 for using
Java logging, we can start by deprecating infoStream (in 2.9, remove in 3.0)
and use logging instead.

What do you think?

Shai

Re: Java logging in Lucene

Posted by Earwin Burrfoot <ea...@gmail.com>.
I was referring to the case where you want normal production logs, like
access logs, or whatever.
Debugging with all the common logging implementations is also broken,
because switching logging on/off dramatically changes the multithreading
picture.

On Mon, Dec 8, 2008 at 17:02, Shai Erera <se...@gmail.com> wrote:
> {quote}
> My research shows there are no ready-made java logging frameworks that can
> be used in high-load production environment.
> {quote}
>
> I'm not sure I understand what you mean by that. We use Java logging in our
> high-profile products, which support 100s of tps. Logging is usually turned
> off, and is turned on only for debugging. We have not seen any
> problems with Java logging at runtime (i.e., w/o logging, when only
> logger.isLoggable calls are made) or at debug-time (when actual logging
> happens). Of course, at debug-time performance is slower, but that's debug
> time - you're not there for performance, but for debugging.
>
> Anyway, as far as SLF4J goes, I've written a patch using it, and replacing
> infoStream. I'm about to open an issue and submit the patch, for everyone to
> review. We can continue the discussion there.
>
> Shai
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

Re: Java logging in Lucene

Posted by Shai Erera <se...@gmail.com>.
{quote}
My research shows there are no ready-made java logging frameworks that can
be used in high-load production environment.
{quote}

I'm not sure I understand what you mean by that. We use Java logging in our
high-profile products, which support 100s of tps. Logging is usually turned
off, and is turned on only for debugging. We have not seen any
problems with Java logging at runtime (i.e., w/o logging, when only
logger.isLoggable calls are made) or at debug-time (when actual logging
happens). Of course, at debug-time performance is slower, but that's debug
time - you're not there for performance, but for debugging.

Anyway, as far as SLF4J goes, I've written a patch using it, and replacing
infoStream. I'm about to open an issue and submit the patch, for everyone to
review. We can continue the discussion there.

Shai

On Mon, Dec 8, 2008 at 10:13 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> The common problem with native logging, log4j and slf4j (logback impl)
> is that they are totally unsuitable for actually logging something.
> They do good work checking whether the logging can be avoided, but use
> almost-global locking if you really try to write the line to a file.
> My research shows there are no ready-made Java logging frameworks that
> can be used in a high-load production environment.
>

Re: Java logging in Lucene

Posted by Yonik Seeley <yo...@apache.org>.
On Sat, Dec 6, 2008 at 11:52 AM, Shai Erera <se...@gmail.com> wrote:
> On the performance side, I don't expect to see any different performance
> than what we have today, since checking if infoStream != null should be
> similar to logger.isLoggable (or the equivalent methods from SLF4J).

I'm leery of going down this logging road because people may add
logging statements in inappropriate places, believing that
isLoggable() is about the same as infoStream != null

They seem roughly equivalent because of the context in which they are
tested: coarse grained logging where the surrounding operations
eclipse the logging check.

isLoggable() involves volatile reads, which prevent optimizations and
instruction reordering across the read.  On current x86 platforms, no
memory barrier instructions are needed for a volatile read, but that's
not true of other architectures.
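
To make the comparison concrete, here is a minimal sketch of the two guard
styles being discussed. The class GuardedLogging and its method names are
hypothetical and are not Lucene code; only java.util.logging calls from the
JDK are used.

```java
import java.io.PrintStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical class for illustration only; this is not Lucene code.
class GuardedLogging {
    private static final Logger LOG =
            Logger.getLogger(GuardedLogging.class.getName());

    // infoStream-style sink: a plain field, where null means "off".
    private PrintStream infoStream;

    void setInfoStream(PrintStream stream) {
        this.infoStream = stream;
    }

    // Returns the number of lines written to infoStream, for easy checking.
    int mergeSegments(int segmentCount) {
        int written = 0;

        // Style 1: a plain null check on an ordinary field.
        if (infoStream != null) {
            infoStream.println("merging " + segmentCount + " segments");
            written++;
        }

        // Style 2: ask the logger whether FINE is enabled before building
        // the message string.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("merging " + segmentCount + " segments");
        }
        return written;
    }

    public static void main(String[] args) {
        GuardedLogging worker = new GuardedLogging();
        worker.mergeSegments(3);           // infoStream is null: style 1 prints nothing
        worker.setInfoStream(System.out);
        worker.mergeSegments(3);           // style 1 now prints one line
    }
}
```

Both guards skip building the message when logging is off; the concern raised
here is what the check itself costs once it sits in a hot path rather than
around a coarse-grained operation.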

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Java logging in Lucene

Posted by Earwin Burrfoot <ea...@gmail.com>.
The common problem with native logging, log4j and slf4j (logback impl)
is that they are totally unsuitable for actually logging something.
They do good work checking whether the logging can be avoided, but use
almost-global locking if you really try to write the line to a file.
My research shows there are no ready-made Java logging frameworks that
can be used in a high-load production environment.

On Sat, Dec 6, 2008 at 19:52, Shai Erera <se...@gmail.com> wrote:
> On the performance side, I don't expect to see any different performance
> than what we have today, since checking if infoStream != null should be
> similar to logger.isLoggable (or the equivalent methods from SLF4J).
>
> I'll look at SLF4J, open an issue and work out a patch.
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

Re: Java logging in Lucene

Posted by Shai Erera <se...@gmail.com>.
On the performance side, I don't expect to see any different performance
than what we have today, since checking if infoStream != null should be
similar to logger.isLoggable (or the equivalent methods from SLF4J).

I'll look at SLF4J, open an issue and work out a patch.

On Sat, Dec 6, 2008 at 1:22 PM, Grant Ingersoll <gs...@apache.org> wrote:

>
> On Dec 5, 2008, at 11:36 PM, Shai Erera wrote:
>
>
>> What do you have against JUL? I've used it and in my company (which is
>> quite a large one btw) we've moved to JUL just because it's so easy to
>> configure, comes already with the JDK and very intuitive. Perhaps it has
>> some shortcomings which I'm not aware of, and I hope you can point me at
>> them.
>>
>
> See http://lucene.markmail.org/message/3t2qwbf7cc7wtx6h?q=Solr+logging (or
> http://grantingersoll.com/2008/04/25/logging-frameworks-considered-harmful/ for
> my rant on it!)  Frankly, I could live a quite happy life if I never had to
> think about logging frameworks again!
>
> As for JUL, the bottom line for me is (and perhaps I'm wrong):  It doesn't
> play nice with others (show me a system today that uses open source projects
> which doesn't have at least 2 diff. logging frameworks) and it usually
> requires coding where other implementations don't.  My impression of JUL is
> that the designers wanted Log4j, but somehow they felt they had to come up
> with something "original", and in turn arrived at this thing that is the
> lowest common denominator.  But, like I said, it's a religious debate, eh?
> ;-)
>
> As for logging, you and Jason make good points.  I guess the first thing to
> do would be to submit a patch that adds SLF4J instead of infoStream and then
> we can test performance.  It's still amazing, to me, however, that Lucene has
> made it this long with nothing but rudimentary logging, and only during indexing.
>
>
>
>

Re: Java logging in Lucene

Posted by Grant Ingersoll <gs...@apache.org>.
On Dec 5, 2008, at 11:36 PM, Shai Erera wrote:

>
> What do you have against JUL? I've used it and in my company (which  
> is quite a large one btw) we've moved to JUL just because it's so  
> easy to configure, comes already with the JDK and very intuitive.  
> Perhaps it has some shortcomings which I'm not aware of, and I hope  
> you can point me at them.

See http://lucene.markmail.org/message/3t2qwbf7cc7wtx6h?q=Solr+logging  
(or http://grantingersoll.com/2008/04/25/logging-frameworks-considered-harmful/ 
  for my rant on it!)  Frankly, I could live a quite happy life if I  
never had to think about logging frameworks again!

As for JUL, the bottom line for me is (and perhaps I'm wrong):  It  
doesn't play nice with others (show me a system today that uses open  
source projects which doesn't have at least 2 diff. logging  
frameworks) and it usually requires coding where other implementations  
don't.  My impression of JUL is that the designers wanted Log4j, but  
somehow they felt they had to come up with something "original", and  
in turn arrived at this thing that is the lowest common denominator.   
But, like I said, it's a religious debate, eh? ;-)

As for logging, you and Jason make good points.  I guess the first
thing to do would be to submit a patch that adds SLF4J instead of
infoStream and then we can test performance.  It's still amazing, to me,
however, that Lucene has made it this long with nothing but rudimentary
logging, and only during indexing.


Re: Java logging in Lucene

Posted by Shai Erera <se...@gmail.com>.
Doug,

bq. It's tempting to add our own logging API, as you suggest, but I fear
that would re-invent what so many have re-invented before.

Haven't we already added our own logging API by introducing infoStream in
IndexWriter? All I'm proposing (as an alternative to Java logging) is to
make it a service for all Lucene classes, even contrib. I didn't propose
adding Java logging-like capabilities, like levels (even though I think they're
useful), but instead taking what IW has today (a message() method) and making
a static one for other classes.

bq. What do we tell folks who currently use both log4j and Lucene?  How
would they benefit from this?

I don't think it's such a big deal. To turn on Lucene logging, they have to
introduce some API (or UI) for users/administrators to configure. They then
probably set infoStream to the stream log4j uses.
By using Java logging, all we'll ask them is to configure the Java logging
system, which is pretty easy.
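
As a rough illustration of routing infoStream-style output into a logging
system, the sketch below wraps a java.util.logging Logger in a PrintStream
(JUL is used here instead of log4j only to stay within the JDK). The class
LineLoggingStream, the helper asPrintStream and the logger name are
hypothetical names introduced for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical adapter: collects bytes until a newline, then logs the line.
class LineLoggingStream extends OutputStream {
    private final Logger logger;
    private final Level level;
    private final ByteArrayOutputStream line = new ByteArrayOutputStream();

    LineLoggingStream(Logger logger, Level level) {
        this.logger = logger;
        this.level = level;
    }

    @Override
    public synchronized void write(int b) {
        if (b == '\n') {
            emitLine();
        } else if (b != '\r') {   // drop CR so CRLF line endings also work
            line.write(b);
        }
    }

    // Text without a trailing newline stays buffered until one arrives.
    private void emitLine() {
        String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
        line.reset();
        logger.log(level, text);
    }

    // Wraps the adapter in a PrintStream, the type a diagnostics stream
    // such as infoStream is assumed to expect.
    static PrintStream asPrintStream(Logger logger, Level level) {
        try {
            return new PrintStream(new LineLoggingStream(logger, level), true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError(e);   // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("example.lucene.infostream");
        PrintStream infoStream = asPrintStream(logger, Level.INFO);
        infoStream.println("IW: flush 2 segments");
    }
}
```

With an adapter like this the application keeps a single logging
configuration, but every message still arrives at one level and under one
logger name, which is the limitation discussed in this thread.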

About SLF4J, I'm not familiar with it so I cannot comment. The only thing I
can comment about is the additional jar people would have to add to their
applications. That's really not an issue imo because people already add many
jars to support Lucene. If one uses any contrib package, it's an additional
jar. If one wants to use Snowball, it's 2 jars (the snowball and the contrib
analyzer).
When you use Apache HttpClient, you have to add several jars, which is ok
...

Grant,

What do you have against JUL? I've used it, and in my company (which is quite
a large one btw) we've moved to JUL just because it's so easy to configure,
already comes with the JDK and is very intuitive. Perhaps it has some
shortcomings which I'm not aware of, and I hope you can point me at them.

The argument on whether to choose JUL, commons, log4j or slf4j can go on; I
don't mind participating in it, as I think it's interesting. But the core
question is whether the community (and especially the committers) think that
we need more logging in Lucene, beyond IW's infoStream. If so, we can start
by introducing that InfoStream service class, which will only expose
today's functionality at first (i.e., only indexing code will use it), but will
allow other classes to log operations as well.

I personally would like to use a more standard logging framework
(preferably JUL), but what I want more is the ability to debug my product
after it has been shipped. So even though it's not as great as standard
logging, the InfoStream service is still better than what Lucene has today.

My 2 cents.

Shai

On Sat, Dec 6, 2008 at 12:32 AM, Jason Rutherglen <
jason.rutherglen@gmail.com> wrote:

> As a developer who would like to eventually develop core code in Lucene (I
> started but then things changed drastically and so will wait for flexible
> indexing and other APIs?), a standard logging system would make development
> easier by making debugging easier.  I rely heavily on log analysis in
> developing and debugging search code.  A detailed view of what is happening
> internally will speed development, and as Shai mentioned allow production
> and pre-production systems to be monitored in new ways.
>
> -J
>
>

Re: Java logging in Lucene

Posted by Jason Rutherglen <ja...@gmail.com>.
As a developer who would like to eventually develop core code in Lucene (I
started but then things changed drastically and so will wait for flexible
indexing and other APIs?), a standard logging system would make development
easier by making debugging easier.  I rely heavily on log analysis in
developing and debugging search code.  A detailed view of what is happening
internally will speed development, and as Shai mentioned allow production
and pre-production systems to be monitored in new ways.

-J

On Fri, Dec 5, 2008 at 1:19 PM, Doug Cutting <cu...@apache.org> wrote:

> John Wang wrote:
>
>> If we were to depend on a jar for logging, then why not log4j or
>> commons-logging?
>>
>
> Lucene is used by many applications.  Many of those applications already
> perform some kind of logging.  We'd like whatever Lucene adds to fit in with
> their existing logging framework, not conflict with it.  Thus the motivation
> to use a meta-logging framework like commons logging or slf4j.  And articles
> like the following point towards slf4j, not commons logging.
>
> http://www.qos.ch/logging/thinkAgain.jsp
>
>
> Doug
>
>
>
>
>
>

Re: Java logging in Lucene

Posted by Grant Ingersoll <gs...@apache.org>.
For some additional context, go over to the Solr mail archive and  
search for "Logging, SLF4J" or see http://lucene.markmail.org/message/gxifhjzmn6hgloy7?q=Solr+logging+SLF4J

I personally don't like JUL and would be against using it.  I could,  
maybe, just maybe, be talked into SLF4J.

The other thing I worry about is that the logging will probably be  
carefully crafted at first, but then will grow and grow and end up in  
some tight loops, etc.

Ah, for the days of the C preprocessor, where we could easily deliver  
a version of Lucene w/ logging and without for this kind of  
debugging...  ;-)

-Grant

On Dec 5, 2008, at 4:19 PM, Doug Cutting wrote:

> John Wang wrote:
>> If we were to depend on a jar for logging, then why not log4j or  
>> commons-logging?
>
> Lucene is used by many applications.  Many of those applications  
> already perform some kind of logging.  We'd like whatever Lucene  
> adds to fit in with their existing logging framework, not conflict  
> with it.  Thus the motivation to use a meta-logging framework like
> commons logging or slf4j.  And articles like the following point  
> towards slf4j, not commons logging.
>
> http://www.qos.ch/logging/thinkAgain.jsp
>
> Doug
>
>
>
>
>




Re: Java logging in Lucene

Posted by Doug Cutting <cu...@apache.org>.
John Wang wrote:
> If we were to depend on a jar for logging, then why not log4j or 
> commons-logging?

Lucene is used by many applications.  Many of those applications already 
perform some kind of logging.  We'd like whatever Lucene adds to fit in 
with their existing logging framework, not conflict with it.  Thus the 
motivation to use a meta-logging framework like commons logging or slf4j.
And articles like the following point towards slf4j, not commons logging.

http://www.qos.ch/logging/thinkAgain.jsp

Doug






Re: Java logging in Lucene

Posted by John Wang <jo...@gmail.com>.
I thought the main point is to avoid a jar dependency.
If we were to depend on a jar for logging, then why not log4j or
commons-logging?

-John


On Fri, Dec 5, 2008 at 1:00 PM, Doug Cutting <cu...@apache.org> wrote:

> Shai Erera wrote:
>
>> Perhaps instead of introducing Java logging then (if you're too against
>> it), we could introduce a static InfoStream class, with a static message()
>> and isVerbose() methods.
>>
>
> It's tempting to add our own logging API, as you suggest, but I fear that
> would re-invent what so many have re-invented before.
>
>  As for the logging framework, I'd think that Java logging creates no
>> dependencies for Lucene. java.util.logging exists at least since 1.4.
>> So it's already in the JDK.
>>
>
> Good point.  Java's built-in logging would not add a dependency, but it can
> still conflict.  In other projects with serious logging needs where I've
> tried using Java's built-in logging, we've always ended up switching to
> log4j.  So I worry that choosing Java's logging might not help those who
> need logging, and it would conflict with those who already use log4j.
>
>  You might argue that some applications
>> who embed a search component over Lucene use a different logging
>> system (such as Log4j), but in that case I think it'd be fair to say
>> that Java logging is what Lucene uses.
>>
>
> What do we tell folks who currently use both log4j and Lucene?  How would
> they benefit from this?
>
> A meta-logger like SLF4J seems preferable, since it could integrate with
> whatever logging system folks already use.  Adding this would be an
> incompatible change, since folks would need to add new jars into their
> applications besides the lucene jar.  But that's perhaps not a huge burden.
>  What do others think?
>
> Doug
>
>
>
>

Re: Java logging in Lucene

Posted by Doug Cutting <cu...@apache.org>.
Shai Erera wrote:
> Perhaps instead of introducing Java logging then (if you're too against 
> it), we could introduce a static InfoStream class, with a static
> message() and isVerbose() methods.

It's tempting to add our own logging API, as you suggest, but I fear 
that would re-invent what so many have re-invented before.

> As for the logging framework, I'd think that Java logging creates no
> dependencies for Lucene. java.util.logging exists at least since 1.4.
> So it's already in the JDK.

Good point.  Java's built-in logging would not add a dependency, but it
can still conflict.  In other projects with serious logging needs
where I've tried using Java's built-in logging, we've always ended
up switching to log4j.  So I worry that choosing Java's logging might
not help those who need logging, and it would conflict with those who
already use log4j.

> You might argue that some applications
> who embed a search component over Lucene use a different logging
> system (such as Log4j), but in that case I think it'd be fair to say
> that Java logging is what Lucene uses.

What do we tell folks who currently use both log4j and Lucene?  How 
would they benefit from this?

A meta-logger like SLF4J seems preferable, since it could integrate with 
whatever logging system folks already use.  Adding this would be an 
incompatible change, since folks would need to add new jars into their 
applications besides the lucene jar.  But that's perhaps not a huge 
burden.  What do others think?

Doug


Re: Java logging in Lucene

Posted by Shai Erera <se...@gmail.com>.
BTW, one thing I forgot to add: with infoStream, it's very difficult to
extend the level of output if one wants, for example, to add logging
messages to the search part (or other parts). The reason is that one would
need to pass infoStream down through too many classes. With Java logging,
by contrast, each class is responsible for its own logging (by obtaining a
Logger instance given the class name). You can later turn logging on/off per
package/class.
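
A minimal sketch of that per-class, per-package control with
java.util.logging follows. The logger names under "example." are made up to
mimic a package layout and are not Lucene's; the calls themselves are
standard JDK API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical logger names that mimic a package layout; not Lucene's.
class PerPackageLevels {
    // Each class would normally hold its own logger, named after the class.
    static final Logger INDEX = Logger.getLogger("example.index.MergeWorker");
    static final Logger SEARCH = Logger.getLogger("example.search.Searcher");

    // Logger for the whole "example.search" package. A strong reference is
    // kept so the configured level is not lost to garbage collection.
    static final Logger SEARCH_PACKAGE = Logger.getLogger("example.search");

    // Turn on FINE logging for everything under example.search only.
    static void enableSearchTracing() {
        SEARCH_PACKAGE.setLevel(Level.FINE);
    }

    public static void main(String[] args) {
        enableSearchTracing();
        System.out.println("search FINE enabled: " + SEARCH.isLoggable(Level.FINE));
        System.out.println("index FINE enabled:  " + INDEX.isLoggable(Level.FINE));
    }
}
```

Loggers without an explicit level inherit from their nearest named ancestor,
which is what makes the per-package switch work. Note that handlers have
their own levels too, so with the JDK's default configuration the console
handler would also need lowering before FINE records actually appear.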

Perhaps instead of introducing Java logging then (if you're too much against
it), we could introduce a static InfoStream class, with static message() and
isVerbose() methods. That way, all classes that wish to log a message can
use it, and it will be easier to add messages in the future from other
classes.
Even though it won't allow controlling which classes/packages will output to
the log file, it will make Lucene logging easier to extend. Would that
make more sense?
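
For discussion, such a class might look roughly like the sketch below. Only
the message() and isVerbose() names come from the proposal; the class name
InfoStreamSketch, the setStream() method and the source-prefix format are
assumptions made for this example.

```java
import java.io.PrintStream;

// Hypothetical sketch of the proposed static service; names are assumptions.
final class InfoStreamSketch {
    // null means logging is off. Declared volatile so a stream set by one
    // thread is visible to the others (a design choice of this sketch).
    private static volatile PrintStream out;

    private InfoStreamSketch() {}

    static void setStream(PrintStream stream) {
        out = stream;
    }

    static boolean isVerbose() {
        return out != null;
    }

    // Prefixes each message with the caller-supplied source name.
    static void message(String source, String text) {
        PrintStream stream = out;   // read the field once
        if (stream != null) {
            stream.println(source + ": " + text);
        }
    }

    public static void main(String[] args) {
        InfoStreamSketch.setStream(System.out);
        if (InfoStreamSketch.isVerbose()) {
            InfoStreamSketch.message("QueryParser", "parsed query: title:lucene");
        }
    }
}
```

Any class could then call InfoStreamSketch.message(...) without having a
stream passed down to it, at the cost of a single global on/off switch.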

I still would prefer to see Java logging embedded, but if that's
unacceptable to the community, then having the above solution is better than
what we have today.

On Fri, Dec 5, 2008 at 9:38 PM, Shai Erera <se...@gmail.com> wrote:

> Have you ever tried to debug your search application after it was shipped
> to a customer? When problems occur on the customer end, you cannot
> easily reproduce them because customers don't like to give you access to
> their systems, and they are not always willing to share the index with you,
> let alone the documents that have been indexed.
>
> Logging is very common in products just for that purpose. Of course I can
> use debugging when something happens in my development environment. But
> that's not the case after the product has shipped.
>
> As for the logging framework, I'd think that Java logging creates no
> dependencies for Lucene. java.util.logging exists at least since 1.4. So
> it's already in the JDK. You might argue that some applications who embed a
> search component over Lucene use a different logging system (such as Log4j),
> but in that case I think it'd be fair to say that Java logging is what
> Lucene uses.
>
> You already do it today - you say that you use infoStream which prints
> messages. Only the solution in Lucene today cannot be customized. I either
> turn on *logging* for the entire Lucene package (or actually just the
> indexing part) or not. I cannot, for example, turn on *logging* just for the
> merge part.
>
> The debugging on the customer side is mostly what I'm after. My experience
> with another search library (proprietary) with exactly the same *logging*
> capabilities as Lucene (you either turn logging on/off for everything),
> although it contained messages from other parts of the search library as
> well, shows that it's extremely difficult to debug what's going on during
> search on the customer side. Sometimes, all the application can log is that
> it adds a document with some attributes, but if you really want to
> understand what's going on inside Lucene, it's impossible. One useful
> piece of information might be the actual tokens that were added to the
> index. There's no way the application can tell you that, w/o running the
> Analyzer on the text. But then it needs to write code, which I think could
> have been written in Lucene.
> Another useful piece of information is the query that's actually being run.
> I guess that printing the QueryParser Query output object might be enough,
> but you never know.
> Maybe you'd like to know what indexes participated in the search, in case
> of a distributed indexing scenario.
>
> And the list can only grow ...
>
> Like I said in my first email - logging is an approach the community has to
> make, w/o neccessarily going over all the existing code and add messages.
> Those can be added over time, by many people who'd like to get detailed
> information from Lucene.
>
> I hope my intentions are clearer now.
>
>
> On Fri, Dec 5, 2008 at 9:06 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>>
>> I also feel that the primary usage of the internal messaging in Lucene
>> today is debugging, and we don't need a logging framework for that.
>>
>> Mike
>>
>>
>> Doug Cutting wrote:
>>
>>  The infoStream stuff goes back to 1997, before there was log4j or any
>>> other Java logging framework.
>>>
>>> There's never been a big push to add logging to Lucene.  It would add a
>>> dependency, and Lucene's jar has always been standalone, which is nice.
>>>  Dependencies can conflict.  If Lucene requires one version of a dependency,
>>> then it may not work well with code that require a different version of that
>>> dependency.
>>>
>>> And it hasn't been clear which framework to adopt.  Log4j is the
>>> granddaddy, then there's Java logging and commons logging.  Today the
>>> preferred framework is probably SLF4J.  Good thing we didn't choose the
>>> wrong one years ago!
>>>
>>> And how many log entries would folks really want to see per query or
>>> document indexed?  In production I don't think most folks want to see more
>>> than one entry per query or document indexed.  So finer-grained logging
>>> would be for debugging.  For that one can instead use a debugger.  Hence the
>>> traditional lack of demand for detailed logging in Lucene.
>>>
>>> That's the history as I recall it.  The future is less clear.
>>>
>>> Doug
>>>
>>> Grant Ingersoll wrote:
>>>
>>>> I think the main motivation has always been to have no dependencies in
>>>> the core so as to keep it as fast and lightweight as possible.  Then, of
>>>> course, there is always the usual religious wars around which logging
>>>> framework to use, not to mention the nightmare that is trying to manage
>>>> multiple logging frameworks across several projects that are being
>>>> integrated.  Then, of course, there is the question of how useful any core
>>>> Lucene logs would be to users writing search applications.  For the most
>>>> part, my experience has been that I want logging to tell me when a document
>>>> was added, when searches occur, etc. but I don't necessarily need to know
>>>> things like the fact that Lucene is now entering the analysis phase of
>>>> Document inversion.  And, for all these needs, I can just as well do that
>>>> logging in the application and not in Lucene.
>>>> All that is not to say we couldn't add in logging, I'm just suggesting
>>>> reasons I can think of for why it has not been added to date and why I am
>>>> not sure it needs to be there going forward.  I believe various other people
>>>> have contributed reasons in the past.  I seem to recall Doug spelling some
>>>> out, but don't have the thread handy.
>>>> -Grant
>>>> On Dec 5, 2008, at 1:17 PM, Shai Erera wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I was wondering why doesn't the Lucene code uses Java logging, instead
>>>>> of the infoStream set in IndexWriter? Today, if I want to enable tracing of
>>>>> Lucene code, the only thing I can do is set an infoStream, but then I get
>>>>> many many messages. Moreoever, those messages seem to cover indexing code
>>>>> only.
>>>>>
>>>>> I hope to get some opinions on the use of Java logging instead of
>>>>> infoStream, and hopefully to start addind logging messages in other places
>>>>> in the code (like during search, query parsing etc.)
>>>>>
>>>>> I feel that this is an approach the community has to decide on before
>>>>> we start adding messages to the code. Using Java logging can greatly benefit
>>>>> tracing of indexing applications who use Lucene. If the vote is +1 for using
>>>>> Java logging, we can start by deprecating infoStream (in 2.9, remove in 3.0)
>>>>> and use logging instead.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Shai
>>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>

Re: Java logging in Lucene

Posted by Shai Erera <se...@gmail.com>.
Have you ever tried to debug your search application after it was shipped to
a customer? When problems occur on the customer's end, you cannot easily
reproduce them, because customers don't like to give you access to their
systems, and they are not always willing to share the index with you, let
alone the documents that have been indexed.

Logging is very common in products just for that purpose. Of course I can
use debugging when something happens in my development environment. But
that's not the case after the product has shipped.

As for the logging framework, I'd think that Java logging creates no
dependencies for Lucene. java.util.logging has existed since at least Java
1.4, so it's already in the JDK. You might argue that some applications that
embed a search component over Lucene use a different logging system (such as
Log4j), but in that case I think it'd be fair to say that Java logging is
what Lucene uses.
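
To illustrate the point about applications that use a different logging
system: a rough sketch of how such an application could route
java.util.logging records into its own log. The logger name
"org.apache.lucene" and the class names here are made up for illustration
(Lucene defines no such logger today), and the lambda sink stands in for
whatever Log4j or other call the application would make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch only: class and logger names are hypothetical.
class ForwardingHandlerSketch {

    /** Forwards each accepted record to a sink supplied by the host application. */
    static final class ForwardingHandler extends Handler {
        private final BiConsumer<Level, String> sink;

        ForwardingHandler(BiConsumer<Level, String> sink) {
            this.sink = sink;
        }

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                sink.accept(record.getLevel(), record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    static List<String> demo() {
        // Stand-in for the application's own log (e.g. a Log4j logger).
        List<String> appLog = new ArrayList<>();

        Logger lucene = Logger.getLogger("org.apache.lucene"); // assumed name
        Handler handler = new ForwardingHandler((level, msg) -> appLog.add(level + " " + msg));
        lucene.setLevel(Level.INFO);
        lucene.setUseParentHandlers(false); // do not also print to the console
        lucene.addHandler(handler);

        lucene.info("opened index");

        lucene.removeHandler(handler);
        return appLog;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The application installs one handler and keeps its own logging system; the
library itself only depends on the JDK.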

You already do it today - you say that you use infoStream, which prints
messages. But the solution in Lucene today cannot be customized. I either
turn on *logging* for the entire Lucene package (or actually just the
indexing part) or not. I cannot, for example, turn on *logging* just for the
merge part.
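
A minimal sketch of what I mean by turning on logging just for the merge
part, using only java.util.logging. The logger names
"org.apache.lucene.index" and "org.apache.lucene.index.merge" are
assumptions for the example; nothing in Lucene creates them today. The
in-memory handler is only there so the result is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Sketch only: the logger names below are hypothetical.
class SelectiveLoggingSketch {
    static final Logger INDEX_LOG = Logger.getLogger("org.apache.lucene.index");
    static final Logger MERGE_LOG = Logger.getLogger("org.apache.lucene.index.merge");

    /** Collects published messages in memory instead of printing them. */
    static final class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                messages.add(record.getLoggerName() + ": " + record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    static List<String> run() {
        CollectingHandler handler = new CollectingHandler();
        handler.setLevel(Level.ALL);
        INDEX_LOG.setUseParentHandlers(false);
        INDEX_LOG.addHandler(handler);

        // Indexing in general stays quiet; only the merge logger is verbose.
        INDEX_LOG.setLevel(Level.WARNING);
        MERGE_LOG.setLevel(Level.FINE);

        INDEX_LOG.fine("flushing segment");   // dropped: below WARNING
        MERGE_LOG.fine("merging 3 segments"); // kept: reaches the parent's handler

        INDEX_LOG.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The same thing can be done from a logging.properties file without touching
code, which is the kind of customization infoStream does not offer.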

Debugging on the customer side is mostly what I'm after. My experience
with another (proprietary) search library with exactly the same *logging*
capabilities as Lucene (you either turn logging on or off for everything),
although it contained messages from other parts of the search library as
well, shows that it's extremely difficult to debug what's going on during
search on the customer side. Sometimes, all the application can log is that
it adds a document with some attributes, but if you really want to
understand what's going on inside Lucene, it's impossible. One useful piece
of information might be the actual tokens that were added to the index.
There's no way the application can tell you that without running the
Analyzer on the text. But then it needs to write code, which I think could
have been written in Lucene.
Another useful piece of information is the query that's actually being run.
I guess that printing the Query object that QueryParser outputs might be
enough, but you never know.
Maybe you'd like to know which indexes participated in the search, in the
case of a distributed indexing scenario.
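
As a rough sketch of the token example, assuming a hypothetical logger
named "org.apache.lucene.analysis" and a toy tokenize() method standing in
for a real Analyzer: the message is only built when the level is enabled,
so the check costs next to nothing when logging is off.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: logger name, class and methods are hypothetical.
class TokenTraceSketch {
    static final Logger LOG = Logger.getLogger("org.apache.lucene.analysis");

    // Counts how often the (potentially expensive) message was built.
    static int messagesBuilt = 0;

    /** Toy stand-in for an Analyzer: lowercases and splits on whitespace. */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    static List<String> addDocument(String field, String text) {
        List<String> tokens = tokenize(text);
        // Guard first, so the string concatenation is skipped when disabled.
        if (LOG.isLoggable(Level.FINER)) {
            messagesBuilt++;
            LOG.finer("field=" + field + " tokens=" + tokens);
        }
        return tokens;
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.FINER);
        System.out.println(addDocument("body", "Java Logging in Lucene"));
        System.out.println("messages built: " + messagesBuilt);
    }
}
```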

And the list can only grow ...

Like I said in my first email - logging is an approach the community has to
decide on, without necessarily going over all the existing code and adding
messages. Those can be added over time, by the many people who'd like to get
detailed information from Lucene.

I hope my intentions are clearer now.

Re: Java logging in Lucene

Posted by Michael McCandless <lu...@mikemccandless.com>.
I also feel that the primary usage of the internal messaging in Lucene  
today is debugging, and we don't need a logging framework for that.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Java logging in Lucene

Posted by Doug Cutting <cu...@apache.org>.
The infoStream stuff goes back to 1997, before there was log4j or any
other Java logging framework.

There's never been a big push to add logging to Lucene.  It would add a
dependency, and Lucene's jar has always been standalone, which is nice.
Dependencies can conflict.  If Lucene requires one version of a
dependency, then it may not work well with code that requires a different
version of that dependency.

And it hasn't been clear which framework to adopt.  Log4j is the 
granddaddy, then there's Java logging and commons logging.  Today the 
preferred framework is probably SLF4J.  Good thing we didn't choose the 
wrong one years ago!

And how many log entries would folks really want to see per query or 
document indexed?  In production I don't think most folks want to see 
more than one entry per query or document indexed.  So finer-grained 
logging would be for debugging.  For that one can instead use a 
debugger.  Hence the traditional lack of demand for detailed logging in 
Lucene.

That's the history as I recall it.  The future is less clear.

Doug

Re: Java logging in Lucene

Posted by Grant Ingersoll <gs...@apache.org>.
I think the main motivation has always been to have no dependencies in
the core so as to keep it as fast and lightweight as possible.  Then,
of course, there are always the usual religious wars around which
logging framework to use, not to mention the nightmare that is trying
to manage multiple logging frameworks across several projects that are
being integrated.  Then, of course, there is the question of how
useful any core Lucene logs would be to users writing search
applications.  For the most part, my experience has been that I want
logging to tell me when a document was added, when searches occur,
etc., but I don't necessarily need to know things like the fact that
Lucene is now entering the analysis phase of Document inversion.  And,
for all these needs, I can just as well do that logging in the
application and not in Lucene.

All that is not to say we couldn't add in logging; I'm just suggesting
reasons I can think of for why it has not been added to date and why I
am not sure it needs to be there going forward.  I believe various
other people have contributed reasons in the past.  I seem to recall
Doug spelling some out, but don't have the thread handy.

-Grant
