Posted to java-user@lucene.apache.org by abdul aleem <ja...@yahoo.com> on 2006/12/11 13:04:13 UTC

Using Lucene to search log files

Hi All,

I'm a Lucene newbie.


Requirement:
==============
a) Build a log viewer tool that searches log files by
keyword and timestamp.

b) Production generates approximately 200 log files per day,
and each log file ranges from 1 MB to 5 MB.

Lucene
========
We want to use Lucene's search capabilities, especially
to search the content of all 200 log files quickly.

a) Search criteria:
    i) Timestamp search: fetch the contents between any
       two timestamps.

   ii) Keyword search: fetch the log file contents that
       match a specified keyword.


Query
========
    a) I would greatly appreciate suggestions on whether
       Lucene is an appropriate tool for this requirement.

    b) I have tried to use SpanQuery, but I am struggling to
       fetch the entire contents (e.g. between two timestamps).

    c) I have also looked at LargeScaleDateRangeProcessing
       in the wiki. Is that the right approach for this
       requirement?



  Any help / suggestion would be greatly appreciated,


  Many thanks in advance,
  Abdul    



 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Using Lucene to search log files

Posted by Erick Erickson <er...@gmail.com>.

As far as the appropriateness of Lucene, it's an open question, but I think
it'd be fine. If it isn't, you have an "interesting" problem <G>.

About timestamps: this has been discussed a LOT on this list, since they're
not as straightforward as you might assume. See the thread "Date ranges -
getting the approach right" for an exposition on what it's all about. The
thing you *must* understand is that some forms of a query will throw a "too
many clauses" exception, especially if you store your dates to, say,
millisecond resolution and use the intuitive query forms. Under the covers,
if you ask for, say, all documents between 12:00 and 13:00, Lucene will expand
this to a big query with a clause for every distinct value in your index that
satisfies the range. For instance, if there are 2,000 different time
values in your index between the two times, there will be 2,000 clauses. If
there are 10,000 documents, but only 10 different times between these two
values, you'll get 10 clauses. Lucene defaults to 1024 maximum clauses, and if
your query expands to more than this, you get the TooManyClauses exception.

This does not apply to Filters, and there are specialty classes for dealing
with this issue. Also, you have some control over how many clauses by
choosing the resolution you store in your index. In the above, if you stored
only by minute, you'd never get more than 60 clauses in an hour.
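
To make the arithmetic concrete, here is a rough sketch in plain Java. It does not use Lucene at all: the class RangeClauseEstimator and its methods are made-up names for illustration, and the only assumption is the behaviour described above (one clause per distinct term inside the range). Timestamps are formatted as sortable strings, and the resolution is simply the length of the pattern.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: estimates how many clauses a range
// query would expand into by counting the distinct terms inside the range.
class RangeClauseEstimator {

    // Formats a timestamp as a lexicographically sortable term.
    // The pattern decides the resolution, e.g. "yyyyMMddHHmm" is minute resolution.
    static String toTerm(LocalDateTime ts, String pattern) {
        return ts.format(DateTimeFormatter.ofPattern(pattern));
    }

    // Counts the distinct terms between lower and upper, both inclusive.
    static int clauseCount(List<LocalDateTime> timestamps, String pattern,
                           LocalDateTime lower, LocalDateTime upper) {
        TreeSet<String> terms = new TreeSet<>();
        for (LocalDateTime ts : timestamps) {
            terms.add(toTerm(ts, pattern));
        }
        return terms.subSet(toTerm(lower, pattern), true,
                            toTerm(upper, pattern), true).size();
    }

    public static void main(String[] args) {
        // 3,000 log entries spread evenly over one hour, one every 1.2 seconds.
        LocalDateTime start = LocalDateTime.of(2006, 12, 11, 12, 0);
        List<LocalDateTime> entries = new ArrayList<>();
        for (int i = 0; i < 3000; i++) {
            entries.add(start.plusNanos(i * 1_200_000_000L));
        }
        LocalDateTime end = start.plusHours(1);
        // Millisecond resolution: every entry is its own term.
        System.out.println(clauseCount(entries, "yyyyMMddHHmmssSSS", start, end)); // prints 3000
        // Minute resolution: at most one term per minute.
        System.out.println(clauseCount(entries, "yyyyMMddHHmm", start, end));      // prints 60
    }
}
```

With the same 3,000 entries, millisecond terms would blow past the 1024 default, while minute terms stay at 60.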

And there are more graceful ways around this, so don't be discouraged.

I'm sure this is confusing (I know it certainly confused me for a long
time). My hope is that as you work with the process and run across issues,
you'll be able to say "Oh, that is what they were talking about". And be of
good cheer: these are not show-stoppers at all. They have been dealt with
successfully on a wide range of projects.

Search the mail archive for date, daterange, toomanyclauses, etc. and you'll
see the discussions.

Best
Erick


Re: Using Lucene to search log files

Posted by abdul aleem <ja...@yahoo.com>.

Many thanks Grant,
I will now get my hands dirty with Lucene to meet our
requirements.

regards,
Abdul




 

Re: Using Lucene to search log files

Posted by Grant Ingersoll <gs...@apache.org>.

See below

On Dec 11, 2006, at 7:04 AM, abdul aleem wrote:

> Query
> ========
>     a ) Would greatly appreciate if some suggestions
>         whether Lucene will be appropriate tool for
> the requirement ??
>

Yes, I think this is a reasonable application of Lucene. You will
probably need to customize analysis to handle your log files, but
the rest should be pretty straightforward.

>
>     b) I have tried to use SpanQuery however
> struggling to fetch entire conents e.g. (between two
> timestamps)

You're probably better off using the approach mentioned in c). Also
search this list for usage of RangeFilter and for log file searching,
as I am sure this topic has come up before.
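
As a rough illustration of why a filter sidesteps the clause limit, here is a conceptual model in plain Java. It is NOT Lucene's RangeFilter implementation; RangeFilterSketch, its toy postings map, and the method names are all invented for this sketch. The idea it shows is only this: walk the sorted terms inside the range and set one bit per matching document, without ever building a query clause per term.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.TreeMap;

// Conceptual model only; this is NOT how Lucene's RangeFilter is written.
class RangeFilterSketch {
    // Toy inverted index: sorted term -> numbers of the documents containing it.
    private final TreeMap<String, List<Integer>> postings = new TreeMap<>();

    void add(String term, List<Integer> docNumbers) {
        postings.put(term, docNumbers);
    }

    // Sets one bit per document whose term lies in [lower, upper].
    // The number of terms in the range does not matter; no clauses are created.
    BitSet bits(String lower, String upper) {
        BitSet result = new BitSet();
        for (List<Integer> docs : postings.subMap(lower, true, upper, true).values()) {
            for (int doc : docs) {
                result.set(doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RangeFilterSketch index = new RangeFilterSketch();
        index.add("200612111200", Arrays.asList(0, 1));
        index.add("200612111230", Arrays.asList(2));
        index.add("200612111400", Arrays.asList(3));
        // Documents 0, 1 and 2 fall between 12:00 and 13:00; document 3 does not.
        System.out.println(index.bits("200612111200", "200612111300")); // prints {0, 1, 2}
    }
}
```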

>
>     c) I had also looked at
> LargeScaleDateRangeProcessing in the wiki, is that a
> right approach for the requirement

--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ




Re: Lucene id generation

Posted by karl wettin <ka...@gmail.com>.

On 11 Dec 2006, at 16:15, Waheed Mohammed wrote:

>
> Is there a way to influence lucene's generation of ids while indexing.

If you mean the Lucene "document number", then no. Are you also
aware that document numbers are subject to change at any time (for
example, during optimization), without any notification of what was
changed to what?

>
> my requirement is. I want to have different indexes where no index  
> should have
> ids that have been assigned to an index earlier.

You'll have to manage those identifiers yourself and add them in a
stored field.


Re: Lucene id generation

Posted by Erick Erickson <er...@gmail.com>.

I don't believe that this is possible. Or desirable. Lucene IDs are mutable,
even within an index. That is, if you index docs that get, say, IDs 1, 2, 3,
4, 5 and delete doc 2 and optimize, Docs 4 and 5 get reassigned IDs 3 and 4
(or something similar).
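
The renumbering can be pictured with a toy model in plain Java. This is not Lucene code: ToySegment is a made-up class in which a document's number is simply its position in a list (zero-based here, unlike the 1 to 5 in the example above), and compact() stands in for whatever an optimize does to squeeze out deleted documents.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not Lucene internals: a document's number is its list position.
class ToySegment {
    private final List<String> docs = new ArrayList<>();
    private final List<Boolean> deleted = new ArrayList<>();

    // Adds a document and returns the number it currently has.
    int add(String externalId) {
        docs.add(externalId);
        deleted.add(false);
        return docs.size() - 1;
    }

    // Marks a document as deleted; numbers do not change yet.
    void delete(int docNumber) {
        deleted.set(docNumber, true);
    }

    // Drops deleted documents, which shifts every later document down.
    void compact() {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                kept.add(docs.get(i));
            }
        }
        docs.clear();
        docs.addAll(kept);
        deleted.clear();
        for (int i = 0; i < kept.size(); i++) {
            deleted.add(false);
        }
    }

    // Current number of the document with this external id, or -1 if absent.
    int docNumberOf(String externalId) {
        return docs.indexOf(externalId);
    }

    public static void main(String[] args) {
        ToySegment segment = new ToySegment();
        for (String id : new String[] {"A", "B", "C", "D", "E"}) {
            segment.add(id);
        }
        System.out.println(segment.docNumberOf("D")); // prints 3
        segment.delete(1);                            // delete "B"
        segment.compact();
        System.out.println(segment.docNumberOf("D")); // prints 2
    }
}
```

The external id "D" never changes, but the number it is found under does, which is the reason not to treat that number as an identity.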

You're far better off controlling this yourself. That is, forget Lucene IDs
and make up your own unique IDs that you can guarantee form disjoint sets
across your multiple indexes, then work with those.
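
One possible way to get disjoint sets, sketched in plain Java: give each index its own fixed-size block of numbers. DisjointIdAllocator and the block size of 100 are assumptions made up for this example, and the returned value would be stored in an ordinary field of your own rather than fed to Lucene as a document number.

```java
// Hypothetical allocator: each index owns one fixed-size block of ids,
// so ids handed out for different indexes can never collide.
class DisjointIdAllocator {
    private final long blockSize;
    private final long[] nextOffset; // next unused offset inside each index's block

    DisjointIdAllocator(int indexCount, long blockSize) {
        this.blockSize = blockSize;
        this.nextOffset = new long[indexCount];
    }

    // Returns the next unused id for the given index (0-based index number).
    long nextId(int indexNumber) {
        if (nextOffset[indexNumber] >= blockSize) {
            throw new IllegalStateException("id block exhausted for index " + indexNumber);
        }
        return indexNumber * blockSize + nextOffset[indexNumber]++;
    }

    public static void main(String[] args) {
        DisjointIdAllocator ids = new DisjointIdAllocator(3, 100);
        System.out.println(ids.nextId(0)); // prints 0
        System.out.println(ids.nextId(0)); // prints 1
        System.out.println(ids.nextId(1)); // prints 100
        System.out.println(ids.nextId(2)); // prints 200
    }
}
```

A fixed block size is the simplest choice but caps each index; a shared counter persisted somewhere would avoid the cap at the cost of coordination.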

Best
Erick


Re: Lucene id generation

Posted by Waheed Mohammed <wa...@fiz-technik.de>.

Thanks for the instant reply.
I see that what Rajesh advises is something like what MultiReader does.
That would be my last resort because of the complexity it would introduce
in developing the business case I have.
Any pointer to an alternative would be appreciated.



-- 
A W Mohammed
Software Entwickler
FIZ-technik e.V
Frankfut am Main



RE: Lucene id generation

Posted by Chris Hostetter <ho...@fucit.org>.

: Exactly in this scenario, I would love to use my custom generated document
: id.
: The array reference number is MyId & its value is some-interested-value
: matched to MyID.
:
: So,how can I generate custom document id.?

You can't. What you can do is index your custom "MyID" value and use the
FieldCache to look it up by the Lucene docid.

That's what Karl is referring to: he uses the Lucene docid as the *index*
into the array, and his MyID values are *stored* in the array.
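
The shape of that lookup, as a plain-Java sketch with no Lucene calls: MyIdLookup is a made-up stand-in for the array a field cache would hand you, and the list passed to repopulate() is assumed to hold each document's MyID in current docid order.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for a cached field: one MyID per *current* docid.
// The docid is only the array index; the MyID is the stored value.
class MyIdLookup {
    private int[] myIdByDocId = new int[0];

    // Rebuilds the array. Call this again after any index operation
    // that may have renumbered documents.
    void repopulate(List<Integer> myIdsInDocOrder) {
        int[] fresh = new int[myIdsInDocOrder.size()];
        for (int docId = 0; docId < fresh.length; docId++) {
            fresh[docId] = myIdsInDocOrder.get(docId);
        }
        myIdByDocId = fresh;
    }

    // Cheap lookup, e.g. from inside a hit collector.
    int myIdFor(int docId) {
        return myIdByDocId[docId];
    }

    public static void main(String[] args) {
        MyIdLookup lookup = new MyIdLookup();
        lookup.repopulate(Arrays.asList(500, 501, 502));
        System.out.println(lookup.myIdFor(2)); // prints 502
        // After the document with MyID 501 is removed and docids shift:
        lookup.repopulate(Arrays.asList(500, 502));
        System.out.println(lookup.myIdFor(1)); // prints 502
    }
}
```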





-Hoss



RE: Lucene id generation

Posted by Ramana Jelda <ra...@ciao-group.com>.

Hi Hoss,
This is exactly the scenario where I would love to use my own custom
document id. The array index would be MyID, and the value would be some
value of interest associated with that MyID.

So, how can I generate a custom document id?

Jelda


Re: Lucene id generation

Posted by Chris Hostetter <ho...@fucit.org>.

Karl: it sounds like you are just referring to using the Lucene docid as an
array index for the FieldCache of your "MyID" field. That's a perfectly
valid use of the docid, the key being that you aren't expecting the id to
contain any meaningful data itself. It's just a reference number.




-Hoss



Re: Lucene id generation

Posted by karl wettin <ka...@gmail.com>.

On 11 Dec 2006, at 20:04, Chris Hostetter wrote:

> if you are trying to think of Lucene's docid as a meaningful  
> number, you
> are doing something wrong.

There is one place where I use it. The index is add-only, and
the only data that interests me is the stored field MyID, which I also
track in an int[] indexed by docid. If an index operation changes the
docids, I simply repopulate the int[].

I pass this MyID value around quite a bit, starting in the hit
collector. It saves me the time of deserializing all hits.

Is this reasonable?





RE: Lucene id generation

Posted by Chris Hostetter <ho...@fucit.org>.

If you are trying to think of Lucene's docid as a meaningful number, you
are doing something wrong.

A lot of people want to view Lucene docids the same way they look at
auto-incremented unique keys in a database. Don't do that. Instead,
think of them as memory addresses in C or C++: they are a handy
numeric value that tells Lucene at what offset in various segment files
it can find data about that document. As your index changes and data
gets moved around, docids change.

The best analogy to a database is not auto-generated unique keys, it's
row numbers: the physical position of a row in the sequence of rows in
your table. That is a number most databases never give you access to
unless you are dealing with the low-level internals of the table, because
as you add or delete lots of data, or drop and load new indexes, those
numbers can change.

If you want control of a unique ID for each of the documents in your
index, add one as a field just like any other.






-Hoss



RE: Lucene id generation

Posted by Ramana Jelda <ra...@ciao-group.com>.

I really miss this feature in Lucene too.
Whatever Mohammed's requirements are, I certainly see room for some
improvements in search performance here.

My question is: why doesn't Lucene provide a mechanism for supplying
custom document ids?



Re: Lucene id generation

Posted by Find Me <fi...@gmail.com>.

On 12/11/06, Waheed Mohammed <wa...@fiz-technik.de> wrote:
>
> Hello,
>
> Is there a way to influence lucene's generation of ids while indexing.
>
> my requirement is. I want to have different indexes where no index should
> have
> ids that have been assigned to an index earlier.
> for instance
> IDX1 : {0.........100}
> IDX2: {101.......200}
> IDX3: {201.......300}
> but not
> IDX1 : {0.........100}
> IDX2 : {0.........100}
> IDX3 : {0.........100}

I don't think you should be doing that. If you want the same effect,
you can package hits from different indices at search time with a
predetermined offset for each index. For example, IDX1 would have an
offset of 0, IDX2 an offset of 101, and so on.
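
A minimal sketch of that offset scheme in plain Java. OffsetMapper and its methods are invented names, and the index sizes are assumed to be known up front. Note that the resulting global ids are only as stable as the sizes and local ids they are computed from.

```java
// Sketch of per-index offsets: global id = start of the index's range + local id.
class OffsetMapper {
    private final int[] starts; // starts[i] = first global id belonging to index i

    // indexSizes[i] is the number of documents in index i.
    OffsetMapper(int[] indexSizes) {
        starts = new int[indexSizes.length];
        int next = 0;
        for (int i = 0; i < indexSizes.length; i++) {
            starts[i] = next;
            next += indexSizes[i];
        }
    }

    // Maps an id local to one index into the shared global range.
    int toGlobal(int indexNumber, int localId) {
        return starts[indexNumber] + localId;
    }

    // Finds which index a global id belongs to.
    int indexOf(int globalId) {
        int i = starts.length - 1;
        while (i > 0 && starts[i] > globalId) {
            i--;
        }
        return i;
    }

    public static void main(String[] args) {
        // IDX1 holds ids 0..100 (101 documents), so IDX2 starts at 101.
        OffsetMapper mapper = new OffsetMapper(new int[] {101, 100, 100});
        System.out.println(mapper.toGlobal(1, 0)); // prints 101
        System.out.println(mapper.indexOf(250));   // prints 2
    }
}
```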

--Rajesh Munavalli

Lucene id generation

Posted by Waheed Mohammed <wa...@fiz-technik.de>.

Hello,

Is there a way to influence Lucene's generation of ids while indexing?

My requirement is this: I want to have several indexes, where no index
has ids that were already assigned to an earlier index.
For instance:
IDX1 : {0.........100}
IDX2 : {101.......200}
IDX3 : {201.......300}
but not:
IDX1 : {0.........100}
IDX2 : {0.........100}
IDX3 : {0.........100}

Any help is greatly appreciated.
-- 
A W Mohammed
Software Entwickler
FIZ-technik e.V
Frankfut am Main



RE: Using Lucene to search log files

Posted by Mike Streeton <mi...@ardentiasearch.com>.

I would use a RangeFilter instead of the default range query, which
expands into a Boolean query and will always break at some point with
"Too many Boolean clauses". You can extend QueryParser to sort this out.
As for extracting information from log files, I would look at writing
your own LogAnalyzer that can interpret the contents and index them
appropriately.
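
As a starting point for the "interpret the contents" part, here is a rough plain-Java sketch that splits one log line into fields before anything is handed to Lucene. The line format ("2006-12-11 13:04:13 ERROR Connection refused"), the class name LogLineParser, and the choice of a minute-resolution timestamp term are all assumptions for illustration; adjust the pattern to whatever your logs actually look like.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Assumes lines like "2006-12-11 13:04:13 ERROR Connection refused".
// This is a hypothetical pre-processing step, not a Lucene Analyzer.
class LogLineParser {
    private static final Pattern LINE = Pattern.compile(
        "^(\\d{4})-(\\d{2})-(\\d{2}) (\\d{2}):(\\d{2}):(\\d{2}) (\\w+) (.*)$");

    // Returns {sortable minute-resolution timestamp, level, message},
    // or null if the line does not match the assumed format.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // Concatenate year, month, day, hour, minute into one sortable term.
        String minuteTerm = m.group(1) + m.group(2) + m.group(3) + m.group(4) + m.group(5);
        return new String[] {minuteTerm, m.group(7), m.group(8)};
    }

    public static void main(String[] args) {
        String[] fields = parse("2006-12-11 13:04:13 ERROR Connection refused");
        System.out.println(fields[0]); // prints 200612111304
        System.out.println(fields[1]); // prints ERROR
        System.out.println(fields[2]); // prints Connection refused
    }
}
```

Each parsed line could then become one document, with the timestamp term in its own field for range filtering and the message in an analyzed field for keyword search.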

Hope this helps

Mike

www.ardentiasearch.com the home of NetSearch