Posted to user@lucy.apache.org by Shahab <sh...@gmail.com> on 2014/08/27 18:29:37 UTC

[lucy-user] Re: How to make Lucy return multiple matches on same index (file)

Peter Karman <pe...@...> writes:

> 
> Shahab wrote on 8/27/14 7:12 AM:
> > Hi,
> > 
> > I am trying Lucy for the first time.
> > 
> > I would like to know if we can have Lucy return all matches of a search
> > query in a single file.
> > 
> > e.g., indexing a log file funny.txt with the content:
> > 1: This is a lion and this is a tiger
> > 2: This is a lion only
> > 3: This is a tiger and a lion both
> > 
> > search : lion
> > 
> > Result :
> > funny.txt "...This is a lion and..."
> > funny.txt "...This is a lion ..."
> > funny.txt "...and a lion ..."
> > (something similar)
> > 
> 
> Shahab,
> 
> Lucy works on collections of documents, where each hit in a search result
> set represents a single document.
> 
> That said, people use tools like Lucy to search logs, e.g., by creating a
> single "document" for each line in the log file.
> 
> If you just want to find all the matches for a term in a single file,
> though, you're better off with something like 'grep'.
> 
> HTH,
> pek
> 

Dear Pek,
Thanks for your reply. I need to index big files (> 2 GB), and grep is
not an option for this use case. Lucy looked like a good option to me;
the alternative would be creating my own index file, indexing on line
number and some fixed keywords.

Regards
Shahab



Re: [lucy-user] Re: How to make Lucy return multiple matches on same index (file)

Posted by Peter Karman <pe...@peknet.com>.
Shahab wrote on 8/27/14 11:29 AM:

> Thanks for your reply. I need to index big files (> 2 GB), and grep is
> not an option for this use case. Lucy looked like a good option to me;
> the alternative would be creating my own index file, indexing on line
> number and some fixed keywords.
> 

Yes, you could use Lucy. Create a virtual "document" for each line in the file,
and fields for the line number and line content. If the line contains "fields"
like a logfile might, you can create a separate field for each segment of the
line. I do that same kind of thing for server log files.
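
To make the document-per-line idea concrete, here is a minimal sketch in
plain Python. It is not Lucy's API: it is a toy in-memory inverted index,
and the names build_line_index, search, and tokenize are made up for
illustration. Each line becomes its own "document" with file, line_number,
and content fields, so a search returns one hit per matching line.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_line_index(lines, source_name):
    """Treat every line as its own document (hypothetical helper, not Lucy)."""
    docs = []                    # doc id -> stored fields
    postings = defaultdict(set)  # term -> ids of the docs that contain it
    for line_number, raw in enumerate(lines, start=1):
        content = raw.rstrip("\n")
        doc_id = len(docs)
        docs.append(
            {"file": source_name, "line_number": line_number, "content": content}
        )
        for term in tokenize(content):
            postings[term].add(doc_id)
    return docs, postings


def search(docs, postings, term):
    """Return one hit per matching line, in line order."""
    return [docs[i] for i in sorted(postings.get(term.lower(), ()))]


# The three sample lines from the original question.
sample = [
    "This is a lion and this is a tiger",
    "This is a lion only",
    "This is a tiger and a lion both",
]
docs, postings = build_line_index(sample, "funny.txt")
for hit in search(docs, postings, "lion"):
    print(hit["file"], hit["line_number"], hit["content"])
```

With a real search library the same shape applies: one document is added
per line, and the index lives on disk rather than in memory, which is what
matters for files in the multi-gigabyte range.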


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com