Posted to java-user@lucene.apache.org by Luke Shannon <ls...@hypermedia.com> on 2004/12/08 00:26:18 UTC

Weird Behavior On Windows

Hello All;

Things have been running smoothly on Linux for some time. We set up a version
of the site on a Win2K machine, and this is when all the "fun" started.

A pdf would be added to the system. The indexer would run, find the new
file, index it and successfully complete the update of the index folder. No
IO error, no errors of any kind. Just like on the Linux box.

Now we would try to search for a term in the document, and 0 results would be
returned. To make matters worse, if I run a search on a term that shows up in
a bunch of documents, on Windows it only finds 2 results, whereas on Linux it
would find 50 (same content).

Using "Luke" I was able to verify that the pdf in question is in the index.
Why can't the searcher find it?

Any ideas would be welcome.

Luke



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Empty/non-empty field indexing question

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Correct.
No, there is no point in putting an empty field there.

Otis

--- "amigo@max3d.com" <am...@max3d.com> wrote:

> Hi Otis
> 
> What kind of implications does that produce on the search?
> 
> If I understand correctly that record would not be searched for if the
> field is not there, correct?
> But then is there a point putting an empty value in it, if an
> application will never search for empty values?
> 
> 
> thanks
> 
> -pedja


Re: Empty/non-empty field indexing question

Posted by "amigo@max3d.com" <am...@max3d.com>.
Hi Otis

What kind of implications does that have for search?

If I understand correctly, that record would not be matched by a search on
that field if the field is not there, correct?
But then is there any point in putting an empty value in it, if the
application will never search for empty values?


thanks

-pedja


Otis Gospodnetic said the following on 12/8/2004 1:31 AM:

>Empty fields won't add any value, you can skip them.  Documents in an
>index don't have to be uniform.  Each Document could have a different
>set of fields.  Of course, that has some obvious implications for
>search, but is perfectly fine technically.
>
>Otis

Re: Empty/non-empty field indexing question

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Empty fields won't add any value; you can skip them.  Documents in an
index don't have to be uniform.  Each Document could have a different
set of fields.  Of course, that has some obvious implications for
search, but it is perfectly fine technically.

Otis
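
A rough illustration of the advice above, as a toy model in plain Java
rather than Lucene's own API. Every name in it (SparseFieldDemo,
toDocument, search) is made up for this sketch, and the substring match
merely stands in for a real analyzed term query. It assumes each database
row arrives as a map of field name to value, skips blank values when
building the document, and shows the search implication: a query against
a field can only match documents that actually carry that field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy stand-in for an index. This is NOT Lucene's API; all names are hypothetical.
final class SparseFieldDemo {

    // Copy a database row into a "document", omitting null or blank values.
    static Map<String, String> toDocument(Map<String, String> row) {
        Map<String, String> doc = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : row.entrySet()) {
            String value = e.getValue();
            if (value != null && !value.trim().isEmpty()) {
                doc.put(e.getKey(), value);
            }
        }
        return doc;
    }

    // Return the positions of documents whose given field contains the term
    // (case-insensitive substring match). Documents lacking the field never match.
    static List<Integer> search(List<Map<String, String>> docs, String field, String term) {
        List<Integer> hits = new ArrayList<>();
        String needle = term.toLowerCase(Locale.ROOT);
        for (int i = 0; i < docs.size(); i++) {
            String value = docs.get(i).get(field);
            if (value != null && value.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> row1 = new LinkedHashMap<>();
        row1.put("title", "Lucene in Action");
        row1.put("body", "indexing and searching");

        Map<String, String> row2 = new LinkedHashMap<>();
        row2.put("title", "");            // empty in the database, so it is skipped
        row2.put("body", "lucene tips");

        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(toDocument(row1));
        docs.add(toDocument(row2));

        System.out.println(docs.get(1).keySet());            // [body]
        System.out.println(search(docs, "title", "lucene")); // [0]
        System.out.println(search(docs, "body", "lucene"));  // [1]
    }
}
```

The second document has no title field at all, so a title search cannot
return it, while a body search still does.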



Empty/non-empty field indexing question

Posted by "amigo@max3d.com" <am...@max3d.com>.
Here's probably a silly question, very newbish, but I had to ask.
Since I have MySQL documents that contain over 30 fields each, and most of them
are added to the index, is it common practice to add fields to the index with
empty values for that particular record, or should the field be totally omitted?

What I mean is: if, let's say, a Title field is empty on a specific record (in MySQL),
should I still add that field to the Lucene index with an empty value, or just
skip it and only add the fields that contain non-empty values?

thanks

-pedja




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Weird Behavior On Windows

Posted by Luke Shannon <ls...@hypermedia.com>.
Hey Otis;

You're right again. It turned out there was an exception around the usage of the
Digester class that wasn't being written to the log. This exception was
being thrown as a result of a configuration issue with the server.

Everything is back to normal.

Thanks!

Luke
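
For readers chasing a similar problem, here is a minimal sketch of the
general lesson in the message above: a caught exception that never
reaches the log can make a run look clean. It does not reproduce the
actual application code or anything about Digester; LoggedSteps, Step
and runLogged are hypothetical names, and java.util.logging is used only
because it ships with the JDK.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: run a step and make sure any failure is logged, never swallowed.
final class LoggedSteps {
    private static final Logger LOG = Logger.getLogger(LoggedSteps.class.getName());

    // A unit of work that is allowed to throw a checked exception.
    interface Step {
        void run() throws Exception;
    }

    // Returns true if the step completed. On failure, logs the exception
    // (including its stack trace) at SEVERE level and returns false.
    static boolean runLogged(String name, Step step) {
        try {
            step.run();
            return true;
        } catch (Exception e) {
            LOG.log(Level.SEVERE, "Step failed: " + name, e);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean parsed = runLogged("parse configuration", () -> {
            // Stand-in for a failure that only shows up on one server.
            throw new IllegalStateException("bad server configuration");
        });
        System.out.println("parsed = " + parsed); // parsed = false
    }
}
```

Returning a flag as well as logging lets the caller stop, or at least
report, instead of carrying on with partial results.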
----- Original Message ----- 
From: "Otis Gospodnetic" <ot...@yahoo.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Tuesday, December 07, 2004 6:27 PM
Subject: Re: Weird Behavior On Windows


> The index has been modified, so you need a new IndexSearcher.  Could
> there be logic in the flaw (swap that), or could you be catching an
> Exception that is thrown only on Winblows due to Windows not letting
> you do certain things with referenced files and dirs?
>
> Otis


Re: Weird Behavior On Windows

Posted by Luke Shannon <ls...@hypermedia.com>.
Hi Otis;

Each time a search request comes in I create a new searcher (same analyzer
as used during indexing). The idea about catching an error somewhere is
interesting, although in most of the cases where I catch an exception I
write to a log file. Anyway, this is all I have to go on, so I am looking
into exceptions now...

Luke


Re: Weird Behavior On Windows

Posted by Otis Gospodnetic <ot...@yahoo.com>.
The index has been modified, so you need a new IndexSearcher.  Could
there be logic in the flaw (swap that), or could you be catching an
Exception that is thrown only on Winblows due to Windows not letting
you do certain things with referenced files and dirs?

Otis
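
The first point above, that a searcher opened before an index update
does not see the update, can be sketched as follows. This is a toy model
in plain Java, not Lucene code: RefreshingHolder is a hypothetical name,
the "version" is assumed to be any number that changes whenever the
index is modified, and the opener is assumed to build a fresh searcher
over the current index.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical holder that hands out a cached searcher and reopens it
// whenever the index version has changed since the last open.
final class RefreshingHolder<S> {
    private final LongSupplier currentVersion; // assumed: changes on every index update
    private final Supplier<S> opener;          // assumed: opens a searcher on the current index
    private S cached;
    private long cachedVersion;

    RefreshingHolder(LongSupplier currentVersion, Supplier<S> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    // Returns a searcher that reflects the current index version.
    // Note: this sketch does not close the old searcher; real code would
    // have to do that once in-flight searches against it have finished.
    synchronized S get() {
        long version = currentVersion.getAsLong();
        if (cached == null || version != cachedVersion) {
            cached = opener.get();
            cachedVersion = version;
        }
        return cached;
    }

    public static void main(String[] args) {
        AtomicLong version = new AtomicLong(1);
        RefreshingHolder<String> holder =
                new RefreshingHolder<>(version::get, () -> "searcher@v" + version.get());

        System.out.println(holder.get()); // searcher@v1
        version.incrementAndGet();        // simulate the indexer adding a document
        System.out.println(holder.get()); // searcher@v2
    }
}
```

Creating a new searcher on every request, as described later in the
thread, also sees the latest index; the holder only avoids reopening
when nothing has changed.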
