Posted to general@lucene.apache.org by John Powers <jp...@configureone.com> on 2007/03/22 20:41:55 UTC

retrieving wrong categories

Hello,

 

I have an installation of Lucene that consistently retrieves the wrong
documents. The code hasn't changed and works fine in other
installations. I have been using Lucene successfully for a couple of years
now, and I haven't seen this problem since I was originally implementing
it.

 

Each item has a field "cat" (short for categories) listing the categories
it belongs to. It's a hierarchy, and the field looks like:

 

Bases[2]{0}

 

Or 

 

Computer Cradles Docking Stations[7]{3}|Motorola[15]{1}|ML850[43]{1}|

 

 

That is: cat name [catID] {cat sequence number} | another category |
another category, and so on.

 

So if I search cat for "2" I should find anything that belongs in that
category.  In the second example here, the item belongs to cat 43, and
by this ancestry belongs to cat 15 and cat 7 as well.
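
A minimal sketch of how that format can be read, in plain Java with no Lucene calls. The class and method names are hypothetical and not taken from the real application; it assumes every entry has the exact shape name[catID]{sequence number}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the "cat" value; illustration only.
class CatPath {
    // One entry looks like: name[catID]{sequence number}
    private static final Pattern ENTRY =
            Pattern.compile("([^|\\[]+)\\[(\\d+)\\]\\{(\\d+)\\}");

    // Returns the category IDs in order, from the top ancestor
    // down to the item's own category.
    static List<Integer> ids(String cat) {
        List<Integer> ids = new ArrayList<>();
        Matcher m = ENTRY.matcher(cat);
        while (m.find()) {
            ids.add(Integer.parseInt(m.group(2))); // group 2 is the catID
        }
        return ids;
    }

    // An item belongs to a category if that ID appears anywhere in its path.
    static boolean belongsTo(String cat, int catId) {
        return ids(cat).contains(catId);
    }

    public static void main(String[] args) {
        String cat =
            "Computer Cradles Docking Stations[7]{3}|Motorola[15]{1}|ML850[43]{1}|";
        System.out.println(ids(cat));           // [7, 15, 43]
        System.out.println(belongsTo(cat, 2));  // false
    }
}
```

Only the numbers in square brackets count as category IDs here; the numbers in curly braces are ignored.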

 

For the most part this works. However, in this problem install, a
couple of items consistently appear outside their category. If I search
for cat 7, an item with the "Bases[2]{0}" category keeps coming up.
Also, if I look in "Bases", a couple of the "Computer Cradles Docking
Stations" items come up, among other problem items. If I print out the
category from Lucene right there on the display, it shows that the item
is still in its own category, so I don't feel that it's an issue at or
before indexing. For some reason a search of

 

+cat:("2") +(cartable:1) 

 

 

is getting me mostly cat 2 items, but also a few others.

 

What can I do to start tracking down why this is?

Any thoughts?

 

Thanks for your help

 

-jN


Re: retrieving wrong categories

Posted by Chris Hostetter <ho...@fucit.org>.
FYI: general@lucene is a very high level, low subscriber list for
discussing broad topics relating to the entire Lucene Top Level Project
(Lucene-Java, Nutch, Hadoop, Solr, Lucy, Lucene.Net, etc...). Your
question is probably best asked on the java-user list (unless it relates
to a port to another language, in which case you should use the
appropriate user list).

that said...

: I have an installation of lucene that is retrieving the wrong documents,
: consistently.    The code hasn't changed and works fine in other
: installations. I have been using lucene successfully for a couple years
: now and I haven't seen this problem since I was originally implementing
: lucene.

: I have a field "cat" (short for categories) that this item belongs to.
: It's a hierarchy and this field looks like:

: Computer Cradles Docking Stations[7]{3}|Motorola[15]{1}|ML850[43]{1}|

: +cat:("2") +(cartable:1)

: Is getting me mostly cat2 items but a few others.

: What can I do to start tracking down why this is?

I would start by using Luke to inspect what is actually indexed for each
of these docs. If I had to guess, I would suppose that cat:("2") is
matching not only on categoryId #2, but also on categories where "2" is
the sequence number.
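
To illustrate that guess, here is a rough sketch in plain Java (no Lucene calls). It assumes the analyzer breaks text on every non-alphanumeric character, which may or may not be what this index actually uses; the class name and the sample value with sequence number 2 are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an analyzer; it is NOT a Lucene class.
class NaiveCatTokens {
    // Assumption: text is lowercased and split on anything that is not
    // a letter or digit, so "[15]" and "{2}" both become bare numbers.
    static List<String> tokens(String cat) {
        List<String> out = new ArrayList<>();
        for (String t : cat.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up value: category 15 with sequence number 2.
        String cat = "Motorola[15]{2}|";
        // After splitting, the "2" from the sequence number cannot be
        // told apart from a "2" that came from a category ID.
        System.out.println(tokens(cat));               // [motorola, 15, 2]
        System.out.println(tokens(cat).contains("2")); // true
    }
}
```

If the indexed terms in Luke look like this, a term query for "2" on the cat field would match this document even though it is not in category 2.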




-Hoss