Posted to java-user@lucene.apache.org by Karthik N S <ka...@controlnet.co.in> on 2005/03/09 10:22:21 UTC
SPAN QUERY [HOW TO]
Hi Guys,
Apologies..........
The new 'span query' feature of Lucene is really interesting,
but I need expert suggestions on achieving the following.
I have 3 documents
Document 1 contains = ELECTRONICS DIGITAL CAMERA
Document 2 contains = ELECTRONICS DIGITAL CAMERA OPTICS
Document 3 contains = ELECTRONICS DIGITAL CAMERA ACCESSORIES
search word = " DIGITAL CAMERA "
Returned hits = 1st doc ONLY [ the 2nd and 3rd documents should not be in
the hits ]
SpanQuery / PhraseQuery ????
How would one achieve this ??? Please
WITH WARM REGARDS
HAVE A NICE DAY
[ N.S.KARTHIK]
RE: SPAN QUERY [HOW TO]
Posted by Miles Barr <mi...@runtime-collective.com>.
What fields do you have and what are you putting in them?
On Thu, 2005-03-10 at 17:56 +0530, Karthik N S wrote:
> Hi Guys
>
> Apologies.......
>
>
> I did as you said, but the SpanNearQuery is
>
> returning all 3 documents for a rollover of the words
>
> 'DIGITAL CAMERAS', instead of returning only the 1st doc, or none when I
> change the slop factor.
>
> Any more ideas, please? .......... B(
>
> with regards
> karthik
--
Miles Barr <mi...@runtime-collective.com>
Runtime Collective Ltd.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
RE: SPAN QUERY [HOW TO]
Posted by Karthik N S <ka...@controlnet.co.in>.
Hi Guys
Apologies.......
I did as you said, but the SpanNearQuery is
returning all 3 documents for a rollover of the words
'DIGITAL CAMERAS', instead of returning only the 1st doc, or none when I
change the slop factor.
Any more ideas, please? .......... B(
with regards
karthik
-----Original Message-----
From: Miles Barr [mailto:miles@runtime-collective.com]
Sent: Thursday, March 10, 2005 2:53 PM
To: java-user@lucene.apache.org
Subject: RE: SPAN QUERY [HOW TO]
It depends on what type leaf_category is. If you made it Keyword as
I suggested then it won't be tokenized, i.e. there's one token 'DIGITAL
CAMERA' instead of the two tokens you normally get, 'digital' and
'camera'.
If you change the field type to Text you should be able to use a
SpanNearQuery to do your search.
--
Miles Barr <mi...@runtime-collective.com>
Runtime Collective Ltd.
RE: SPAN QUERY [HOW TO]
Posted by Miles Barr <mi...@runtime-collective.com>.
On Thu, 2005-03-10 at 12:02 +0530, Karthik N S wrote:
> You got it, bingo. I am trying to do something similar to what you replied.
> But there is a glitch in the process.
>
> If the search is done on 'leaf_category' as you said,
>
> with words such as 'CAMERA DIGITAL' instead of 'DIGITAL CAMERA', the
> resulting
>
> hits will be ZERO '0'. Using a SpanQuery under such conditions
> should still return
>
> the 1st document of the 3.
>
> Any permutation of the words entered should result in the specific
> document being returned.
It depends on what type leaf_category is. If you made it Keyword as
I suggested then it won't be tokenized, i.e. there's one token 'DIGITAL
CAMERA' instead of the two tokens you normally get, 'digital' and
'camera'.
If you change the field type to Text you should be able to use a
SpanNearQuery to do your search.
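The difference between the two field types can be sketched in plain Java. This is only an illustration and does not call Lucene: the class and method names are made up, the tokenizer is a stand-in for a real analyzer (lower-case, split on whitespace), and the slop rule is a simplified model of an unordered near match between two terms.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; none of these names are Lucene API.
final class KeywordVsTextSketch {

    // Untokenized ("Keyword"-style) field: the whole value is one token,
    // so only an exact match of the complete string succeeds.
    static boolean keywordMatch(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    // Hypothetical stand-in for an analyzer: lower-case and split on whitespace.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Tokenized ("Text"-style) field: two terms match in either order when
    // at most `slop` other tokens sit between them (simplified model).
    static boolean nearUnordered(String fieldValue, String a, String b, int slop) {
        List<String> tokens = tokenize(fieldValue);
        String termA = a.toLowerCase(Locale.ROOT);
        String termB = b.toLowerCase(Locale.ROOT);
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(termA)) continue;
            for (int j = 0; j < tokens.size(); j++) {
                if (j != i && tokens.get(j).equals(termB) && Math.abs(i - j) - 1 <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // One untokenized token 'DIGITAL CAMERA' never equals 'CAMERA DIGITAL'.
        System.out.println(keywordMatch("DIGITAL CAMERA", "CAMERA DIGITAL"));
        // Two tokens 'digital' and 'camera' can be matched in either order.
        System.out.println(nearUnordered("DIGITAL CAMERA", "CAMERA", "DIGITAL", 0));
    }
}
```

On the three sample documents from the original question this simplified unordered match accepts all of them, since each one contains both words side by side.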
--
Miles Barr <mi...@runtime-collective.com>
Runtime Collective Ltd.
RE: SPAN QUERY [HOW TO]
Posted by Karthik N S <ka...@controlnet.co.in>.
Hi Guys,
Apologies........
You got it, bingo. I am trying to do something similar to what you replied.
But there is a glitch in the process.
If the search is done on 'leaf_category' as you said,
with words such as 'CAMERA DIGITAL' instead of 'DIGITAL CAMERA', the
resulting hits will be ZERO '0'. Using a SpanQuery under such conditions
should still return the 1st document of the 3.
Any permutation of the words entered should result in the specific
document being returned.
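One possible way to get this behaviour, which is not something suggested in this thread, is to store an order-insensitive key next to the category and normalize the query the same way before comparing. The sketch below is plain Java, does not call Lucene, and uses made-up names; it assumes whitespace-separated words and case-insensitive matching.

```java
import java.util.Arrays;
import java.util.Locale;

// Plain-Java illustration only; the helper names are hypothetical.
final class OrderInsensitiveKeySketch {

    // Lower-case the value, split on whitespace, sort the words, re-join them.
    // 'DIGITAL CAMERA' and 'CAMERA DIGITAL' both become 'camera digital'.
    static String normalizedKey(String value) {
        String[] words = value.trim().toLowerCase(Locale.ROOT).split("\\s+");
        Arrays.sort(words);
        return String.join(" ", words);
    }

    // True only when both sides consist of exactly the same words, in any order.
    static boolean sameWords(String leafCategory, String query) {
        return normalizedKey(leafCategory).equals(normalizedKey(query));
    }

    public static void main(String[] args) {
        System.out.println(sameWords("DIGITAL CAMERA", "CAMERA DIGITAL"));
        System.out.println(sameWords("DIGITAL CAMERA OPTICS", "CAMERA DIGITAL"));
    }
}
```

If such a key were indexed as a single untokenized value, any permutation of the entered words would map to the same key, while a value with extra words would not.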
with regards
Karthik
-----Original Message-----
From: Miles Barr [mailto:miles@runtime-collective.com]
Sent: Wednesday, March 09, 2005 7:10 PM
To: java-user@lucene.apache.org
Subject: RE: SPAN QUERY [HOW TO]
It's not clear what you're trying to achieve. PhraseQuery and
SpanNearQuery can help you find tokens that are close to each other. If
you're using the standard analyzer, tokens are words. They won't help
you group documents under a topic.
You should set up some other fields in your Lucene document to hold
category information, e.g. for document 1:
text = ELECTRONICS DIGITAL CAMERA
parent_category = ELECTRONICS
leaf_category = DIGITAL CAMERA
for document 2:
text = ELECTRONICS DIGITAL CAMERA OPTICS
parent_category = ELECTRONICS
parent_category = DIGITAL CAMERA
leaf_category = OPTICS
Then search on the leaf_category. Make sure you set up the category
fields to be type KEYWORD, i.e. not tokenized.
On Wed, 2005-03-09 at 18:07 +0530, Karthik N S wrote:
> Hi Guys
>
> Apologies....
>
> Somebody, please help me with this.
>
>
> with regards
> Karthik
--
Miles Barr <mi...@runtime-collective.com>
Runtime Collective Ltd.
RE: SPAN QUERY [HOW TO]
Posted by Miles Barr <mi...@runtime-collective.com>.
It's not clear what you're trying to achieve. PhraseQuery and
SpanNearQuery can help you find tokens that are close to each other. If
you're using the standard analyzer, tokens are words. They won't help
you group documents under a topic.
You should set up some other fields in your Lucene document to hold
category information, e.g. for document 1:
text = ELECTRONICS DIGITAL CAMERA
parent_category = ELECTRONICS
leaf_category = DIGITAL CAMERA
for document 2:
text = ELECTRONICS DIGITAL CAMERA OPTICS
parent_category = ELECTRONICS
parent_category = DIGITAL CAMERA
leaf_category = OPTICS
Then search on the leaf_category. Make sure you set up the category
fields to be type KEYWORD, i.e. not tokenized.
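The field layout above can be modelled in plain Java as follows. This is a sketch only and does not call Lucene: the class names are made up, an exact string comparison stands in for a search on an untokenized field, and the values for document 3 are an assumption extrapolated from documents 1 and 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; none of these names are Lucene API.
final class LeafCategorySketch {

    static final class Doc {
        final String text;
        final List<String> parentCategories;
        final String leafCategory;

        Doc(String text, List<String> parentCategories, String leafCategory) {
            this.text = text;
            this.parentCategories = parentCategories;
            this.leafCategory = leafCategory;
        }
    }

    static List<Doc> sampleDocs() {
        return Arrays.asList(
            new Doc("ELECTRONICS DIGITAL CAMERA",
                    Arrays.asList("ELECTRONICS"), "DIGITAL CAMERA"),
            new Doc("ELECTRONICS DIGITAL CAMERA OPTICS",
                    Arrays.asList("ELECTRONICS", "DIGITAL CAMERA"), "OPTICS"),
            // Assumed by analogy with document 2; not spelled out in the thread.
            new Doc("ELECTRONICS DIGITAL CAMERA ACCESSORIES",
                    Arrays.asList("ELECTRONICS", "DIGITAL CAMERA"), "ACCESSORIES"));
    }

    // Untokenized field: the whole stored value must equal the query value.
    static List<Doc> searchLeaf(List<Doc> docs, String leaf) {
        List<Doc> hits = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.leafCategory.equals(leaf)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        for (Doc hit : searchLeaf(sampleDocs(), "DIGITAL CAMERA")) {
            System.out.println(hit.text);
        }
    }
}
```

Searching the leaf category for 'DIGITAL CAMERA' selects only document 1 here, because documents 2 and 3 carry 'DIGITAL CAMERA' as a parent category rather than as their leaf.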
On Wed, 2005-03-09 at 18:07 +0530, Karthik N S wrote:
> Hi Guys
>
> Apologies....
>
> Somebody, please help me with this.
>
>
> with regards
> Karthik
--
Miles Barr <mi...@runtime-collective.com>
Runtime Collective Ltd.
RE: SPAN QUERY [HOW TO]
Posted by Karthik N S <ka...@controlnet.co.in>.
Hi Guys
Apologies....
Somebody, please help me with this.
with regards
Karthik
RE: SPAN QUERY [HOW TO]
Posted by Karthik N S <ka...@controlnet.co.in>.
Hi Guys,
Apologies..........
Yes, a similar concept is what I have in mind.
I will be using a field 'text' for this.
Can you guys please tell me whether a SpanQuery or a PhraseQuery would do the job,
and if so, how to proceed?
Thx in advance
Karthik
-----Original Message-----
From: Miles Barr [mailto:miles@runtime-collective.com]
Sent: Wednesday, March 09, 2005 3:02 PM
To: java-user@lucene.apache.org
Subject: Re: SPAN QUERY [HOW TO]
I've used span queries to boost the scores of results where words appear
close together. I'm not sure exactly what you're trying to achieve. All
three documents contain the search phrase, so both span and phrase
queries would return all the documents.
Are you trying to set up a taxonomy, i.e. only display documents in the
category Electronics > Digital Camera, and not those in sub categories?
If this is the case you should try to build the categorisation at the
same time as the indexing process and either add explicit clauses in the
search query or filter afterwards.
--
Miles Barr <mi...@runtime-collective.com>
Runtime Collective Ltd.
Re: SPAN QUERY [HOW TO]
Posted by Miles Barr <mi...@runtime-collective.com>.
On Wed, 2005-03-09 at 14:52 +0530, Karthik N S wrote:
> The new 'span query' feature of Lucene is really interesting,
>
> but I need expert suggestions on achieving the following.
>
> I have 3 documents
>
> Document 1 contains = ELECTRONICS DIGITAL CAMERA
> Document 2 contains = ELECTRONICS DIGITAL CAMERA OPTICS
> Document 3 contains = ELECTRONICS DIGITAL CAMERA ACCESSORIES
>
>
>
> search word = " DIGITAL CAMERA "
>
> Returned hits = 1st doc ONLY [ the 2nd and 3rd documents should not be in
> the hits ]
>
> SpanQuery / PhraseQuery ????
>
>
>
> How would one achieve this ??? Please
I've used span queries to boost the scores of results where words appear
close together. I'm not sure exactly what you're trying to achieve. All
three documents contain the search phrase, so both span and phrase
queries would return all the documents.
Are you trying to set up a taxonomy, i.e. only display documents in the
category Electronics > Digital Camera, and not those in sub categories?
If this is the case you should try to build the categorisation at the
same time as the indexing process and either add explicit clauses in the
search query or filter afterwards.
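To see why all three documents match, here is a minimal plain-Java sketch of phrase matching over tokens. It does not call Lucene: the class and method names are made up, and the tokenizer is a stand-in for a real analyzer (lower-case, split on whitespace).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; none of these names are Lucene API.
final class PhraseMatchSketch {

    // Hypothetical stand-in for an analyzer: lower-case and split on whitespace.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // True if the phrase tokens occur consecutively and in order
    // anywhere in the document, regardless of what surrounds them.
    static boolean containsPhrase(String document, String phrase) {
        List<String> docTokens = tokenize(document);
        List<String> phraseTokens = tokenize(phrase);
        return !phraseTokens.isEmpty()
                && Collections.indexOfSubList(docTokens, phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        String[] docs = {
            "ELECTRONICS DIGITAL CAMERA",
            "ELECTRONICS DIGITAL CAMERA OPTICS",
            "ELECTRONICS DIGITAL CAMERA ACCESSORIES"
        };
        for (String doc : docs) {
            System.out.println(doc + " -> " + containsPhrase(doc, " DIGITAL CAMERA "));
        }
    }
}
```

Each of the three sample documents contains 'digital' immediately followed by 'camera', so a match on the phrase alone cannot tell them apart; the extra trailing word is simply ignored.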
--
Miles Barr <mi...@runtime-collective.com>
Runtime Collective Ltd.